Swin-T-NFC CRFs: An encoder–decoder neural model for high-precision UAV positioning via point cloud super resolution and image semantic segmentation

Suhong Wang, Hongqing Wang, Shufeng She, Yanping Zhang, Qingju Qiu, Zhifeng Xiao

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

3D point clouds and remote sensing images have been the primary data types for the development of high-precision positioning systems. Equipped with a LiDAR and a camera, an Unmanned Aerial Vehicle (UAV) can explore uncharted territory and gather both 3D scans and aerial images in real time to dynamically inspect its surroundings. However, the high cost of high-resolution LiDAR hinders the development of a UAV's perception module. It is also essential to adopt accurate image semantic segmentation (SemSeg) algorithms to better understand the sensed environment. While hardware continues to advance, support from the software side is crucial. A promising strategy for cost control in building a LiDAR-based positioning system is point cloud super-resolution (SupRes), a technique that improves point cloud resolution algorithmically. This study investigates a deep learning-based framework that adopts a classic encoder–decoder structure for both point cloud SupRes and image SemSeg. Unlike prior studies that mainly use convolutional neural networks (CNNs) for feature extraction, our model, named Swin-T-NFC CRFs, consists of a Vision Transformer (ViT)-based encoder and a fully connected conditional random fields (FC-CRFs)-based decoder, connected via a pyramid pooling module and multiple skip connections. Moreover, both the encoder and the decoder are coupled with a shifted window strategy that allows cross-window connections. As such, patches from different windows of the feature map can participate in the same self-attention computation, leading to more powerful modeling ability. Experimental results demonstrate that our method can effectively boost prediction accuracy, reduce error, and consistently outperform state-of-the-art methods on simulated and real-world point cloud datasets and on the urban drone dataset version 6.
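The shifted window strategy mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`window_partition`, `shifted_window_partition`, `window_of`), the 8x8 toy feature map, and the window size of 4 are all assumptions chosen for illustration. The sketch only shows how a cyclic shift of half a window regroups patches so that neighbours separated by a window border in the regular partition end up in the same window.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into non-overlapping ws x ws windows.

    Returns an array of shape (num_windows, ws * ws, C).
    Assumes H and W are divisible by ws.
    """
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def shifted_window_partition(x, ws):
    """Cyclically shift the map by ws // 2 in both axes, then partition."""
    s = ws // 2
    shifted = np.roll(x, shift=(-s, -s), axis=(0, 1))
    return window_partition(shifted, ws)

def window_of(windows, patch_id):
    """Index of the window that contains the patch with the given id."""
    return int(np.argwhere(windows[..., 0] == patch_id)[0, 0])

# Hypothetical toy feature map: one channel holding each patch's flat index.
H = W = 8
ws = 4
fmap = np.arange(H * W).reshape(H, W, 1)

regular = window_partition(fmap, ws)
shifted = shifted_window_partition(fmap, ws)

# Patches at (3, 3) and (4, 4) are diagonal neighbours on either side
# of a window border in the regular partition.
a, b = 3 * W + 3, 4 * W + 4
print(window_of(regular, a), window_of(regular, b))
print(window_of(shifted, a), window_of(shifted, b))
```

Alternating the regular and shifted partitions across successive layers is what lets self-attention, which is computed within each window, mix information across window borders. A full implementation in the style of the Swin Transformer would also mask attention between patches that become adjacent only because of the wrap-around in the cyclic shift; that step is omitted here.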

Original language: English (US)
Pages (from-to): 52-60
Number of pages: 9
Journal: Computer Communications
Volume: 197
State: Published - Jan 1 2023

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
