Design and implementation of a parallel geographically weighted k-nearest neighbor classifier

Yingxia Pu, Xinyi Zhao, Guangqing Chi, Shuhe Zhao, Jiechen Wang, Zhibin Jin, Junjun Yin

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

The development of high-performance classifiers represents an important step in improving the timeliness of remote sensing classification in the era of high spatial resolution. The geographically weighted k-nearest neighbors (gwk-NN) classifier, which incorporates spatial information into the traditional k-NN classifier, has demonstrated better performance in mitigating salt-and-pepper noise and misclassification. However, the integration of spatial dependence into spectral information is computationally intensive. To improve the computing performance of the gwk-NN classifier, this study first considered two commonly used parallel strategies, data parallelism and task parallelism, in the model training and image classification stages. The corresponding parallel algorithms were then implemented in C++ using the Message Passing Interface (MPI) and the Geospatial Data Abstraction Library (GDAL) on a standalone eight-core computer. Based on the performance of these two strategies, the potential of dual parallelism (the simultaneous exploitation of data and task parallelism) in image classification was further investigated. Our experimental results indicate that the parallel gwk-NN classifier can improve the classification efficiency of high-resolution remote sensing images with multiple land cover types. Specifically, the data parallelism method is more effective than the task parallelism method in both the model training and classification stages because parallel overhead has only a minor effect on its total execution time. In addition, dual parallelism can take advantage of both data and task parallel strategies, as evidenced by the two largest speedups being attained under dual parallelism I (5.28×), which is based on the premise of task parallelism, and dual parallelism II (5.73×), in which priority is given to data decomposition. Comparatively, dual parallelism II provides the best performance by overlapping computation and data transmission, which is compatible with the current trend toward multicore architectures.
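
To illustrate the data decomposition idea mentioned in the abstract, the following is a minimal sketch of one common way to split an image into contiguous row blocks across processes. It is not taken from the paper: the names `RowRange`, `block_rows`, and `partition_rows` are hypothetical, and row-wise blocking is an assumed decomposition scheme. In an MPI program, the rank and process count would typically come from `MPI_Comm_rank` and `MPI_Comm_size`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range of image rows [begin, end) assigned to one process.
struct RowRange {
    std::size_t begin;
    std::size_t end;
};

// Split `total_rows` image rows into `num_procs` contiguous blocks whose
// sizes differ by at most one row.
// Hypothetical helper for illustration; not the authors' implementation.
RowRange block_rows(std::size_t total_rows, int num_procs, int rank) {
    assert(num_procs > 0 && rank >= 0 && rank < num_procs);
    const std::size_t p = static_cast<std::size_t>(num_procs);
    const std::size_t r = static_cast<std::size_t>(rank);
    const std::size_t base = total_rows / p;   // rows every process receives
    const std::size_t extra = total_rows % p;  // first `extra` ranks get one more
    const std::size_t begin = r * base + (r < extra ? r : extra);
    const std::size_t size = base + (r < extra ? 1 : 0);
    return {begin, begin + size};
}

// Compute the row block of every rank, e.g. to derive per-process read windows.
std::vector<RowRange> partition_rows(std::size_t total_rows, int num_procs) {
    std::vector<RowRange> ranges;
    ranges.reserve(static_cast<std::size_t>(num_procs));
    for (int rank = 0; rank < num_procs; ++rank) {
        ranges.push_back(block_rows(total_rows, num_procs, rank));
    }
    return ranges;
}
```

Under this scheme each process would classify only the pixels in its own row block, so the blocks can be processed independently and the per-process workload stays balanced to within one row.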

Original language: English (US)
Pages (from-to): 111-122
Number of pages: 12
Journal: Computers and Geosciences
Volume: 127
DOIs
State: Published - Jun 2019

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computers in Earth Sciences
