Abstract
This paper studies the problem of querying Bounded Spatial Datasets (BSDs). Within its coverage area, a BSD contains (i) objects with known locations and (ii) unknown regions, each of which bounds an unknown number of objects. We consider applications in which each BSD is hosted on a server or site connected to a communication network and the coverage areas of the BSDs overlap. The challenge is to query the distributed BSDs so as to retrieve all objects satisfying the query and to minimize the unknown regions that may contain such objects, while also minimizing the data transmission volume and the number of interactions between the query client and the sites. We develop two approaches, the site-based approach and the area-based approach, for efficiently processing two important types of spatial queries, namely range and k-nearest-neighbor (kNN) queries, on distributed BSDs. Both aim to process only a subset of the sites to obtain the full answer to a query, so optimal site selection and the corresponding site querying methods are important problems studied in this paper. In the area-based approach, we prove an optimal division and derive a practical heuristic to partition a query and select the best processing site for each partition, thereby achieving even better efficiency than the site-based approach. Simulation results on three real spatial datasets show that the proposed approaches significantly outperform a centralized baseline in terms of data transmission volume and the number of interactions between the query client and the distributed sites.
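To make the BSD definition concrete, the following is a minimal Python sketch of a range query against a single BSD. It is an illustration only, not the paper's algorithm: the names `Rect`, `BSD`, and `range_query` are hypothetical, objects are assumed to be 2D points, and unknown regions and queries are assumed to be axis-aligned rectangles. The query returns the known objects inside the range together with the unknown regions that intersect it, since those regions may still hide qualifying objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle (an assumed shape for regions and queries)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, x: float, y: float) -> bool:
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def intersects(self, other: "Rect") -> bool:
        # Two rectangles are disjoint if one lies entirely to one side of the other.
        return not (
            self.xmax < other.xmin or other.xmax < self.xmin
            or self.ymax < other.ymin or other.ymax < self.ymin
        )


@dataclass
class BSD:
    """Hypothetical container: coverage area, known objects, unknown regions."""
    coverage: Rect
    objects: list          # known object locations as (x, y) tuples
    unknown_regions: list  # Rect instances bounding an unknown number of objects


def range_query(bsd: BSD, query: Rect):
    """Return (known objects in range, unknown regions that may hold more)."""
    found = [p for p in bsd.objects if query.contains(*p)]
    uncertain = [r for r in bsd.unknown_regions if r.intersects(query)]
    return found, uncertain
```

In a distributed setting with overlapping coverage areas, a client would want to shrink the `uncertain` list by consulting other sites that know the same area; how to choose those sites is what the site-based and area-based approaches address, and it is not modeled in this sketch.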
Original language | English (US) |
---|---|
Article number | 169 |
Pages (from-to) | 2534-2547 |
Number of pages | 14 |
Journal | IEEE Transactions on Knowledge and Data Engineering |
Volume | 26 |
Issue number | 10 |
State | Published - Oct 1 2014 |
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Computational Theory and Mathematics