TY - JOUR
T1 - Selectivity estimation for spatial joins
AU - An, Ning
AU - Yang, Zhen Yu
AU - Sivasubramaniam, Anand
PY - 2001
Y1 - 2001
N2 - Spatial Joins are important and time consuming operations in spatial database management systems. It is crucial to be able to accurately estimate the performance of these operations so that one can derive efficient query execution plans, and even develop/refine data structures to improve their performance. While estimation techniques for analyzing the performance of other operations, such as range queries, on spatial data has come under scrutiny, the problem of estimating selectivity for spatial joins has been little explored. The limited forays into this area have used parametric techniques, which are largely restrictive on the data sets that they can be used for since they tend to make simplifying assumptions about the nature of the datasets to be joined. Sampling and histogram based techniques, on the other hand, are much less restrictive. However, there has been no prior attempt at understanding the accuracy of sampling techniques, or developing histogram based techniques to estimate the selectivity of spatial joins. Apart from extensively evaluating the accuracy of sampling techniques for the very first time, this paper presents two novel histogram based solutions for spatial join estimation. Using a wide spectrum of both real and synthetic datasets, it is shown that one of our proposed schemes, called Geometric Histograms (GH), can accurately quantify the selectivity of spatial joins.
AB - Spatial Joins are important and time consuming operations in spatial database management systems. It is crucial to be able to accurately estimate the performance of these operations so that one can derive efficient query execution plans, and even develop/refine data structures to improve their performance. While estimation techniques for analyzing the performance of other operations, such as range queries, on spatial data has come under scrutiny, the problem of estimating selectivity for spatial joins has been little explored. The limited forays into this area have used parametric techniques, which are largely restrictive on the data sets that they can be used for since they tend to make simplifying assumptions about the nature of the datasets to be joined. Sampling and histogram based techniques, on the other hand, are much less restrictive. However, there has been no prior attempt at understanding the accuracy of sampling techniques, or developing histogram based techniques to estimate the selectivity of spatial joins. Apart from extensively evaluating the accuracy of sampling techniques for the very first time, this paper presents two novel histogram based solutions for spatial join estimation. Using a wide spectrum of both real and synthetic datasets, it is shown that one of our proposed schemes, called Geometric Histograms (GH), can accurately quantify the selectivity of spatial joins.
UR - http://www.scopus.com/inward/record.url?scp=0034996150&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034996150&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2001.914849
DO - 10.1109/ICDE.2001.914849
M3 - Article
AN - SCOPUS:0034996150
SN - 1084-4627
SP - 368
EP - 375
JO - Proceedings - International Conference on Data Engineering
JF - Proceedings - International Conference on Data Engineering
ER -