TY - JOUR
T1 - Exact and approximate flexible aggregate similarity search
AU - Li, Feifei
AU - Yi, Ke
AU - Tao, Yufei
AU - Yao, Bin
AU - Li, Yang
AU - Xie, Dong
AU - Wang, Min
N1 - Publisher Copyright:
© 2015, Springer-Verlag Berlin Heidelberg.
PY - 2016/6/1
Y1 - 2016/6/1
N2 - Aggregate similarity search, also known as aggregate nearest-neighbor (Ann) query, finds many useful applications in spatial and multimedia databases. Given a group Q of M query objects, it retrieves from a database the objects most similar to Q, where the similarity is an aggregation (e.g., sum , max) of the distances between each retrieved object p and all the objects in Q. In this paper, we propose an added flexibility to the query definition, where the similarity is an aggregation over the distances between p and any subset of ϕM objects in Q for some support0 < ϕ≤ 1. We call this new definition flexible aggregate similarity search and accordingly refer to a query as a flexible aggregate nearest-neighbor (Fann) query. We present algorithms for answering Fann queries exactly and approximately. Our approximation algorithms are especially appealing, which are simple, highly efficient, and work well in both low and high dimensions. They also return near-optimal answers with guaranteed constant-factor approximations in any dimensions. Extensive experiments on large real and synthetic datasets from 2 to 74 dimensions have demonstrated their superior efficiency and high quality.
AB - Aggregate similarity search, also known as aggregate nearest-neighbor (Ann) query, finds many useful applications in spatial and multimedia databases. Given a group Q of M query objects, it retrieves from a database the objects most similar to Q, where the similarity is an aggregation (e.g., sum , max) of the distances between each retrieved object p and all the objects in Q. In this paper, we propose an added flexibility to the query definition, where the similarity is an aggregation over the distances between p and any subset of ϕM objects in Q for some support0 < ϕ≤ 1. We call this new definition flexible aggregate similarity search and accordingly refer to a query as a flexible aggregate nearest-neighbor (Fann) query. We present algorithms for answering Fann queries exactly and approximately. Our approximation algorithms are especially appealing, which are simple, highly efficient, and work well in both low and high dimensions. They also return near-optimal answers with guaranteed constant-factor approximations in any dimensions. Extensive experiments on large real and synthetic datasets from 2 to 74 dimensions have demonstrated their superior efficiency and high quality.
UR - http://www.scopus.com/inward/record.url?scp=84953320958&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84953320958&partnerID=8YFLogxK
U2 - 10.1007/s00778-015-0418-x
DO - 10.1007/s00778-015-0418-x
M3 - Article
AN - SCOPUS:84953320958
SN - 1066-8888
VL - 25
SP - 317
EP - 338
JO - VLDB Journal
JF - VLDB Journal
IS - 3
ER -