TY - GEN
T1 - Density based Clustering over Location Based Services
AU - Rahman, Md Farhadur
AU - Liu, Weimo
AU - Suhaim, Saad Bin
AU - Thirumuruganathan, Saravanan
AU - Zhang, Nan
AU - Das, Gautam
N1 - Publisher Copyright:
© 2017 IEEE.
Copyright:
Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2017/5/16
Y1 - 2017/5/16
N2 - Location Based Services (LBS) have become extremely popular over the past decade, being used on a daily basis by millions of users. Instances of real-world LBS range from mapping services (e.g., Google Maps) to lifestyle recommendations (e.g., Yelp) to real-estate search (e.g., Redfin). In general, an LBS provides a public (often web-based) search interface over its backend database (of tuples with 2D geolocations), taking as input a 2D query point and returning k tuples in the database that are closest to the query point, where k is usually a small constant such as 20 or 50. Such a public interface is often called a k-Nearest-Neighbor, i.e., kNN, interface. In this paper, we consider a novel problem of enabling density based clustering over the backend database of an LBS using nothing but limited access to the kNN interface provided by the LBS. Specifically, a key limit enforced by most real-world LBS is a maximum number of kNN queries allowed from a user over a given time period. Since such a limit is often orders of magnitude smaller than the number of tuples in the LBS database, our goal here is to mine from the LBS a cluster assignment function f(·), such that for any tuple t in the database (which may or may not have been accessed), f(·) can produce the cluster assignment of t with high accuracy. We conduct a comprehensive set of experiments over benchmark datasets and popular real-world LBS such as Yahoo Flickr, Zillow, Redfin and Google Maps and demonstrate the effectiveness of our proposed techniques.
AB - Location Based Services (LBS) have become extremely popular over the past decade, being used on a daily basis by millions of users. Instances of real-world LBS range from mapping services (e.g., Google Maps) to lifestyle recommendations (e.g., Yelp) to real-estate search (e.g., Redfin). In general, an LBS provides a public (often web-based) search interface over its backend database (of tuples with 2D geolocations), taking as input a 2D query point and returning k tuples in the database that are closest to the query point, where k is usually a small constant such as 20 or 50. Such a public interface is often called a k-Nearest-Neighbor, i.e., kNN, interface. In this paper, we consider a novel problem of enabling density based clustering over the backend database of an LBS using nothing but limited access to the kNN interface provided by the LBS. Specifically, a key limit enforced by most real-world LBS is a maximum number of kNN queries allowed from a user over a given time period. Since such a limit is often orders of magnitude smaller than the number of tuples in the LBS database, our goal here is to mine from the LBS a cluster assignment function f(·), such that for any tuple t in the database (which may or may not have been accessed), f(·) can produce the cluster assignment of t with high accuracy. We conduct a comprehensive set of experiments over benchmark datasets and popular real-world LBS such as Yahoo Flickr, Zillow, Redfin and Google Maps and demonstrate the effectiveness of our proposed techniques.
UR - http://www.scopus.com/inward/record.url?scp=85021186949&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021186949&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2017.103
DO - 10.1109/ICDE.2017.103
M3 - Conference contribution
AN - SCOPUS:85021186949
T3 - Proceedings - International Conference on Data Engineering
SP - 461
EP - 469
BT - Proceedings - 2017 IEEE 33rd International Conference on Data Engineering, ICDE 2017
PB - IEEE Computer Society
T2 - 33rd IEEE International Conference on Data Engineering, ICDE 2017
Y2 - 19 April 2017 through 22 April 2017
ER -