TY - JOUR
T1 - Distributed trajectory similarity search
AU - Xie, Dong
AU - Li, Feifei
AU - Phillips, Jeff M.
N1 - Funding Information:
We appreciate the comments from the anonymous reviewers. The authors acknowledge support from NSF grants 1200792, 1251019, 1350888, 1443046, and 1619287. Feifei Li was also supported in part by NSFC grant 61428204 and a Huawei gift award.
Publisher Copyright:
© 2017 VLDB.
PY - 2017/8/1
Y1 - 2017/8/1
N2 - Mobile and sensing devices have already become ubiquitous. They have made tracking moving objects an easy task. As a result, mobile applications like Uber and many IoT projects have generated massive amounts of trajectory data that can no longer be processed by a single machine efficiently. Among the typical query operations over trajectories, similarity search is a common yet expensive operator in querying trajectory data. It is useful for applications in different domains such as traffic and transportation optimizations, weather forecast and modeling, and sports analytics. It is also a fundamental operator for many important mining operations such as clustering and classification of trajectories. In this paper, we propose a distributed query framework to process trajectory similarity search over a large set of trajectories. We have implemented the proposed framework in Spark, a popular distributed data processing engine, by carefully considering different design choices. Our query framework supports both the Hausdorff distance and the Fréchet distance. Extensive experiments have demonstrated the excellent scalability and query efficiency achieved by our design, compared to other methods and design alternatives.
AB - Mobile and sensing devices have already become ubiquitous. They have made tracking moving objects an easy task. As a result, mobile applications like Uber and many IoT projects have generated massive amounts of trajectory data that can no longer be processed by a single machine efficiently. Among the typical query operations over trajectories, similarity search is a common yet expensive operator in querying trajectory data. It is useful for applications in different domains such as traffic and transportation optimizations, weather forecast and modeling, and sports analytics. It is also a fundamental operator for many important mining operations such as clustering and classification of trajectories. In this paper, we propose a distributed query framework to process trajectory similarity search over a large set of trajectories. We have implemented the proposed framework in Spark, a popular distributed data processing engine, by carefully considering different design choices. Our query framework supports both the Hausdorff distance and the Fréchet distance. Extensive experiments have demonstrated the excellent scalability and query efficiency achieved by our design, compared to other methods and design alternatives.
UR - http://www.scopus.com/inward/record.url?scp=85037054600&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85037054600&partnerID=8YFLogxK
U2 - 10.14778/3137628.3137655
DO - 10.14778/3137628.3137655
M3 - Conference article
AN - SCOPUS:85037054600
SN - 2150-8097
VL - 10
SP - 1478
EP - 1489
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 11
T2 - 43rd International Conference on Very Large Data Bases, VLDB 2017
Y2 - 28 August 2017 through 1 September 2017
ER -