TY - GEN
T1 - Semantic annotation of mobility data using social media
AU - Wu, Fei
AU - Li, Zhenhui
AU - Lee, Wang Chien
AU - Wang, Hongjian
AU - Huang, Zhuojie
PY - 2015/5/18
Y1 - 2015/5/18
N2 - Recent developments in sensors, GPS and smartphones have provided us with a large amount of mobility data. At the same time, large-scale crowd-generated social media data, such as geo-tagged tweets, provide rich semantic information about locations and events. Combining the mobility data and surrounding social media data enables us to semantically understand why a person travels to a location at a particular time (e.g., attending a local event or visiting a point of interest). Previous research on mobility data mining has mainly focused on mining patterns using only the mobility data. In this paper, we study the problem of using social media to annotate mobility data. As social media data is often noisy, the key research problem lies in using the right model to retrieve only the relevant words with respect to a mobility record. We propose a frequency-based method, a Gaussian mixture model, and kernel density estimation (KDE) to tackle this problem. We show that KDE is the most suitable model as it captures the locality of word distribution very well. We test our proposal using a real dataset collected from Twitter and demonstrate the effectiveness of our techniques via both interesting case studies and a comprehensive evaluation.
AB - Recent developments in sensors, GPS and smartphones have provided us with a large amount of mobility data. At the same time, large-scale crowd-generated social media data, such as geo-tagged tweets, provide rich semantic information about locations and events. Combining the mobility data and surrounding social media data enables us to semantically understand why a person travels to a location at a particular time (e.g., attending a local event or visiting a point of interest). Previous research on mobility data mining has mainly focused on mining patterns using only the mobility data. In this paper, we study the problem of using social media to annotate mobility data. As social media data is often noisy, the key research problem lies in using the right model to retrieve only the relevant words with respect to a mobility record. We propose a frequency-based method, a Gaussian mixture model, and kernel density estimation (KDE) to tackle this problem. We show that KDE is the most suitable model as it captures the locality of word distribution very well. We test our proposal using a real dataset collected from Twitter and demonstrate the effectiveness of our techniques via both interesting case studies and a comprehensive evaluation.
UR - http://www.scopus.com/inward/record.url?scp=84968914377&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84968914377&partnerID=8YFLogxK
U2 - 10.1145/2736277.2741675
DO - 10.1145/2736277.2741675
M3 - Conference contribution
AN - SCOPUS:84968914377
T3 - WWW 2015 - Proceedings of the 24th International Conference on World Wide Web
SP - 1253
EP - 1263
BT - WWW 2015 - Proceedings of the 24th International Conference on World Wide Web
PB - Association for Computing Machinery, Inc
T2 - 24th International Conference on World Wide Web, WWW 2015
Y2 - 18 May 2015 through 22 May 2015
ER -