Anomaly detection from incomplete data

Siyuan Liu, Lei Chen, Lionel M. Ni

Research output: Contribution to journal › Article › peer-review

19 Scopus citations


Anomaly detection (a.k.a. outlier or burst detection) is a well-motivated problem and a major data mining and knowledge discovery task. In this article, we study the problem of population anomaly detection, one of the key issues related to event monitoring and population management within a city. By tracing and analyzing detected population anomalies, we can support city traffic design as well as event impact analysis and prediction. Although the problem is significant and interesting, it is very hard to detect population anomalies and retrieve anomaly trajectories, especially because actual and sufficient population data are difficult to obtain. To address the lack of real population data, we take advantage of mobile phone networks, which offer enormous amounts of spatial and temporal communication data on persons. More importantly, we claim that these mobile phone data can be used to infer and approximate population data. Thus, we can study the population anomaly detection problem by taking advantage of unique features hidden in mobile phone data. In this article, we present a system to conduct Population Anomaly Detection (PAD). First, we propose an effective clustering method, correlation-based clustering, to cluster the incomplete location information from mobile phone data (i.e., from mobile call volume distribution to population density distribution). Then, we design an adaptive parameter-free detection method, R-scan, to capture the distributed dynamic anomalies. Finally, we devise an efficient algorithm, BT-miner, to retrieve anomaly trajectories. The experimental results from real-life mobile phone data confirm the effectiveness and efficiency of the proposed algorithms. The proposed methods have also been realized as a pilot system in a city in China.

Original language: English (US)
Article number: 11
Journal: ACM Transactions on Knowledge Discovery from Data
Issue number: 2
State: Published - Sep 23 2014

All Science Journal Classification (ASJC) codes

  • General Computer Science

