Road Traffic Speed Prediction: A Probabilistic Model Fusing Multi-Source Data

Lu Lin, Jianxin Li, Feng Chen, Jieping Ye, Jinpeng Huai

Research output: Contribution to journal › Article › peer-review

86 Scopus citations


Road traffic speed prediction is a challenging problem in intelligent transportation systems (ITS) and has attracted increasing attention. Existing works are mainly based on raw speed sensing data obtained from infrastructure sensors or probe vehicles, but such data are limited by the high cost of sensor deployment and maintenance. When speed observations are sparse, traditional methods based only on speed sensing data are insufficient, especially when emergencies such as traffic accidents occur. To address this issue, this paper aims to improve road traffic speed prediction by fusing traditional speed sensing data with new types of 'sensing' data from cross-domain sources, such as tweet sensors from social media and trajectory sensors from map and traffic service platforms. Jointly modeling information from different datasets brings many challenges, including the location uncertainty of low-resolution data, the language ambiguity of traffic descriptions in text, and the heterogeneity of cross-domain data. In response to these challenges, we present a unified probabilistic framework, called the Topic-Enhanced Gaussian Process Aggregation Model (TEGPAM), consisting of three components: a location disaggregation model, a traffic topic model, and a traffic speed Gaussian Process model, which together integrate the new types of data with traditional data. Experiments on real-world data from two large cities validate the effectiveness and efficiency of our model.
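As a rough illustration of the Gaussian Process component named in the abstract, the following is a minimal sketch of plain GP regression on sparse speed observations. It is not TEGPAM: the abstract does not specify the kernel, the inputs, or the inference procedure, so the squared-exponential kernel, the hour-of-day input, the hyperparameter values, the function names `rbf_kernel` and `gp_predict`, and the toy observations are all assumptions made for illustration only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_test, noise=1.0, length_scale=1.0, variance=1.0):
    """Posterior mean and latent variance of a GP with a constant mean.

    The constant mean is set to the average of the training targets
    (an assumption of this sketch, not something stated in the abstract).
    """
    y_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train, length_scale, variance)
    k_st = rbf_kernel(x_test, x_train, length_scale, variance)

    # Cholesky factor of the noisy training covariance for a stable solve.
    chol = np.linalg.cholesky(k_tt + noise * np.eye(len(x_train)))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - y_mean))

    mean = y_mean + k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = variance - np.sum(v ** 2, axis=0)
    return mean, var


# Hypothetical, made-up observations: hour of day -> speed in km/h.
hours = np.array([6.0, 7.0, 8.0, 10.0, 12.0])
speeds = np.array([55.0, 40.0, 25.0, 45.0, 50.0])

# Predict at hours with no observation.
query = np.array([7.5, 9.0, 11.0])
mean, var = gp_predict(hours, speeds, query,
                       noise=1.0, length_scale=1.5, variance=100.0)

for h, m, s in zip(query, mean, np.sqrt(var)):
    print(f"hour {h:4.1f}: predicted speed {m:5.1f} km/h (+/- {s:.1f})")
```

The posterior variance grows as a query point moves away from the observations, which is one reason a probabilistic model is attractive when sensor coverage is sparse. Far from all data, the prediction falls back to the assumed constant mean.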

Original language: English (US)
Pages (from-to): 1310-1323
Number of pages: 14
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 7
State: Published - Jul 1 2018

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics

