The ability to predict linkages among data objects is central to many data mining tasks, such as product recommendation and social network analysis. A substantial literature has been devoted to the link prediction problem, either as an implicitly embedded problem in specific applications or as a generic data mining task. This literature has mostly adopted a static graph representation, in which a snapshot of the network is analyzed to predict hidden or future links. However, this representation is appropriate only for investigating whether a certain link will ever occur; it does not apply to the many applications in which the prediction of repeated link occurrences is of primary interest (e.g., communication network surveillance). In this paper, we introduce the time-series link prediction problem, which takes the temporal evolution of link occurrences into account to predict link occurrence probabilities at a particular time. Using Enron e-mail data and high-energy particle physics literature coauthorship data, we demonstrate that time-series models of single-link occurrences achieve link prediction performance comparable to that of commonly used static graph link prediction algorithms. Furthermore, a combination of static graph link prediction algorithms and time-series models produces significantly better predictions than static graph link prediction methods alone, demonstrating the great potential of integrated methods that exploit both interlink structural dependencies and intralink temporal dependencies.
All Science Journal Classification (ASJC) codes
- Information Systems
- Computer Science Applications
- Management Science and Operations Research