Efficient mining of the multidimensional traffic cluster hierarchy for digesting, visualization, and anomaly identification

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

Mining traffic to identify the dominant flows sent over a given link, over a specified time interval, is a valuable capability with applications to traffic auditing, simulation, visualization, and anomaly detection. Recently, Estan et al. advanced a comprehensive data mining structure tailored for networking data: a parsimonious, multidimensional flow hierarchy, along with an algorithm for its construction. While they primarily targeted offline auditing, use in interactive traffic visualization and anomaly/attack detection will require real-time data mining. We suggest several improvements to Estan et al.'s algorithm that substantially reduce the computational complexity of multidimensional flow mining. We also propose computationally efficient and memory-efficient approaches for unidimensional clustering of the IP address spaces. For baseline implementations, evaluated on the New Zealand (NZIX) trace data, our method reduced CPU execution times of the Estan et al. method by a factor of more than eight. We also develop a methodology for anomaly/attack detection based on flow mining, demonstrating the usefulness of this approach on traces from the Slammer and Code Red worms and the MIT Lincoln Laboratories DDoS data.
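
To make the idea of unidimensional clustering of an IP address space more concrete, the following is a minimal Python sketch. It is not the algorithm of this article, nor that of Estan et al.; it only illustrates the general notion of aggregating traffic by IPv4 prefix, keeping prefixes whose volume reaches a threshold, and then suppressing any prefix whose traffic is already explained by more specific reported prefixes. The function names (`heavy_prefixes`, `compressed_clusters`), the input format of `(source address, byte count)` pairs, and the byte threshold are all assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict


def heavy_prefixes(flows, threshold):
    """Sum bytes per IPv4 prefix (lengths 0..32) and keep the heavy ones.

    flows: iterable of (dotted-quad source address, byte count) pairs.
    Returns {(network_as_int, prefix_len): total_bytes} for prefixes
    whose total is at least `threshold`.
    """
    totals = defaultdict(int)
    for ip, nbytes in flows:
        addr = int(ipaddress.IPv4Address(ip))
        for plen in range(33):
            # Mask that keeps the top `plen` bits of the 32-bit address.
            mask = (0xFFFFFFFF << (32 - plen)) & 0xFFFFFFFF
            totals[(addr & mask, plen)] += nbytes
    return {key: total for key, total in totals.items() if total >= threshold}


def compressed_clusters(flows, threshold):
    """Report heavy prefixes not already explained by their descendants.

    Prefixes are visited from most to least specific. A prefix is reported
    only if its traffic, minus the traffic covered by reported clusters in
    its two child subtrees, still reaches `threshold`.
    """
    heavy = heavy_prefixes(flows, threshold)
    covered = {}  # traffic in each heavy subtree accounted for by reports
    report = {}
    for (net, plen), total in sorted(heavy.items(), key=lambda kv: -kv[0][1]):
        child_cov = 0
        if plen < 32:
            bit = 1 << (31 - plen)  # bit that distinguishes the two children
            child_cov = (covered.get((net, plen + 1), 0)
                         + covered.get((net | bit, plen + 1), 0))
        if total - child_cov >= threshold:
            report[str(ipaddress.IPv4Network((net, plen)))] = total
            covered[(net, plen)] = total
        else:
            covered[(net, plen)] = child_cov
    return report


if __name__ == "__main__":
    sample = [("10.0.0.1", 60), ("10.0.0.2", 50), ("192.168.1.1", 5)]
    print(compressed_clusters(sample, 50))
    print(compressed_clusters(sample, 100))
```

With the sample input and a threshold of 50 bytes, the two individual hosts are reported and all of their ancestors are suppressed; with a threshold of 100, neither host is heavy on its own, so only the smallest prefix containing both (`10.0.0.0/30`) is reported. This brute-force version touches 33 prefixes per flow record and stores a counter for each, which is the kind of cost that computationally efficient and memory-efficient approaches aim to reduce.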

Original language: English (US)
Article number: 1705623
Pages (from-to): 1929-1941
Number of pages: 13
Journal: IEEE Journal on Selected Areas in Communications
Volume: 24
Issue number: 10
State: Published - Oct 2006

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Electrical and Electronic Engineering
