Detecting clusters of anomalies on low-dimensional feature subsets with application to network traffic flow data

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

In a variety of applications, one desires to detect groups of anomalous data samples, with a group potentially manifesting its atypicality (relative to a reference model) on a low-dimensional subset of the full measured set of features. Samples may only be weakly atypical individually, whereas they may be strongly atypical when considered jointly. What makes this group anomaly detection problem quite challenging is that it is a priori unknown which subset of features jointly manifests a particular group of anomalies. Moreover, it is unknown how many anomalous groups are present in a given data batch. In this work, we develop a group anomaly detection (GAD) scheme to identify subsets of samples and subsets of features that jointly specify anomalous clusters. We apply our approach to network intrusion detection to detect botnet and peer-to-peer flow clusters. Unlike previous studies, our approach captures and exploits statistical dependencies that may exist between the measured features. Experiments on real-world network traffic data demonstrate the advantage of our proposed system, and highlight the importance of exploiting feature dependency structure, compared to the feature (or test) independence assumption made in previous studies.
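
The point that samples can be weakly atypical individually yet strongly atypical jointly can be illustrated with a small sketch. The code below is not the GAD scheme described in the abstract. It uses Fisher's method, a standard way to combine p-values, and it assumes the per-sample tests are independent under the reference model. The function name `fisher_combined_pvalue` and the ten p-values of 0.1 are hypothetical values chosen only for illustration.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2n degrees of freedom. For even degrees of freedom
    the survival function has the closed form
        P(X >= x) = exp(-x/2) * sum_{i=0}^{n-1} (x/2)^i / i!
    so no external statistics library is needed.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    n = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0   # running value of (x/2)^i / i!, starting at i = 0
    total = 1.0
    for i in range(1, n):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical group: ten samples, each only weakly atypical (p = 0.1).
group = [0.1] * 10
joint_p = fisher_combined_pvalue(group)
print(f"smallest individual p-value: {min(group):.3f}")
print(f"joint p-value for the group: {joint_p:.2e}")
```

No single sample in this hypothetical group would be flagged at a 0.05 level, yet the combined p-value falls below 0.01. Note that the independence assumption built into this sketch is exactly the kind of simplification the abstract argues against for features that are statistically dependent.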

Original language: English (US)
Title of host publication: 2015 IEEE International Workshop on Machine Learning for Signal Processing - Proceedings of MLSP 2015
Editors: Deniz Erdogmus, Serdar Kozat, Jan Larsen, Murat Akcakaya
Publisher: IEEE Computer Society
ISBN (Electronic): 9781467374545
State: Published - Nov 10 2015
Event: 25th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2015 - Boston, United States
Duration: Sep 17 2015 → Sep 20 2015

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2015-November
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Other

Other: 25th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2015
Country/Territory: United States
City: Boston
Period: 9/17/15 → 9/20/15

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing
