TY - GEN
T1 - Detecting clusters of anomalies on low-dimensional feature subsets with application to network traffic flow data
AU - Qiu, Zhicong
AU - Miller, David J.
AU - Kesidis, George
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/10
Y1 - 2015/11/10
AB - In a variety of applications, one desires to detect groups of anomalous data samples, with a group potentially manifesting its atypicality (relative to a reference model) on a low-dimensional subset of the full measured set of features. Samples may only be weakly atypical individually, whereas they may be strongly atypical when considered jointly. What makes this group anomaly detection problem quite challenging is that it is a priori unknown which subset of features jointly manifests a particular group of anomalies. Moreover, it is unknown how many anomalous groups are present in a given data batch. In this work, we develop a group anomaly detection (GAD) scheme to identify subsets of samples and subsets of features that jointly specify anomalous clusters. We apply our approach to network intrusion detection to detect botnet and peer-to-peer flow clusters. Unlike previous studies, our approach captures and exploits statistical dependencies that may exist between the measured features. Experiments on real world network traffic data demonstrate the advantage of our proposed system, and highlight the importance of exploiting feature dependency structure, compared to the feature (or test) independence assumption made in previous studies.
UR - http://www.scopus.com/inward/record.url?scp=84960864323&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84960864323&partnerID=8YFLogxK
U2 - 10.1109/MLSP.2015.7324326
DO - 10.1109/MLSP.2015.7324326
M3 - Conference contribution
AN - SCOPUS:84960864323
T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP
BT - 2015 IEEE International Workshop on Machine Learning for Signal Processing - Proceedings of MLSP 2015
A2 - Erdogmus, Deniz
A2 - Kozat, Serdar
A2 - Larsen, Jan
A2 - Akcakaya, Murat
PB - IEEE Computer Society
T2 - 25th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2015
Y2 - 17 September 2015 through 20 September 2015
ER -