TY - GEN
T1 - Unsupervised parsimonious cluster-based anomaly detection (PCAD)
AU - Miller, David J.
AU - Kesidis, George
AU - Qiu, Zhicong
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/10/31
Y1 - 2018/10/31
N2 - Group anomaly detection (AD), i.e., detection of clusters of anomalous samples in a test batch, with the samples in a given such cluster exhibiting a common pattern of atypicality (relative to a null model), has important applications to discovering unknown classes present in a test data batch and, equivalently, to zero-day threat detection in a security context. When the feature space is large, clusters may manifest anomalies on very small feature subsets, which is well captured by the parsimonious mixture modelling (PMM) framework. Thus, we propose a generalized likelihood ratio test (GLRT-like) group AD framework, with PMMs used for both the null and the alternative hypothesis (that an anomalous cluster is present), and with the Bayesian Information Criterion (BIC) used to adjudicate between these hypotheses. We demonstrate our approach on network traffic data sets, detecting Zeus (web) bots and peer-to-peer traffic as zero-day activities. Our PCAD achieves substantially better detection results than a previous group AD method applied to this domain.
AB - Group anomaly detection (AD), i.e., detection of clusters of anomalous samples in a test batch, with the samples in a given such cluster exhibiting a common pattern of atypicality (relative to a null model), has important applications to discovering unknown classes present in a test data batch and, equivalently, to zero-day threat detection in a security context. When the feature space is large, clusters may manifest anomalies on very small feature subsets, which is well captured by the parsimonious mixture modelling (PMM) framework. Thus, we propose a generalized likelihood ratio test (GLRT-like) group AD framework, with PMMs used for both the null and the alternative hypothesis (that an anomalous cluster is present), and with the Bayesian Information Criterion (BIC) used to adjudicate between these hypotheses. We demonstrate our approach on network traffic data sets, detecting Zeus (web) bots and peer-to-peer traffic as zero-day activities. Our PCAD achieves substantially better detection results than a previous group AD method applied to this domain.
UR - http://www.scopus.com/inward/record.url?scp=85057051473&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057051473&partnerID=8YFLogxK
U2 - 10.1109/MLSP.2018.8517014
DO - 10.1109/MLSP.2018.8517014
M3 - Conference contribution
AN - SCOPUS:85057051473
T3 - IEEE International Workshop on Machine Learning for Signal Processing, MLSP
BT - 2018 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018 - Proceedings
A2 - Pustelnik, Nelly
A2 - Tan, Zheng-Hua
A2 - Ma, Zhanyu
A2 - Larsen, Jan
PB - IEEE Computer Society
T2 - 28th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2018
Y2 - 17 September 2018 through 20 September 2018
ER -