A significant threat to the wide deployment of machine-learning-based classifiers is adversarial learning attacks, especially at test time. There has recently been significant progress in defending against such attacks. Several of these works seek to robustify the classifier so that it makes "correct" decisions on perturbed patterns. We argue that it is often operationally more important to detect the attack than to "correctly classify" in the face of it (classification can proceed if no attack is detected). We hypothesize that, even if human-imperceptible, adversarial perturbations are machine-detectable. We propose a purely unsupervised anomaly detector (AD), based on suitable (null hypothesis) density models for the different layers of a deep neural network (DNN) and a novel decision statistic built on the Kullback-Leibler divergence. This paper addresses: 1) when is it appropriate to aim to "correctly classify" a perturbed pattern? 2) what is a good AD detection statistic, one that exploits all likely sources of anomalousness associated with a test-time attack? 3) where in a DNN (an early layer, a middle layer, or the penultimate layer) will the most anomalous signature manifest? Tested on the MNIST and CIFAR-10 image databases under three prominent attack strategies, our approach outperforms previous detection methods, achieving strong ROC AUC detection accuracy on two attacks and substantially better accuracy than previously reported on the third (strongest) attack.
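The abstract leaves the detector's details to the body of the paper, but the core idea — fit null-hypothesis densities to a layer's clean activations and flag test patterns via a KL-divergence statistic — can be sketched as follows. Everything here is an illustrative assumption, not the paper's actual method: the diagonal-Gaussian class densities, the comparison of the density-based class posterior against the classifier's softmax, and all names and synthetic data are chosen only to convey the flavor of such a detector.

```python
# Hypothetical sketch of a KL-based anomaly statistic on one DNN layer.
# Assumptions (not from the paper): diagonal-Gaussian per-class null densities,
# and KL(density posterior || classifier softmax) as the decision statistic.
import numpy as np

rng = np.random.default_rng(0)
DIM, N_PER_CLASS = 4, 200

def fit_class_gaussians(feats, labels, n_classes):
    # Fit a diagonal-Gaussian null density per class on clean layer features.
    models = []
    for c in range(n_classes):
        x = feats[labels == c]
        models.append((x.mean(axis=0), x.var(axis=0) + 1e-6))
    return models

def log_density(x, model):
    # Log-likelihood of x under one diagonal Gaussian (mu, var).
    mu, var = model
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mu) ** 2 / var)

def kl_statistic(x, models, softmax_probs):
    # KL(density-based posterior || classifier softmax): large => anomalous.
    logp = np.array([log_density(x, m) for m in models])
    logp -= logp.max()                       # stabilize before exponentiation
    p = np.exp(logp)
    p /= p.sum()                             # posterior under the null densities
    q = np.asarray(softmax_probs, dtype=float)
    return float(np.sum(p * np.log((p + 1e-12) / (q + 1e-12))))

# Synthetic "layer activations" for two well-separated classes.
feats = np.vstack([rng.normal(0.0, 1.0, (N_PER_CLASS, DIM)),
                   rng.normal(3.0, 1.0, (N_PER_CLASS, DIM))])
labels = np.repeat([0, 1], N_PER_CLASS)
models = fit_class_gaussians(feats, labels, n_classes=2)

x = np.zeros(DIM)                                  # a clean class-0 pattern
s_clean = kl_statistic(x, models, [0.95, 0.05])    # softmax agrees with density
s_attack = kl_statistic(x, models, [0.05, 0.95])   # softmax flipped by an attack
print(s_clean < s_attack)  # the attacked pattern yields the larger statistic
```

An adversarially perturbed input that flips the classifier's softmax while remaining close to its original class in feature space produces a large divergence between the two distributions, which is the kind of mismatch a detector of this type thresholds on.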