A Benchmark Study of Backdoor Data Poisoning Defenses for Deep Neural Network Classifiers and A Novel Defense

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

32 Scopus citations

Abstract

While data poisoning attacks on classifiers were originally proposed to degrade a classifier's usability, there has been strong recent interest in backdoor data poisoning attacks, where the classifier learns to classify to a target class whenever a backdoor pattern (e.g., a watermark or innocuous pattern) is added to an example from some class other than the target class. In this paper, we conduct a benchmark experimental study to assess the effectiveness of backdoor attacks against deep neural network (DNN) classifiers for images (CIFAR-10 domain), as well as of anomaly detection defenses against these attacks, assuming the defender has access to the (poisoned) training set. We also propose a novel defense scheme, cluster impurity (CI), based on two ideas: i) backdoor patterns may cluster in the latent space of one of a DNN's deep layers (e.g., the penultimate layer); ii) image filtering (or additive noise) may remove the backdoor patterns, and thus alter the class decision produced by the DNN. We demonstrate that largely imperceptible single-pixel backdoor attacks are highly successful, with no effect on classifier usability. However, the CI approach is highly effective at detecting these attacks, and more successful than previous backdoor detection methods.
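The two ideas behind CI can be illustrated with a minimal sketch. It is a simplified proxy and not the paper's exact statistic: it assumes that cluster assignments in the latent space and the classifier's decisions before and after image filtering are already available, and it scores each cluster by the fraction of its samples whose decision changes under filtering. The function names `cluster_impurity` and `flag_clusters`, and the threshold of 0.5, are hypothetical choices made for illustration.

```python
from collections import defaultdict


def cluster_impurity(cluster_ids, preds_before, preds_after):
    """Return, per cluster, the fraction of samples whose predicted class
    changes after filtering (a simplified stand-in for cluster impurity)."""
    totals = defaultdict(int)
    flipped = defaultdict(int)
    for cluster, before, after in zip(cluster_ids, preds_before, preds_after):
        totals[cluster] += 1
        if before != after:
            flipped[cluster] += 1
    return {cluster: flipped[cluster] / totals[cluster] for cluster in totals}


def flag_clusters(impurity, threshold=0.5):
    """Return the sorted ids of clusters whose impurity exceeds the
    (assumed, hypothetical) threshold."""
    return sorted(c for c, score in impurity.items() if score > threshold)


# Toy data: eight training samples all labeled as class 3, split into two
# latent-space clusters. Filtering leaves cluster 0 unchanged but moves most
# of cluster 1 to class 7, as expected if cluster 1 holds backdoored images
# whose pattern was removed by the filter.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
preds_before = [3, 3, 3, 3, 3, 3, 3, 3]
preds_after = [3, 3, 3, 3, 7, 7, 3, 7]

impurity = cluster_impurity(cluster_ids, preds_before, preds_after)
suspicious = flag_clusters(impurity)
print(impurity)
print(suspicious)
```

In this toy case only cluster 1 is flagged. In practice the cluster assignments would come from a clustering model fit to deep-layer features, and the post-filtering decisions from re-running the trained DNN on filtered images; both steps are outside this sketch.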

Original language: English (US)
Title of host publication: 2019 IEEE 29th International Workshop on Machine Learning for Signal Processing, MLSP 2019
Publisher: IEEE Computer Society
ISBN (Electronic): 9781728108247
State: Published - Oct 2019
Event: 29th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2019 - Pittsburgh, United States
Duration: Oct 13 2019 → Oct 16 2019

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2019-October
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 29th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2019
Country/Territory: United States
City: Pittsburgh
Period: 10/13/19 → 10/16/19

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing

Fingerprint

Dive into the research topics of 'A Benchmark Study of Backdoor Data Poisoning Defenses for Deep Neural Network Classifiers and A Novel Defense'. Together they form a unique fingerprint.

Cite this