A Review of Adversarial Attack and Defense for Classification Methods

Yao Li, Minhao Cheng, Cho Jui Hsieh, Thomas C.M. Lee

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

Despite the efficiency and scalability of machine learning systems, recent studies have demonstrated that many classification methods, especially Deep Neural Networks (DNNs), are vulnerable to adversarial examples; that is, examples that are carefully crafted to fool a well-trained classification model while being indistinguishable from natural data to humans. This makes it potentially unsafe to apply DNNs or related methods in security-critical areas. Since this issue was first identified by Biggio et al. and Szegedy et al., much work has been done in this field, including the development of attack methods to generate adversarial examples and the construction of defense techniques to guard against such examples. This article aims to introduce this topic and its latest developments to the statistical community, primarily focusing on the generation of and defense against adversarial examples. Computing code (in Python and R) used in the numerical experiments is publicly available for readers to explore the surveyed methods. It is the hope of the authors that this article will encourage more statisticians to work in this important and exciting field of generating and defending against adversarial examples.

Original language: English (US)
Pages (from-to): 329-345
Number of pages: 17
Journal: American Statistician
Volume: 76
Issue number: 4
State: Published - 2022

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • General Mathematics
  • Statistics, Probability and Uncertainty
