Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks

Jinyuan Jia, Yupei Liu, Xiaoyu Cao, Neil Zhenqiang Gong

Research output: Chapter in Book/Report/Conference proceedingConference contribution

12 Scopus citations

Abstract

Data poisoning attacks and backdoor attacks aim to corrupt a machine learning classifier via modifying, adding, and/or removing some carefully selected training examples, such that the corrupted classifier makes incorrect predictions as the attacker desires. The key idea of state-of-the-art certified defenses against data poisoning attacks and backdoor attacks is to create a majority vote mechanism to predict the label of a testing example. Moreover, each voter is a base classifier trained on a subset of the training dataset. Classical simple learning algorithms such as k nearest neighbors (kNN) and radius nearest neighbors (rNN) have intrinsic majority vote mechanisms. In this work, we show that the intrinsic majority vote mechanisms in kNN and rNN already provide certified robustness guarantees against data poisoning attacks and backdoor attacks. Moreover, our evaluation results on MNIST and CIFAR10 show that the intrinsic certified robustness guarantees of kNN and rNN outperform those provided by state-of-the-art certified defenses. Our results serve as standard baselines for future certified defenses against data poisoning attacks and backdoor attacks.

Original languageEnglish (US)
Title of host publicationAAAI-22 Technical Tracks 9
PublisherAssociation for the Advancement of Artificial Intelligence
Pages9575-9583
Number of pages9
ISBN (Electronic)1577358767, 9781577358763
StatePublished - Jun 30 2022
Event36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online
Duration: Feb 22 2022Mar 1 2022

Publication series

NameProceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume36

Conference

Conference36th AAAI Conference on Artificial Intelligence, AAAI 2022
CityVirtual, Online
Period2/22/223/1/22

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Cite this