Deep Neural Network Piracy without Accuracy Loss

Aritra Ray, Jinyuan Jia, Sohini Saha, Jayeeta Chaudhuri, Neil Zhenqiang Gong, Krishnendu Chakrabarty

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A deep neural network (DNN) classifier is often viewed as the intellectual property of a model owner due to the huge resources required to train it. To protect this intellectual property, the model owner can embed a watermark into the DNN classifier (called the target classifier) such that it outputs pre-determined labels (called trigger labels) for pre-determined inputs (called trigger inputs). Given black-box access to a suspect classifier, the model owner can verify whether the suspect classifier is a pirated version of its classifier by first querying the suspect classifier with the trigger inputs and then checking whether the predicted labels match the trigger labels. Many studies have shown that an attacker can pirate the target classifier (producing a pirated classifier) by retraining or fine-tuning the target classifier to remove its watermark. However, these methods sacrifice the accuracy of the pirated classifier, which is undesirable for critical applications such as finance and healthcare. In our work, we propose a new attack that preserves the accuracy of the pirated classifier on in-distribution testing inputs while evading detection by the model owner. Our idea is that an attacker can detect the trigger inputs in the inference stage of the pirated classifier. In particular, given a testing input, we let the pirated classifier return a random label if the input is detected as a trigger input. Otherwise, the pirated classifier predicts the same label as the target classifier. We evaluate our attack on benchmark datasets and find that it can effectively identify the trigger inputs. Our attack reveals that the intellectual property of a model owner can be violated under existing watermarking techniques, highlighting the need for new techniques.
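The inference-time evasion described in the abstract can be sketched as a thin wrapper around the pirated classifier. This is a minimal illustration only, not the paper's implementation: `classifier`, `is_trigger_input`, and `num_classes` are hypothetical placeholders, and the trigger-detection method itself is left abstract.

```python
import random

def pirated_predict(x, classifier, is_trigger_input, num_classes, rng=random):
    """Predict a label for input x while evading watermark verification.

    Hypothetical sketch: `classifier` maps an input to a label, and
    `is_trigger_input` is an attacker-supplied detector for the owner's
    watermark trigger inputs (its construction is the core of the attack
    and is not shown here).
    """
    if is_trigger_input(x):
        # A detected trigger input gets a random label, so the owner's
        # check that predictions match the trigger labels fails.
        return rng.randrange(num_classes)
    # In-distribution inputs keep the target classifier's prediction,
    # so accuracy on normal inputs is unchanged.
    return classifier(x)
```

On an in-distribution input the wrapper is a pass-through, which is why this sketch loses no accuracy; the owner's verification only ever sees random labels on its trigger set.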

Original language: English (US)
Title of host publication: Proceedings - 21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022
Editors: M. Arif Wani, Mehmed Kantardzic, Vasile Palade, Daniel Neagu, Longzhi Yang, Kit-Yan Chan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1032-1038
Number of pages: 7
ISBN (Electronic): 9781665462839
DOIs
State: Published - 2022
Event: 21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022 - Nassau, Bahamas
Duration: Dec 12, 2022 – Dec 14, 2022

Publication series

Name: Proceedings - 21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022

Conference

Conference: 21st IEEE International Conference on Machine Learning and Applications, ICMLA 2022
Country/Territory: Bahamas
City: Nassau
Period: 12/12/22 – 12/14/22

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Computer Science Applications
  • Artificial Intelligence
  • Hardware and Architecture

