TY - GEN
T1 - Watermarking-based Defense against Adversarial Attacks on Deep Neural Networks
AU - Li, Xiaoting
AU - Chen, Lingwei
AU - Zhang, Jinquan
AU - Larus, James
AU - Wu, Dinghao
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/7/18
Y1 - 2021/7/18
AB - The vulnerability of deep neural networks to adversarial attacks poses significant threats to real-world applications, especially security-critical ones. Given a well-trained model, slight modifications to the input samples can cause drastic changes in the model's predictions. Many methods have been proposed to mitigate this issue, yet most of these defenses have proven unable to resist all adversarial attacks. This is mainly because an attacker's knowledge advantage can be used either to obtain information about the target model directly or to build a surrogate model as a substitute, and thereby to construct effective adversarial examples. In this paper, we propose a new defense mechanism that creates a knowledge gap between attackers and defenders by embedding a designed watermarking system into standard deep neural networks. The embedded watermark is data-independent and non-reproducible by an attacker, which improves the randomization and security of the defended model without compromising its performance on clean data, and thus places the attacker at a knowledge disadvantage that prevents crafting effective adversarial examples against the defended model. We evaluate our watermarking defense using a wide range of watermarking algorithms against four state-of-the-art attacks on different datasets, and the experimental results validate its effectiveness.
UR - http://www.scopus.com/inward/record.url?scp=85116417468&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116417468&partnerID=8YFLogxK
DO - 10.1109/IJCNN52387.2021.9534236
M3 - Conference contribution
AN - SCOPUS:85116417468
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 International Joint Conference on Neural Networks, IJCNN 2021
Y2 - 18 July 2021 through 22 July 2021
ER -