Watch the Watchers! On the Security Risks of Robustness-Enhancing Diffusion Models

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Thanks to their remarkable denoising capabilities, diffusion models are increasingly being employed as defensive tools to reinforce the robustness of other models, notably in purifying adversarial examples and certifying adversarial robustness. However, the potential risks of these practices remain largely unexplored, which is highly concerning. To bridge this gap, this work investigates the vulnerability of robustness-enhancing diffusion models. Specifically, we demonstrate that these models are highly susceptible to DIFF2, a simple yet effective attack, which substantially diminishes their robustness assurance. Essentially, DIFF2 integrates a malicious diffusion-sampling process into the diffusion model, guiding inputs embedded with specific triggers toward an adversary-defined distribution while preserving the normal functionality for clean inputs. Our case studies on adversarial purification and robustness certification show that DIFF2 can significantly reduce both post-purification and certified accuracy across benchmark datasets and models, highlighting the potential risks of relying on pre-trained diffusion models as defensive tools. We further explore possible countermeasures, suggesting promising avenues for future research.

Original language: English (US)
Title of host publication: Proceedings of the 34th USENIX Security Symposium
Publisher: USENIX Association
Pages: 997-1016
Number of pages: 20
ISBN (Electronic): 9781939133526
State: Published - 2025
Event: 34th USENIX Security Symposium, USENIX Security 2025 - Seattle, United States
Duration: Aug 13 2025 → Aug 15 2025

Publication series

Name: Proceedings of the 34th USENIX Security Symposium

Conference

Conference: 34th USENIX Security Symposium, USENIX Security 2025
Country/Territory: United States
City: Seattle
Period: 8/13/25 → 8/15/25

All Science Journal Classification (ASJC) codes

  • Safety, Risk, Reliability and Quality
  • Computer Networks and Communications
  • Information Systems
