Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage

  • Md Rafi Ur Rashid
  • , Jing Liu
  • , Toshiaki Koike-Akino
  • , Ye Wang
  • , Shagufta Mehnaz

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Fine-tuning large language models on private data for downstream applications poses significant privacy risks in potentially exposing sensitive information. Several popular community platforms now offer convenient distribution of a large variety of pre-trained models, allowing anyone to publish without rigorous verification. This scenario creates a privacy threat, as pre-trained models can be intentionally crafted to compromise the privacy of fine-tuning datasets. In this study, we introduce a novel poisoning technique that uses model-unlearning as an attack tool. This approach manipulates a pre-trained language model to increase the leakage of private data during the fine-tuning process. Our method enhances both membership inference and data extraction attacks while preserving model utility. Experimental results across different models, datasets, and fine-tuning setups demonstrate that our attacks significantly surpass baseline performance. This work serves as a cautionary note for users who download pre-trained models from unverified sources, highlighting the potential risks involved.

Original languageEnglish (US)
Title of host publicationSpecial Track on AI Alignment
EditorsToby Walsh, Julie Shah, Zico Kolter
PublisherAssociation for the Advancement of Artificial Intelligence
Pages20139-20147
Number of pages9
Edition19
ISBN (Electronic)157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 157735897X, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978, 9781577358978
DOIs
StatePublished - Apr 11 2025
Event39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025 - Philadelphia, United States
Duration: Feb 25 2025Mar 4 2025

Publication series

NameProceedings of the AAAI Conference on Artificial Intelligence
Number19
Volume39
ISSN (Print)2159-5399
ISSN (Electronic)2374-3468

Conference

Conference39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
Country/TerritoryUnited States
CityPhiladelphia
Period2/25/253/4/25

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage'. Together they form a unique fingerprint.

Cite this