Fighting Fire with Fire: The Dual Role of LLMs in Crafting and Detecting Elusive Disinformation

Jason Lucas, Adaku Uchendu, Michiharu Yamashita, Jooyoung Lee, Shaurya Rohatgi, Dongwon Lee

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Recent ubiquity and disruptive impacts of large language models (LLMs) have raised concerns about their potential to be misused (i.e., generating large-scale harmful and misleading content). To combat this emerging risk of LLMs, we propose a novel “Fighting Fire with Fire” (F3) strategy that harnesses modern LLMs' generative and emergent reasoning capabilities to counter human-written and LLM-generated disinformation. First, we leverage GPT-3.5-turbo to synthesize authentic and deceptive LLM-generated content through paraphrase-based and perturbation-based prefix-style prompts, respectively. Second, we apply zero-shot in-context semantic reasoning techniques with cloze-style prompts to discern genuine from deceptive posts & news articles. In our extensive experiments, we observe GPT-3.5-turbo's zero-shot superiority for both in-distribution and out-of-distribution datasets, where GPT-3.5-turbo consistently achieved accuracy at 68-72%, unlike the decline observed in previous customized and fine-tuned disinformation detectors. Our codebase and dataset are available at https://github.com/mickeymst/F3.

Original languageEnglish (US)
Title of host publicationEMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings
EditorsHouda Bouamor, Juan Pino, Kalika Bali
PublisherAssociation for Computational Linguistics (ACL)
Pages14279-14305
Number of pages27
ISBN (Electronic)9798891760608
StatePublished - 2023
Event2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 - Hybrid, Singapore, Singapore
Duration: Dec 6 2023Dec 10 2023

Publication series

NameEMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023
Country/TerritorySingapore
CityHybrid, Singapore
Period12/6/2312/10/23

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
  • Linguistics and Language

Cite this