TEXTSHIELD: Robust text classification based on multimodal embedding and neural machine translation

Jinfeng Li, Tianyu Du, Shouling Ji, Rong Zhang, Quan Lu, Min Yang, Ting Wang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

29 Scopus citations


Text-based toxic content detection is an important tool for reducing harmful interactions in online social media environments. Yet, its underlying mechanism, deep learning-based text classification (DLTC), is inherently vulnerable to maliciously crafted adversarial texts. To mitigate such vulnerabilities, intensive research has been conducted on strengthening English-based DLTC models. However, the existing defenses are not effective for Chinese-based DLTC models, due to the unique sparseness, diversity, and variation of the Chinese language. In this paper, we bridge this striking gap by presenting TEXTSHIELD, a new adversarial defense framework specifically designed for Chinese-based DLTC models. TEXTSHIELD differs from previous work in several key aspects: (i) generic - it applies to any Chinese-based DLTC model without requiring re-training; (ii) robust - it significantly reduces the attack success rate even under the setting of adaptive attacks; and (iii) accurate - it has little impact on the performance of DLTC models over legitimate inputs. Extensive evaluations show that it outperforms both existing methods and the industry-leading platforms. Future work will explore its applicability in broader practical tasks.

Original language: English (US)
Title of host publication: Proceedings of the 29th USENIX Security Symposium
Publisher: USENIX Association
Number of pages: 18
ISBN (Electronic): 9781939133175
State: Published - 2020
Event: 29th USENIX Security Symposium - Virtual, Online
Duration: Aug 12 2020 to Aug 14 2020

Publication series

Name: Proceedings of the 29th USENIX Security Symposium


Conference: 29th USENIX Security Symposium
City: Virtual, Online

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

