The unappreciated role of intent in algorithmic moderation of abusive content on social media

  • Xinyu Wang
  • Sai Koneru
  • Pranav Narayanan Venkit
  • Brett Frischmann
  • Sarah Rajtmajer

Research output: Contribution to journal, Article, peer-review

Abstract

A significant body of research is dedicated to developing language models that can detect various types of online abuse, such as hate speech and cyberbullying. However, there is a disconnect between platform policies, which often treat the author's intent as a criterion for content moderation, and current detection models, which typically make no attempt to capture intent. This paper examines the role of intent in the moderation of abusive content. Specifically, we review state-of-the-art detection models and benchmark training datasets to assess their ability to capture intent. We propose changes to the design and development of automated detection and moderation systems to improve their alignment with ethical and policy conceptualizations of these abuses.

Original language: English (US)
Journal: Harvard Kennedy School Misinformation Review
Volume: 6
Issue number: 3
State: Published - 2025

All Science Journal Classification (ASJC) codes

  • Social Sciences (miscellaneous)
