Uncovering Human Traits in Determining Real and Spoofed Audio: Insights from Blind and Sighted Individuals

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


This paper explores how blind and sighted individuals perceive real and spoofed audio, highlighting differences and similarities between the groups. Through two studies, we find that both groups focus on specific human traits in audio (such as accents, vocal inflections, breathing patterns, and emotions) to assess audio authenticity. We further reveal that humans, irrespective of visual ability, can still outperform current state-of-the-art machine learning models in discerning audio authenticity; however, the task proves psychologically demanding. Moreover, detection accuracy scores between blind and sighted individuals are comparable, but each group exhibits unique strengths: the sighted group excels at detecting deepfake-generated audio, while the blind group excels at detecting text-to-speech (TTS) generated audio. These findings not only deepen our understanding of machine-manipulated and neural-renderer audio but also have implications for developing countermeasures, such as perceptible watermarks and human-AI collaboration strategies for spoofing detection.

Original language: English (US)
Title of host publication: CHI 2024 - Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems
Publisher: Association for Computing Machinery
ISBN (Electronic): 9798400703300
State: Published - May 11 2024
Event: 2024 CHI Conference on Human Factors in Computing Systems, CHI 2024 - Hybrid, Honolulu, United States
Duration: May 11 2024 → May 16 2024

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings


Conference: 2024 CHI Conference on Human Factors in Computing Systems, CHI 2024
Country/Territory: United States
City: Hybrid, Honolulu

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
  • Software
