Learning Emotion Representations from Verbal and Nonverbal Communication

Sitao Zhang, Yimu Pan, James Z. Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

16 Scopus citations

Abstract

Emotion understanding is an essential but highly challenging component of artificial general intelligence. The absence of extensive annotated datasets has significantly impeded advancements in this field. We present Emotion-CLIp, the first pre-training paradigm to extract visual emotion representations from verbal and nonverbal communication using only uncurated data. Compared to numerical labels or descriptions used in previous methods, communication naturally contains emotion information. Furthermore, acquiring emotion representations from communication is more congruent with the human learning process. We guide EmotionCLIP to attend to nonverbal emotion cues through subject-aware context encoding and verbal emotion cues using sentiment-guided contrastive learning. Extensive experiments validates the effectiveness and transferability of Emotion Clip. Using merely linear-probe evaluation protocol, EmotionCLIP outperforms the state-of-the-art supervised visual emotion recognition methods and rivals many multimodal approaches across various benchmarks. We anticipate that the advent of Emotion Clip will address the prevailing issue of data scarcity in emotion understanding, thereby fostering progress in related domains. The code and pretrained models are available at https://github.com/Xeaver/EmotionCLIP.

Original languageEnglish (US)
Title of host publicationProceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
PublisherIEEE Computer Society
Pages18993-19004
Number of pages12
ISBN (Electronic)9798350301298
DOIs
StatePublished - 2023
Event2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023 - Vancouver, Canada
Duration: Jun 18 2023Jun 22 2023

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume2023-June
ISSN (Print)1063-6919

Conference

Conference2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023
Country/TerritoryCanada
CityVancouver
Period6/18/236/22/23

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Vision and Pattern Recognition

Fingerprint

Dive into the research topics of 'Learning Emotion Representations from Verbal and Nonverbal Communication'. Together they form a unique fingerprint.

Cite this