Deriving Entity-Specific Embeddings From Multi-Entity Sequences

Connor Heaton, Prasenjit Mitra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Underpinning much of the recent progress in deep learning is the transformer architecture, which takes as input a sequence of embeddings E and emits an updated sequence of embeddings E′. A special [CLS] embedding is often included in this sequence, serving as a description of the sequence once processed and used as the basis for subsequent sequence-level tasks. The processed [CLS] embedding loses utility, however, when the model is presented with a multi-entity sequence and asked to perform an entity-specific task. When processing a multi-speaker dialogue, for example, the [CLS] embedding describes the entire dialogue, not any individual utterance or speaker. Existing methods for entity-specific prediction involve redundant computation or post-processing outside of the transformer. We present a novel methodology for deriving entity-specific embeddings from a multi-entity sequence completely within the transformer, with a loose definition of entity amenable to many problem spaces. To show the generic applicability of our method, we apply it to two widely different tasks, emotion recognition in conversation and player performance projection in baseball, and show that it can be used to achieve SOTA in both. Code can be found at https://github.com/c-heat16/EntitySpecificEmbeddings.

Original language: English (US)
Title of host publication: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings
Editors: Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
Publisher: European Language Resources Association (ELRA)
Pages: 4675-4684
Number of pages: 10
ISBN (Electronic): 9782493814104
State: Published - 2024
Event: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024 - Hybrid, Torino, Italy
Duration: May 20 2024 to May 25 2024

Publication series

Name: 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings

Conference

Conference: Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024
Country/Territory: Italy
City: Hybrid, Torino
Period: 5/20/24 to 5/25/24

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Computer Science Applications
