Abstract
Core to much of modern deep learning is the notion of representation learning: learning representations of inputs that are useful for performing tasks related to those inputs. Encoder-only language models, for example, learn representations of language useful for performing language-related tasks, often classification. While fruitful in many applications, this paradigm carries the inherent assumption that only one classification is to be made for a given input. That assumption poses challenges when multiple classifications must be made about different portions of a single record, as in emotion recognition in conversation (ERC), where the objective is to classify the emotion of each utterance in a dialog. Existing methods for this task typically involve redundant computation, non-trivial post-processing outside the core language-model backbone, or both. To address this, we generalize recent work on deriving player-specific embeddings from multi-player sequences of events in sports, making it domain-agnostic while also enabling it to leverage inter-entity relationships. Given the method's efficacy on regression and classification tasks, we explore how it can be used to cluster player representations, proposing a novel approach for distribution-aware deep clustering in the absence of labels. We demonstrate that the proposed methods yield state-of-the-art performance on the disparate tasks of ERC in Natural Language Processing (NLP), long-tail partial-label learning (LT-PLL) in Computer Vision (CV), and player form clustering in sports analytics.
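To make the redundancy problem concrete, the sketch below shows a generic baseline pattern (not the paper's proposed method): a dialog is encoded once by a shared encoder, and a separate embedding is pooled for each utterance's token span, so each utterance can be classified without re-encoding the dialog once per utterance. The model name, label count, and pooling scheme are illustrative assumptions.

```python
# Minimal sketch of per-utterance classification from a single encoder pass.
# Assumptions: bert-base-uncased as the backbone, mean pooling over each
# utterance's token span, and a 7-way emotion label set -- all hypothetical,
# not the architecture proposed in the paper.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
num_emotions = 7  # hypothetical label-set size
classifier = torch.nn.Linear(encoder.config.hidden_size, num_emotions)

dialog = ["I got the job!", "That's wonderful news.", "I can't believe it."]

# Encode the entire dialog once instead of once per utterance.
enc = tokenizer(" ".join(dialog), return_tensors="pt")
hidden = encoder(**enc).last_hidden_state[0]  # (seq_len, hidden_size)

# Track each utterance's token span; this relies on whitespace-joined
# tokenization aligning with per-utterance tokenization, which holds for
# BERT-style WordPiece but should be replaced by offset mapping in practice.
spans, cursor = [], 1  # position 0 is [CLS]
for utt in dialog:
    n = len(tokenizer(utt, add_special_tokens=False)["input_ids"])
    spans.append((cursor, cursor + n))
    cursor += n

# One embedding per utterance via mean pooling over its span, then one
# classification per utterance from the shared representation.
utt_embs = torch.stack([hidden[s:e].mean(dim=0) for s, e in spans])
logits = classifier(utt_embs)  # (num_utterances, num_emotions)
```

Because the encoder runs once over the full dialog, each utterance's representation is also conditioned on its conversational context, which is the property the per-input classification assumption forfeits.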
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 57492-57503 |
| Number of pages | 12 |
| Journal | IEEE Access |
| Volume | 12 |
| DOIs | |
| State | Published - 2024 |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering