Abstract
This paper aims to establish relationships between conversational markers and health outcomes using data from cardiopulmonary rehabilitation sessions. Specifically, we used speech and text data from conversations between patients and researchers to assess exercise compliance and psychological wellbeing. We trained a Multimodal Transformer (MMT) on speech, transcripts, and ground-truth labels. We further evaluated the MMT's predictive performance using session summaries generated by three Large Language Models (LLMs), which focused on dialogue characteristics (e.g., sentiment, thematic content, and future planning). Our findings establish the feasibility of augmenting speech and language processing of clinical sessions to improve decision-making and health outcomes.
Original language | English (US) |
---|---|
Pages (from-to) | 3155-3159 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
State | Published - 2024 |
Event | 25th Interspeech Conference 2024 - Kos Island, Greece. Duration: Sep 1 2024 → Sep 5 2024 |
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Human-Computer Interaction
- Signal Processing
- Software
- Modeling and Simulation