TY - GEN
T1 - On Replacing Humans with Large Language Models in Voice-Based Human-in-the-Loop Systems
AU - Huang, Shih Hong
AU - Huang, Ting Hao
N1 - Publisher Copyright: Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2024/5/21
Y1 - 2024/5/21
N2 - It is easy to assume that Large Language Models (LLMs) will seamlessly take over applications, especially those that are largely automated. In the case of conversational voice assistants, commercial systems have been widely deployed and used over the past decade. However, are we indeed on the cusp of the future we envisioned? There exists a social-technical gap between what people want to accomplish and the actual capability of technology. In this paper, we present a case study comparing two voice assistants built on Amazon Alexa: one employing a human-in-the-loop workflow, the other using an LLM to engage in conversations with users. In our comparison, we discovered that the issues arising in current human-in-the-loop and LLM systems are not identical. However, the presence of a set of similar issues in both systems leads us to believe that focusing on the interaction between users and systems is crucial, perhaps even more so than focusing solely on the underlying technology itself. Merely enhancing the performance of the workers or the models may not adequately address these issues. This observation prompts our research question: What are the overlooked contributing factors in the effort to improve the capabilities of voice assistants, which might not have been emphasized in prior research?
UR - https://www.scopus.com/pages/publications/105016624759
UR - https://www.scopus.com/pages/publications/105016624759#tab=citedBy
U2 - 10.1609/aaaiss.v3i1.31178
DO - 10.1609/aaaiss.v3i1.31178
M3 - Conference contribution
AN - SCOPUS:105016624759
T3 - AAAI Spring Symposium - Technical Report
SP - 45
EP - 49
BT - AAAI Spring Symposium - Technical Report
A2 - Petrick, Ron
A2 - Geib, Christopher
PB - Association for the Advancement of Artificial Intelligence
T2 - 2024 AAAI Spring Symposium Series, SSS 2024
Y2 - 25 March 2024 through 27 March 2024
ER -