TY - JOUR
T1 - How productivity improves in hands-free continuous dictation tasks
T2 - Lessons learned from a longitudinal study
AU - Feng, Jinjuan
AU - Karat, Clare Marie
AU - Sears, Andrew
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation under Grant No. IIS-9910607. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation (NSF).
PY - 2005/5
Y1 - 2005/5
AB - Speech recognition technology continues to improve, but users still experience significant difficulty using the software to create and edit documents. The reported composition speed using speech software is only between 8 and 15 words per minute [Proc CHI 99 (1999) 568; Universal Access Inform Soc 1 (2001) 4], much lower than people's normal speaking speed of 125-150 words per minute. What causes the huge gap between natural speaking and composing using speech recognition? Is it possible to narrow the gap and make speech recognition more promising to users? In this paper we discuss users' learning processes and the difficulties they experience in continuous dictation tasks using state-of-the-art Automatic Speech Recognition (ASR) software. Detailed data were collected for the first time on various aspects of the three activities involved in document composition tasks: dictation, navigation, and correction. The results indicate that navigation and error correction accounted for a large share of the dictation task during the early stages of interaction. As users gained more experience, they became more efficient at dictation, navigation, and error correction. However, the major improvements in productivity were due to dictation quality and the use of navigation commands. These results provide insights into the factors that cause the gap between user expectations of speech recognition software and the reality of use, and into how those factors changed with experience. Specific advice is given to researchers on the most critical issues that must be addressed.
UR - http://www.scopus.com/inward/record.url?scp=17744385522&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=17744385522&partnerID=8YFLogxK
U2 - 10.1016/j.intcom.2004.06.013
DO - 10.1016/j.intcom.2004.06.013
M3 - Article
AN - SCOPUS:17744385522
SN - 0953-5438
VL - 17
SP - 265
EP - 289
JO - Interacting with Computers
JF - Interacting with Computers
IS - 3
ER -