Nomadic speech-based text entry: A decision model strategy for improved speech to text processing

Kathleen J. Price, Min Lin, Jinjuan Feng, Rich Goldman, Andrew Sears, Julie Jacko

Research output: Contribution to journal › Article › peer-review

6 Scopus citations


Speech text entry can be problematic during ideal dictation conditions, but difficulties are magnified when external conditions deteriorate. Motion during speech is an extraordinary condition that might have detrimental effects on automatic speech recognition. This research examined speech text entry while mobile. Speech enrollment profiles were created by participants in both a seated and walking environment. Dictation tasks were also completed in both the seated and walking conditions. Although results from an earlier study suggested that completing the enrollment process under more challenging conditions may lead to improved recognition accuracy under both challenging and less challenging conditions, the current study provided contradictory results. A detailed review of error rates confirmed that some participants minimized errors by enrolling under more challenging conditions while others benefited by enrolling under less challenging conditions. Still others minimized errors when different enrollment models were used under the opposing condition. Leveraging these insights, we developed a decision model to minimize recognition error rates regardless of the conditions experienced while completing dictation tasks. When applying the model to existing data, error rates were reduced significantly but additional research is necessary to effectively validate the proposed solution.

Original language: English (US)
Pages (from-to): 692-706
Number of pages: 15
Journal: International Journal of Human-Computer Interaction
Issue number: 7
State: Published - Sep 2009

All Science Journal Classification (ASJC) codes

  • Human Factors and Ergonomics
  • Human-Computer Interaction
  • Computer Science Applications

