Conversational Voice User Interfaces: Connecting Engineering Fundamentals to Design Considerations


Journal article


Cosmin Munteanu, Gerald Penn, Christine Murad
CHI Extended Abstracts, 2021


APA
Munteanu, C., Penn, G., & Murad, C. (2021). Conversational Voice User Interfaces: Connecting Engineering Fundamentals to Design Considerations. CHI Extended Abstracts.


Chicago/Turabian
Munteanu, Cosmin, Gerald Penn, and Christine Murad. “Conversational Voice User Interfaces: Connecting Engineering Fundamentals to Design Considerations.” CHI Extended Abstracts (2021).


MLA
Munteanu, Cosmin, et al. “Conversational Voice User Interfaces: Connecting Engineering Fundamentals to Design Considerations.” CHI Extended Abstracts, 2021.


BibTeX

@article{cosmin2021a,
  title = {Conversational Voice User Interfaces: Connecting Engineering Fundamentals to Design Considerations},
  year = {2021},
  journal = {CHI Extended Abstracts},
  author = {Munteanu, Cosmin and Penn, Gerald and Murad, Christine}
}

Abstract

HCI research has long been dedicated to facilitating information transfer between humans and machines better and more naturally. Unfortunately, humans' most natural form of communication, speech, is also one of the most difficult modalities for machines to understand, despite being, and perhaps because it is, the highest-bandwidth communication channel we possess. While significant research efforts, from engineering to linguistics to the cognitive sciences, have been spent on improving machines' ability to understand speech, the CHI community (and the HCI field at large) has only recently started embracing this modality as a central focus of research. This can be attributed in part to the unexpected variations in error rates when processing speech, in contrast with often-unfounded claims of success from industry, but also to the intrinsic difficulty of designing and especially evaluating speech and natural language interfaces. As such, the development of interactive speech-based systems is mostly driven by engineering efforts to improve such systems with respect to largely arbitrary performance metrics. Such developments have often been devoid of user-centered design principles and of any consideration for usability or usefulness, whereas graphical user interfaces have benefited from heuristic design guidelines. The goal of this course is to inform the CHI community of the current state of speech and natural language research, to dispel some of the myths surrounding speech-based interaction, and to provide an opportunity for researchers and practitioners to learn more about how speech recognition and speech synthesis work, what their limitations are, and how they could be used to enhance current interaction paradigms. Through this, we hope that HCI researchers and practitioners will learn how to combine recent advances in speech processing with user-centered principles in designing more usable and useful speech-based interactive systems.