Autocorrection of Speech Recognition Errors Using a Confusion Network GUI

  • Developed a graphical web interface in Node.js that renders a speech recognition confusion network, derived from a lattice, and lets users manipulate it by touch to correct recognition errors (based on “Parakeet” by Vertanen et al. [IUI ’09])
  • Used the Kaldi speech recognition toolkit with a public language model to generate lattices for thousands of sentences shown in the interface, with text-to-speech renderings of short 7-word sentences as the audio input, and used Python to process the sentences
  • Designing a Wizard of Oz study to assess whether word error rate and density affect the amount of effort required to correct speech recognition errors using a graphical representation of a confusion network
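
To make the terms in the bullets above concrete, here is a minimal Python sketch of a confusion network, a touch-style correction, and word error rate. It is illustrative only and is not the project's code: the slot layout, the `<eps>` skip marker, and the function names `best_path`, `apply_corrections`, and `word_error_rate` are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Assumed layout: a confusion network is a sequence of slots, and each slot
# holds competing (word, posterior probability) candidates.
Slot = List[Tuple[str, float]]
EPS = "<eps>"  # hypothetical marker meaning "no word in this slot"


def best_path(network: List[Slot]) -> List[str]:
    """Pick the most probable candidate in every slot, dropping skips."""
    words = [max(slot, key=lambda cand: cand[1])[0] for slot in network]
    return [w for w in words if w != EPS]


def apply_corrections(network: List[Slot], choices: Dict[int, str]) -> List[str]:
    """Override selected slots with the user's picks (slot index -> word)."""
    words = []
    for i, slot in enumerate(network):
        if i in choices:
            if choices[i] not in [w for w, _ in slot]:
                raise ValueError(f"{choices[i]!r} is not a candidate in slot {i}")
            word = choices[i]
        else:
            word = max(slot, key=lambda cand: cand[1])[0]
        if word != EPS:
            words.append(word)
    return words


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(reference)


# Toy network, invented for illustration.
network = [
    [("the", 0.9), ("a", 0.1)],
    [("cap", 0.6), ("cat", 0.4)],
    [("sat", 0.7), (EPS, 0.3)],
]
reference = ["the", "cat", "sat"]
recognized = best_path(network)                 # one substitution error
fixed = apply_corrections(network, {1: "cat"})  # user taps "cat" in slot 1
```

In this toy case the recognizer's best path has one wrong word out of three, and a single tap on the alternative in the second slot brings the word error rate to zero.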


Effects of WER on ASR Correction Interfaces for Mobile Text Entry

Christine Murad, Cosmin Munteanu, Wolfgang Stuerzlinger

International Conference on Human-Computer Interaction with Mobile Devices and Services, 2019