This paper presents a series of tests that were performed on a state-of-the-art real-time automatic speech recognition system for English, in a single-computer implementation. As the intention is to use the system for speech-based query-free document retrieval in conversations, several parameters were varied: text type, microphone quality, computing power, speaker fluency, and pace of the speech. Word accuracy over various word counts, including a restriction to content words, varied in the 30%-70% range. The paper compares results over many conditions, and concludes that the ASR system is acceptable for the intended use only if all the parameters are in optimal conditions. If more than two parameters are suboptimal, then its output becomes too noisy for document retrieval.
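The abstract does not spell out how word accuracy was computed. A minimal sketch, assuming the common definition (one minus the word-level edit distance divided by the reference length) and a small, purely illustrative stop-word list for the content-word restriction; the names `word_accuracy`, `edit_distance`, and `STOPWORDS` are hypothetical and do not come from the paper:

```python
# Illustrative stop-word list (an assumption, not the paper's list).
STOPWORDS = frozenset({"a", "an", "the", "of", "on", "in", "to", "and", "is", "for"})


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis, content_only=False):
    """Return 1 - (edit distance / reference length) over whitespace tokens.

    With content_only=True, stop words are dropped from both strings first.
    The value can be negative when the hypothesis has many insertions.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if content_only:
        ref = [w for w in ref if w not in STOPWORDS]
        hyp = [w for w in hyp if w not in STOPWORDS]
    if not ref:
        raise ValueError("reference contains no words to score")
    return 1.0 - edit_distance(ref, hyp) / len(ref)
```

For example, `word_accuracy("the cat sat on the mat", "the cat sat on a mat")` scores one substitution against six reference words, while `content_only=True` ignores the function-word error entirely. For a retrieval-oriented use such as the one described above, the content-word figure is arguably the more relevant of the two, since function words rarely contribute to a query.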
Mathew Magimai Doss, Zohreh Mostaani
Subrahmanya Pavankumar Dubagunta