This paper presents the development and evaluation of an automatic audio indexing system designed for a specific task: operation in the bilingual environment of the Parliament of the Canton of Valais in Switzerland, which has two official languages, German and French. As several speakers are bilingual, language changes may occur within a single speaker's speech or even within an utterance. Two audio indexing approaches are presented and compared: in the first, speech indexing is based on bilingual automatic speech recognition; in the second, language identification is applied after speaker diarization to select the corresponding monolingual speech recognizer for decoding. The two approaches are then combined. Speaker adaptive training is also addressed and evaluated. The accuracy of language identification and of speech recognition in the monolingual and bilingual cases is presented and compared, along with a brief description of the system and the user interface. Finally, the audio indexing system is also evaluated from an information retrieval point of view.
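The second approach (language identification after speaker diarization, then decoding with the matching monolingual recognizer) can be illustrated with a minimal Python sketch. This is not the system described in the paper: the `Segment` type, the `index_audio` function, and the toy language identifier and recognizers are all hypothetical stand-ins, and the audio is represented by a tagged string purely so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    """One speaker-homogeneous segment, as a diarization step might produce."""
    speaker: str   # speaker label (hypothetical)
    start: float   # segment start, in seconds
    end: float     # segment end, in seconds
    audio: str     # placeholder for audio samples (assumption for this sketch)


def index_audio(
    segments: List[Segment],
    identify_language: Callable[[Segment], str],
    recognizers: Dict[str, Callable[[Segment], str]],
) -> List[dict]:
    """Route each segment to the monolingual recognizer chosen by language ID."""
    index = []
    for seg in segments:
        lang = identify_language(seg)   # e.g. "de" or "fr"
        decode = recognizers[lang]      # select the monolingual recognizer
        index.append({
            "speaker": seg.speaker,
            "start": seg.start,
            "end": seg.end,
            "language": lang,
            "text": decode(seg),
        })
    return index


# Toy stand-ins: the "audio" string carries a language tag and the transcript.
def toy_language_id(seg: Segment) -> str:
    return "de" if seg.audio.startswith("de:") else "fr"


toy_recognizers = {
    "de": lambda seg: seg.audio[3:],
    "fr": lambda seg: seg.audio[3:],
}

segments = [
    Segment("spk1", 0.0, 2.5, "fr:bonjour"),
    Segment("spk1", 2.5, 5.0, "de:guten tag"),  # same speaker, language change
]
result = index_audio(segments, toy_language_id, toy_recognizers)
```

Because the language is chosen once per segment, a sketch like this cannot follow a language change inside a single segment, which is the case the bilingual recognizer of the first approach is meant to cover.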
Petr Motlicek, Hynek Hermansky, Sriram Ganapathy, Amrutha Prasad
Subrahmanya Pavankumar Dubagunta