The log-energy parameter, typically derived from a full-band spectrum, is a critical feature commonly used in automatic speech recognition (ASR) systems. However, log-energy is difficult to estimate reliably in the presence of background noise. In this pap ...
Standard automatic speech recognition (ASR) systems rely on transcribed speech, language models, and pronunciation dictionaries to achieve state-of-the-art performance. The unavailability of these resources constrains the ASR technology to be available for ...
There is growing interest in using graphemes as subword units, especially in the context of the rapid development of hidden Markov model (HMM) based automatic speech recognition (ASR) systems, as it eliminates the need to build a phoneme pronunciation lexic ...
Besides the recognition task, today's biometric systems need to cope with an additional problem: spoofing attacks. To date, academic research has considered spoofing as a binary classification problem: systems are trained to discriminate between real accesses a ...
Speaker diarization of meetings can be significantly improved by overlap handling. Several previous works have explored the use of different features such as spectral, spatial and energy for overlap detection. This paper proposes a method to estimate proba ...
This paper presents a novel fully automatic bi-modal, face and speaker, recognition system which runs in real-time on a mobile phone. The implemented system runs in real-time on a Nokia N900 and demonstrates the feasibility of performing both automatic fac ...
Nowadays, many systems rely on fusing different sources of information to recognize human activities and gestures, speech, or brain activities for applications in areas such as clinical practice, health care, and human-computer interaction (HCI). Typica ...
The advent of statistical parametric speech synthesis has paved the way for a unified framework for hidden Markov model (HMM) based text-to-speech synthesis (TTS) and automatic speech recognition (ASR). The techniques and advancements made in the field of ...
Ecole Polytechnique Federale de Lausanne (EPFL), 2012