Towards using slide information to enhance speech transcription of meetings
This paper proposes modifications to the Multi-resolution RASTA (MRASTA) feature extraction technique for automatic speech recognition (ASR). By emulating asymmetries of the temporal receptive field (TRF) profiles of auditory mid-brain neurons, we obta ...
Automatic Speech Recognition (ASR) systems usually fail when they encounter speech from a far-field microphone in reverberant environments. This is due to the application of short-term feature extraction techniques which do not compensate for the artifacts i ...
Given a set of head related transfer functions (HRTFs) and a room impulse response measured with a Soundfield microphone, the proposed technique computes binaural room impulse responses (BRIRs) which are similar to binaural room impulse responses that woul ...
This paper presents new filter bank design methods for subband adaptive beamforming. In this work, we design analysis and synthesis prototypes for modulated filter banks so as to minimize each aliasing term individually. We then drive the total response ...
Modern speech recognition has many ways of quantifying the misrecognitions a speech recognizer makes. Error analysis in modern speech recognition makes extensive use of the Levenshtein algorithm to find the distance between the labeled target and the recognize ...
In this paper, we address an adaptive beamforming application in realistic acoustic conditions. After the position of a speaker is estimated by a speaker tracking system, we construct a subband-domain beamformer in generalized sidelobe canceller (GSC) conf ...
In a previous paper on speech recognition, we showed that templates can capture the dynamics of the speech signal better than parametric models such as hidden Markov models. The key point in template matching approaches is finding the most similar templ ...