This paper proposes a new method for bimodal information fusion in audio-visual speech recognition, where cross-modal association is considered at two levels. First, the acoustic and the visual data streams are combined at the feature level by using the ca ...
Multilingual speech recognition involves numerous research challenges, including common phoneme sets, adaptation on limited amounts of training data, as well as mixed language recognition (common in many countries, like Switzerland). In this latte ...
In this paper, we consider the problem of speaker verification as a two-class object detection problem in computer vision, but the object instances are 1-D short-time spectral vectors obtained from the speech signal. More precisely, we investigate the gene ...
The advent of statistical speech synthesis has enabled the unification of the basic techniques used in speech synthesis and recognition. Adaptation techniques that have been successfully used in recognition systems can now be applied to synthesis systems t ...
Multistream diarization is an effective way to improve diarization performance, with MFCC and Time Delay of Arrival (TDOA) being the most commonly used features. This paper extends our previous work on information bottleneck diarization aiming to include l ...
In this work, we investigate the possible use of k-nearest neighbour (kNN) classifiers to perform frame-based acoustic phonetic classification, hence replacing Gaussian Mixture Models (GMM) or MultiLayer Perceptrons (MLP) used in standard Hidden Markov Mod ...
We describe and analyze a discriminative algorithm for learning to align a phoneme sequence of a speech utterance with its acoustical signal counterpart by predicting a timing sequence representing the phoneme start times. In contrast to common HMM-based a ...
We extend the standard boosting procedure to train a two-layer classifier dedicated to handwritten character recognition. The scheme we propose relies on a hidden layer which extracts feature vectors on a fixed number of points of interest, and an output ...