Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
We address the problem of automatically predicting group performance on a task, using multimodal features derived from the group conversation. These include acoustic features extracted from the speech signal, and linguistic features derived from the conver ...
Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov model (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond ...
This thesis deals with signal-based methods that predict how listeners perceive speech quality in telecommunications. Such tools, called objective quality measures, are of great interest in the telecommunications industry to evaluate how new or deployed sy ...
Automatic non-native accent assessment has potential benefits in language learning and speech technologies. The three fundamental challenges in automatic accent assessment are to characterize, model and assess individual variation in speech of the non-nati ...
Phonological studies suggest that the typical subword units such as phones or phonemes used in automatic speech recognition systems can be decomposed into a set of features based on the articulators used to produce the sound. Most of the current approaches ...
The i-vector and Joint Factor Analysis (JFA) systems for text- dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statis- tics average the acoustic information over the utterance ther ...
Visual speech recognition is a challenging research problem with a particular practical application of aiding audio speech recognition in noisy scenarios. Multiple camera setups can be beneficial for the visual speech recognition systems in terms of improv ...
In this paper, a compressive sensing (CS) perspective to exemplar-based speech processing is proposed. Relying on an analytical relationship between CS formulation and statistical speech recognition (Hidden Markov Models HMM), the automatic speech recognit ...
The i-vector and Joint Factor Analysis (JFA) systems for text- dependent speaker verification use sufficient statistics computed from a speech utterance to estimate speaker models. These statis- tics average the acoustic information over the utterance ther ...
Automatic non-native accent assessment has many potential benefits in language learning and speech technologies. The three fundamental challenges in automatic accent assessment are to characterize, model and assess individual variation in speech of the non ...