One of the difficulties in automatic speech recognition (ASR) is pronunciation variability. Each word (modeled by a baseline phonetic transcription in the ASR dictionary) can be pronounced in many different ways depending on many complex qualitative and ...
[...] Primates are social animals whose communication is based on their conspecifics' vocalizations and facial expressions. Although a lot of work to date has studied the unimodal representation of vocal or facial information, little is known about the way ...
The ability to correctly interpret emotional signals from others is crucial for successful social interaction. Previous neuroimaging studies showed that voice-sensitive auditory areas [1-3] activate to a broad spectrum of vocally expressed emotions more th ...
This paper proposes a joint verification-localization structure based on split-band analysis of speech signal and the mixed voicing level. To address the problems in reverberant acoustic environments, a new fundamental frequency estimation algorithm is pro ...
Humans perceive their surrounding environment in a multimodal manner by using multi-sensory inputs combined in a coordinated way. Various studies in psychology and cognitive science indicate the multimodal nature of human speech production and perception. ...
This paper reports a study on short-time subharmonic pitch breaks in vocal fold vibration, which are found to be a common feature of the human voice in spoken language. The observed pitch breaks correspond to a change in periodicity of the electrolaryngogr ...
Speech recognition applications embedded on a PDA are already available on the market. The usual hardware for this kind of system is a single microphone mounted on the PDA, giving good results in quiet environments. However, the recognition rate falls ...
The goal of this thesis is to develop and design new feature representations that can improve automatic speech recognition (ASR) performance in clean as well as noisy conditions. One of the main shortcomings of the fixed scale (typically 20-30 ms long ana ...
The motor actions that can be witnessed as a virtuoso musician performs can be so fast, so accomplished, so precise, as to seem somehow superhuman. The musician has to produce the movements, monitor those they have already made and the subsequent result, c ...
The existence of highly developed and fully automated telephone networks in industrialized countries and the pervasiveness of speech communication technology have resulted in the ever-increasing use of the human voice as an instrument in the commission of ...