Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
During depression neurophysiological changes can occur, which may affect laryngeal control i.e. behaviour of the vocal folds. Characterising these changes in a precise manner from speech signals is a non trivial task, as this typically involves reliable se ...
In i-vector based speaker recognition systems, back-end classifiers are trained to factor out nuisance information and retain only the speaker identity. As a result, variabilities arising due to gender, language and accent ( among many others) are suppress ...
We show that confidence measures estimated from local posterior probabilities can serve as objective functions for training ANNs in hybrid HMM based speech recognition systems. This leads to a segment-level training paradigm that overcomes the limitation o ...
Towards the goal of improving acoustic modeling for automatic speech recognition (ASR), this work investigates the modeling of senone subspaces in deep neural network (DNN) posteriors using low-rank and sparse modeling approaches. While DNN posteriors are ...
Robotic teleoperation is fundamental to augment the resilience, precision, and force of robots with the cognition of the operator. However, current interfaces, such as joysticks and remote controllers, are often complicated to handle since they require cog ...
Feature extraction is a key step in many machine learning and signal processing applications. For speech signals in particular, it is important to derive features that contain both the vocal characteristics of the speaker and the content of the speech. In ...
The multi-channel Wiener filter (MWF) is a well-known multi-microphone speech enhancement technique, aiming at improving the quality of the recorded speech signals in noisy and reverberant environments. Assuming that reverberation and ambient noise can be ...
Speech intelligibility is an important assessment criterion of the communicative performance of pathological speakers. To assist clinicians in their assessment, time- and cost-efficient automatic intelligibility measures offering a repeatable and reliable ...
After decades of large-scale digitization, many historical newspaper collections are just one click away via online portals developed and supported by various public or private stakeholders. Initially offering access to full text search and facsimiles visu ...
This paper introduces a new task termed low-latency speaker spotting (LLSS). Related to security and intelligence applications, the task involves the detection, as soon as possible, of known speakers within multi-speaker audio streams. The paper describes ...