Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Speaker verification systems traditionally extract and model cepstral features or filter bank energies from the speech signal. In this paper, inspired by the success of neural network-based approaches to model directly raw speech signal for applications su ...
In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...
EPFL2011
, , ,
Although current trends in speech processing consider deep learning through data-driven technologies, many potential applications exhibit lack of training or development data. Therefore, considerably light signal processing techniques are still of interest ...
Idiap2020
, ,
Automatic speech recognition (ASR) allows transcribing the communications between air traffic controllers (ATCOs) and aircraft pilots. The transcriptions are used later to extract ATC named entities, e.g., aircraft callsigns. One common challenge is speech ...
IEEE2022
, , ,
Although current trends in speech processing consider deep learning through data-driven technologies, many potential applications exhibit lack of training or development data. Therefore, considerably light signal processing techniques are still of interest ...
2019
Nowadays, many systems rely on fusing different sources of information to recognize human activities and gestures, speech, or brain activities for applications in areas such as clinical practice, and health care and Human Computer Interaction (HCI). Typica ...
This paper presents the development and evaluation of an automatic audio indexing system designed for a special task: work in a bilingual environment in the Parliament of the Canton of Valais in Switzerland, with two official languages, German and French. ...
The performance of speaker recognition systems has considerably improved in the last decade. This is mainly due to the development of Gaussian mixture model-based systems and in particular to the use of i-vectors. These systems handle relatively well noise ...
The integration of audio and visual information improves speech recognition performance, specially in the presence of noise. In these circumstances it is necessary to introduce audio and visual weights to control the contribution of each modality to the re ...
Children speech recognition based on short-term spectral features is a challenging task. One of the reasons is that children speech has high fundamental frequency that is comparable to formant frequency values. Furthermore, as children grow, their vocal ap ...