Wordless Sounds: Robust Speaker Diarization using Privacy-Preserving Audio Representations
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
The work described in this thesis takes place in the context of capturing real-life audio for the analysis of spontaneous social interactions. Towards this goal, we wish to capture conversational and ambient sounds using portable audio recorders. Analysis ...
Automatic recognition of gestures using computer vision is important for many real-world applications such as sign language recognition and human-robot interaction (HRI). Our goal is a real-time hand gesture-based HRI interface for mobile robots. We use a ...
Automatic Speech Recognition systems typically use smoothed spectral features as acoustic observations. In recent studies, it has been shown that complementing these standard features with pitch frequency could improve the system performance of the system. ...
This paper proposes a novel and simple local neural classifier for the recognition of mental tasks from on-line spontaneous EEG signals. The proposed neural classifier recognizes three mental tasks from on-line spontaneousEEGsignals. Correct recognition is ...
Institute of Electrical and Electronics Engineers2002
We address issues for improving hands-free speech recognition performance in the presence of multiple simultaneous speakers using multiple distant microphones. In this paper, a log spectral mapping is proposed to estimate the log mel-filterbank outputs of ...
Phone posteriors has recently quite often used (as additional features or as local scores) to improve state-of-the-art automatic speech recognition (ASR) systems. Usually, better phone posterior estimates yield better ASR performance. In the present paper ...
Automatic Speech Recognition systems typically use smoothed spectral features as acoustic observations. In recent studies, it has been shown that complementing these standard features with pitch frequency could improve the system performance of the system. ...
Automatic Speech Recognition systems typically use smoothed spectral features as acoustic observations. In recent studies, it has been shown that complementing these standard features with auxiliary information could improve the performance of the system. ...
We leverage the recent algorithmic advances in compressive sensing, and propose a novel source separation algorithm for efficient recovery of convolutive speech mixtures in spectro-temporal domain. Compared to the common sparse component analysis technique ...
Traditional speech recognition systems use Gaussian mixture models to obtain the likelihoods of individual phonemes, which are then used as state emission probabilities in hidden Markov models representing the words. In hybrid systems, the Gaussian mixture ...