Privacy-Sensitive Audio Features for Conversational Speech Processing
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Recurrent neural network language models (RNNLMs) have recently shown to outperform the venerable n-gram language models (LMs). However, in automatic speech recognition (ASR), RNNLMs were not yet used to directly decode a speech signal. Instead, RNNLMs are ...
Recurrent neural network language models (RNNLMs) have recently shown to outperform the venerable n-gram language models (LMs). However, in automatic speech recognition (ASR), RNNLMs were not yet used to directly decode a speech signal. Instead, RNNLMs are ...
The Montreux Jazz Festival archives covers almost a half-century of live jazz, rock and pop music, gathering international first-class musicians within a single catalogue. In the frame of the digitization and valorisation of the whole archives by EPFL rese ...
Standard automatic speech recognition (ASR) systems use phonemes as subword units. Thus, one of the primary resource required to build a good ASR system is a well developed phoneme pronunciation lexicon. However, under-resourced languages typically lack su ...
This paper investigates robust privacy-sensitive audio features for speaker diarization in multiparty conversations: ie., a set of audio features having low linguistic information for speaker diarization in a single and multiple distant microphone scenario ...
Setting out from the point of view that automatic speech recognition (ASR) ought to benefit from data in languages other than the target language, we propose a novel Kullback-Leibler (KL) divergence based method that is able to exploit multilingual informa ...
This work aims at investigating the automatic recognition of speaker role in meeting conversations from the AMI corpus. Two types of roles are considered: formal roles, fixed over the meeting duration and recognized at recording level, and social roles rel ...
Automatic recognition of gestures using computer vision is important for many real-world applications such as sign language recognition and human-robot interaction (HRI). Our goal is a real-time hand gesture-based HRI interface for mobile robots. We use a ...
Standard automatic speech recognition (ASR) systems use phonemes as subword units. Thus, one of the primary resource required to build a good ASR system is a well developed phoneme pronunciation lexicon. However, under-resourced languages typically lack su ...
The work described in this thesis takes place in the context of capturing real-life audio for the analysis of spontaneous social interactions. Towards this goal, we wish to capture conversational and ambient sounds using portable audio recorders. Analysis ...