Blind Audiovisual Source Separation Using Sparse Redundant Representations
The speaker diarization system developed at the International Computer Science Institute (ICSI) has played a prominent role in the speaker diarization community, and many researchers in the rich transcription community have adopted methods and techniques d ...
A non-obtrusive portable device, wearable from infancy through adulthood, mounted with i) a set of two or more optical device(s) providing visual and audio information as perceived by the user ii) an actuated mirror or optical device returning visual infor ...
In this article, we study the adaptation of visual and audio-visual speech recognition systems to non-ideal visual conditions. We focus on overcoming the effects of a changing pose of the speaker, a problem encountered in natural situations where the speak ...
This report presents a semi-supervised method to jointly extract audio-visual sources from a scene. It consists of applying a supervised method to segment the video signal, followed by an automatic process to properly separate the audio track. This approach ...
In this paper we present a novel nonlinear video diffusion approach based on the fusion of information in audio and video channels. Both modalities are efficiently combined into a diffusion coefficient that integrates the basic assumption in this domain, i ...
Institute of Electrical and Electronics Engineers, 2011
In this paper we propose a novel method which is able to detect and separate audio-visual sources present in a scene. Our method exploits the correlation between the video signal captured with a camera and a synchronously recorded one-microphone audio trac ...
Our perception of the world is influenced by elements of diverse nature. Indeed, humans tend to integrate information coming from different sensory modalities to better understand their environment. Following this observation, scientists hav ...
Person identification using audio or visual biometrics is a well-studied problem in pattern recognition. In this scenario, both training and testing are done on the same modalities. However, there can be situations where this condition is not valid, i.e. t ...
Given two video sequences, a composite video sequence can be generated which includes visual elements from each of the given sequences, suitably synchronized and represented in a chosen focal plane. For example, given two video sequences with each showing ...