Blind audiovisual source separation using sparse representations
Related publications (60)
Acoustical knee health assessment has long promised an alternative to clinically available medical imaging tools, but this modality has yet to be adopted in medical practice. The field is currently led by machine learning models processing acoustical featu ...
This paper presents Sound My Vision, an Android application for controlling music expression and multimedia projects. Unlike other similar applications which collect data only from sensors and input devices, Sound My Vision also analyses input video in rea ...
Performing speaker diarization while uniquely identifying the speakers in a collection of audio recordings is a challenging task. Based on our previous work on speaker diarization and linking, we developed a system for diarizing longitudinal TV show data s ...
There is provided a hearing assistance system, comprising a transmission unit (10) comprising a microphone arrangement (17) for capturing audio signals from a voice of a speaker (11) using the transmission unit and being adapted to transmit the audio sig ...
In this contribution, we present a method to compensate for long duration data gaps in audio signals, in particular music. To achieve this task, a similarity graph is constructed, based on a short-time Fourier analysis of reliable signal segments, e.g. the ...
A method for presenting to a user of a wearable audio device a modified audio scene together with additional information related to the audio scene, comprising: capturing audio signals with a plurality of microphones; outputting an audio signal with a plur ...
We present a novel method for the compensation of long duration data loss in audio signals, in particular music. The concealment of such signal defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. A sui ...
With the increasing amount of video being consumed by people daily, there is a danger of the rise in maliciously modified video content (i.e., 'fake news') that could be used to damage innocent people or to impose a certain agenda, e.g., meddle in election ...
In this paper we present Aligned Scores and Performances (ASAP): a new dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music. The scores are provided as paired MusicXML files and quantized ...
Audio-Visual People Diarization (AVPD) is an original framework that simultaneously improves audio, video, and audiovisual diarization results. Following a literature review of people diarization for both audio and video content and their limitations, whic ...