Audio Novelty-Based Segmentation of Music Concerts
In this paper we present Aligned Scores and Performances (ASAP): a new dataset of 222 digital musical scores aligned with 1068 performances (more than 92 hours) of Western classical piano music. The scores are provided as paired MusicXML files and quantized ...
The speaker diarization system developed at the International Computer Science Institute (ICSI) has played a prominent role in the speaker diarization community, and many researchers in the rich transcription community have adopted methods and techniques d ...
In this paper, we introduce our recent studies on human perception in audio event classification. In particular, the pre-trained model VGGish is used as feature extractor to process audio data, and DenseNet is trained by and used as feature extractor for o ...
Audio-Visual People Diarization (AVPD) is an original framework that simultaneously improves audio, video, and audiovisual diarization results. Following a literature review of people diarization for both audio and video content and their limitations, whic ...
The perception that we have about the world is influenced by elements of diverse nature. Indeed humans tend to integrate information coming from different sensory modalities to better understand their environment. Following this observation, scientists hav ...
We propose a novel method to automatically extract the audio-visual objects that are present in a scene. First, the synchrony between related events in audio and video channels is exploited to identify the possible locations of the sound sources. Video reg ...
Institute of Electrical and Electronics Engineers, 2012
With the increasing amount of video being consumed by people daily, there is a danger of the rise in maliciously modified video content (i.e., 'fake news') that could be used to damage innocent people or to impose a certain agenda, e.g., meddle in election ...
A computer-implemented method for operating a haptic device, the haptic device comprising a plurality of tactile displays configured to provide haptic stimuli to a user, the method including the steps of (a) processing an audio signal derived from an audio ...
It is very common to reuse published (or unpublished) content in a thesis (e.g. in the case of the theses made of a compilation of published articles). It is therefore important to ensure that the use (or reuse) is possible and to ask for authorizations fr ...
The integration of audio and visual information improves speech recognition performance, especially in the presence of noise. In these circumstances it is necessary to introduce audio and visual weights to control the contribution of each modality to the re ...