Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
A wide range of techniques for coding a single speech or audio signal channel have been developed over the last few decades. In addition to pure redundancy reduction, sophisticated source and receiver models have been considered for reducing the bitrate. O ...
Multimodal signal processing analyzes a physical phenomenon through several types of measures, or modalities. This leads to the extraction of higher-quality and more reliable information than that obtained from single-modality signals. The advantage is two ...
This communication describes the multi-modal VidTIMIT database, which can be useful for research involving mono- or multi-modal speech recognition or person authentication. It is comprised of video and corresponding audio recordings of 43 volunteers, recit ...
A novel model is presented to learn bimodally informative structures from audio-visual signals. The signal is represented as a sparse sum of audio- visual kernels. Each kernel is a bimodal function consisting of synchronous snippets of an audio waveform an ...
We address the problem of both estimating the dominant person in a meeting from a single audio source and identifying them visually in a multi-camera setting. We use a speaker diarization algorithm to perform speaker segmentation and clustering, representi ...
The use of omnidirectional cameras for videoconferencing promises to simplify the hardware setup necessary for large groups of participants. We investigate the use of a multimodal speaker detection algorithm on audio-visual sequences captured with such a c ...
This paper presents a new approach toward automatic annotation of meetings in terms of speaker identities and their locations. This is achieved by segmenting the audio recordings using two independent sources of information : magnitude spectrum analysis an ...
We address the problem of both estimating the dominant person in a meeting from a single audio source and identifying them visually in a multi-camera setting. We use a speaker diarization algorithm to perform speaker segmentation and clustering, representi ...
This paper presents a new approach toward automatic annotation of meetings in terms of speaker identities and their locations. This is achieved by segmenting the audio recordings using two independent sources of information : magnitude spectrum analysis an ...
In the printing industry, one of the most common methods for reproducing halftone images using bilevel printing devices is clustered-dot ordered dithering. The images produced using this method are quite faithful to the original and are visually pleasing. ...