Publication

Speaker Inconsistency Detection in Tampered Video

Related publications (38)

Audio-Visual Fusion

The perception that we have about the world is influenced by elements of diverse nature. Indeed, humans tend to integrate information coming from different sensory modalities to better understand their environment. Following this observation, scientists hav ...

EPFL, 2011

Audio-driven Nonlinear Video Diffusion

Pierre Vandergheynst, Anna Llagostera Casanovas

In this paper we present a novel nonlinear video diffusion approach based on the fusion of information in audio and video channels. Both modalities are efficiently combined into a diffusion coefficient that integrates the basic assumption in this domain, i ...

Institute of Electrical and Electronics Engineers, 2011

Semi-supervised Extraction of Audio-Visual Sources

Patricia Calatayud Martinez

This report presents a semi-supervised method to jointly extract audio-visual sources from a scene. It consists of applying a supervised method to segment the video signal followed by an automatic process to properly separate the audio track. This approach ...

2010

Blind Audio-Visual Source Separation based on Sparse Redundant Representations

Pierre Vandergheynst, Rémi Gribonval, Gianluca Monaci, Anna Llagostera Casanovas

In this paper we propose a novel method which is able to detect and separate audio-visual sources present in a scene. Our method exploits the correlation between the video signal captured with a camera and a synchronously recorded one-microphone audio trac ...

2010

Crossmodal Matching of Speakers using Lip and Voice Features in Temporally Non-overlapping Audio and Video Streams

Sébastien Marcel, Anindya Roy

Person identification using audio (speech) and visual (facial appearance, static or dynamic) modalities, either independently or jointly, is a thoroughly investigated problem in pattern recognition. In this work, we explore a novel task: person identifica ...

Idiap, 2010

Autoregressive Models of Amplitude Modulations in Audio Compression

Petr Motlicek, Hynek Hermansky, Sriram Ganapathy

We present a scalable medium bit-rate wide-band audio coding technique based on frequency domain linear prediction (FDLP). FDLP is an efficient method for representing the long-term amplitude modulations of speech/audio signals using autoregressive models. ...

2010

Method and system for combining video sequences with spatio-temporal alignment

Martin Vetterli, Serge Ayer

Given two video sequences, a composite video sequence can be generated which includes visual elements from each of the given sequences, suitably synchronized and represented in a chosen focal plane. For example, given two video sequences with each showing ...

2010

Estimating Cohesion in Small Groups using Audio-Visual Nonverbal Behavior

Daniel Gatica-Perez

Cohesiveness in teams is an essential part of ensuring the smooth running of task-oriented groups. Research in social psychology and management has shown that good cohesion in groups can be correlated with team effectiveness or productivity so automaticall ...

Idiap, 2010
