Speaker Inconsistency Detection in Tampered Video

As people consume more video every day, maliciously modified video content (i.e., 'fake news') poses a growing danger: it could be used to harm innocent people or to push a particular agenda, e.g., to meddle in elections. In this paper, we consider audio manipulations in video of a person speaking to the camera. Such a manipulation is easy to perform (for instance, one can simply replace part of the audio), yet it can dramatically change the message and meaning of the video. With the goal of developing an automated system that can detect these audio-visual speaker inconsistencies, we consider several approaches proposed for lip-syncing and dubbing detection, based on convolutional and recurrent networks, and compare them with systems based on more traditional classifiers. We evaluated these methods on the publicly available databases VidTIMIT, AMI, and GRID, for which we generated sets of tampered data.
