Publication

Wordless Sounds: Robust Speaker Diarization using Privacy-Preserving Audio Representations

Related publications (69)

A geometry-inspired decision-based attack

Pascal Frossard, Seyed Mohsen Moosavi Dezfooli, Yujia Liu

Deep neural networks have recently achieved tremendous success in image classification. Recent studies have however shown that they are easily misled into incorrect classification decisions by adversarial examples. Adversaries can even craft attacks by que ...

2019

A Geometry-Inspired Decision-Based Attack

Pascal Frossard, Seyed Mohsen Moosavi Dezfooli, Yujia Liu

IEEE COMPUTER SOC, 2019

, , ,

This paper addresses the problem of automatic facial expression recognition in videos, where the goal is to predict discrete emotion labels best describing the emotions expressed in short video clips. Building on a pre-trained convolutional neural network ...

IEEE, 2019

Language Independent Query by Example Spoken Term Detection

Dhananjay Ram

Language independent query-by-example spoken term detection (QbE-STD) is the problem of retrieving audio documents from an archive, which contain a spoken query provided by a user. This is usually cast as a hypothesis testing and pattern matching problem ...

EPFL, 2019

Tampered Speaker Inconsistency Detection with Phonetically Aware Audio-visual Features

Sébastien Marcel

The recent increase in social media based propaganda, i.e., ‘fake news’, calls for automated methods to detect tampered content. In this paper, we focus on detecting tampering in a video with a person speaking to a camera. This form of manipulation is easy ...

2019

Spoken Language Identification Using Language Bottleneck Features

Petr Motlicek, Wissem Allouchi

In this paper, we introduce a novel approach for Language Identification (LID). Two commonly used state-of-the-art methods based on UBM/GMM I-vector technique, combined with a back-end classifier, are first evaluated. The differential factor between these ...

Idiap, 2019

Evolution of Neural Network Architectures for Speech Recognition

Hervé Bourlard

Over these last few years, the use of Artificial Neural Networks (ANNs), now often referred to as deep learning or Deep Neural Networks (DNNs), has significantly reshaped research and development in a variety of signal and information processing tasks. Whi ...

ISCA-INT SPEECH COMMUNICATION ASSOC, 2018

Phonetic aware techniques for Speaker Verification

Subhadeep Dey

The goal of this thesis is to improve current state-of-the-art techniques in speaker verification (SV), typically based on “identity-vectors” (i-vectors) and deep neural network (DNN), by exploiting diverse (phonetic) information extracted using variou ...

EPFL, 2018

Phonological Posterior Hashing for Query by Example Spoken Term Detection

Hervé Bourlard, Afsaneh Asaei, Dhananjay Ram

State of the art query by example spoken term detection (QbE-STD) systems in zero-resource conditions rely on representation of speech in terms of sequences of class-conditional posterior probabilities estimated by deep neural network (DNN). The posteriors ...

ISCA-INT SPEECH COMMUNICATION ASSOC, 2018

Towards directly modeling raw speech signal for speaker verification using CNNs

Sébastien Marcel, Hannah Muckenhirn

Speaker verification systems traditionally extract and model cepstral features or filter bank energies from the speech signal. In this paper, inspired by the success of neural network-based approaches to model directly raw speech signal for applications su ...

IEEE, 2018