Publication

Privacy-Sensitive Audio Features for Conversational Speech Processing

Related publications (113)

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Joint speech and speaker recognition

The goal of the thesis is to investigate different approaches that combine and integrate Automatic Speech Recognition (ASR) and Speaker Recognition (SR) systems, with applications to (1) User-Customized Password Speaker Verification (UCP-SV) systems, and, ...

EPFL2005

Audio-Visual Speech Recognition with a Hybrid SVM-HMM System

Jean-Philippe Thiran, Mihai Gurban

Traditional speech recognition systems use Gaussian mixture models to obtain the likelihoods of individual phonemes, which are then used as state emission probabilities in hidden Markov models representing the words. In hybrid systems, the Gaussian mixture ...

EUSIPCO2005

Joint Speech and Speaker Recognition

The goal of the present thesis was to investigate and optimize different approaches towards User-Customized Password Speaker Verification (UCP-SV) systems. In such systems, users can choose their own passwords, which will be subsequently used for verificat ...

IDIAP2005

Joint Speech and Speaker Recognition

École Polytechnique Fédérale de Lausanne, Computer Science Department2005

Can a Professional Imitator Fool a GMM-Based Speaker Verification System?

Samy Bengio

This paper presents an attempt at assessing empirically how a state-of-the-art text-independent speaker verification system behaves when confronted to imposting attempts from a professional imitator who perfectly knows how to imitate in particular the clie ...

IDIAP2005

Text Detection and Recognition in Images and Videos

Hervé Bourlard, Jean-Marc Odobez, Datong Chen

Text embedded in images and videos represents a rich source of information for content-based indexing and retrieval applications. In this paper, we present a new method for localizing and recognizing text in complex images and videos. Text localization is ...

2004

On Use of Task Independent Training Data in Tandem Feature Extraction

Hynek Hermansky

The problem we address in this paper is, whether the feature extraction module trained on large amounts of task independent data, can improve the performance of stochastic models? We show that when there is only a small amount of task specific training dat ...

2004

Automatic Speech Recognition using Dynamic Bayesian Networks with the Energy as an Auxiliary Variable

In current automatic speech recognition (ASR) systems, the energy is not used as part of the feature vector in spite of being a fundamental feature in the speech signal. The noise inherent in its estimation degrades the system performance. In this report w ...

IDIAP2003

Robust Speech Recognition and Feature Extraction Using HMM2

Hervé Bourlard, Samy Bengio, Katrin Weber, Shajith Ikbal

This paper presents the theoretical basis and preliminary experimental results of a new HMM model, referred to as HMM2, which can be considered as a mixture of HMMs. In this new model, the emission probabilities of the temporal (primary) HMM are estimated ...

2003

Using pitch frequency information in speech recognition

Hervé Bourlard

Automatic Speech Recognition systems typically use smoothed spectral features as acoustic observations. In recent studies, it has been shown that complementing these standard features with pitch frequency could improve the system performance of the system. ...

IDIAP2003