Publication

A Unified Framework for Score Normalization Techniques Applied to Text Independent Speaker Verification

Related publications (32)

Multi-pose lipreading and audio-visual speech recognition

Jean-Philippe Thiran, Virginia Estellers Casas

In this article, we study the adaptation of visual and audio-visual speech recognition systems to non-ideal visual conditions. We focus on overcoming the effects of a changing pose of the speaker, a problem encountered in natural situations where the speak ...

2012

Boosting Localized Features for Speaker and Speech Recognition

Anindya Roy

In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...

EPFL, 2011

Verified Speaker Localization Utilizing Voicing Level in Split-bands

Afsaneh Asaei, Mohammadjavad Taghizadeh

This paper proposes a joint verification-localization structure based on split-band analysis of speech signal and the mixed voicing level. To address the problems in reverberant acoustic environments, a new fundamental frequency estimation algorithm is pro ...

2009

Visual feature analysis for audio-visual speech recognition

Ivana Arsic de Heras Ciechomska

Humans perceive their surrounding environment in a multimodal manner by using multi-sensory inputs combined in a coordinated way. Various studies in psychology and cognitive science indicate the multimodal nature of human speech production and perception. ...

EPFL, 2008

Correcting Confusion Matrices for Phone Recognizers

Modern speech recognition has many ways of quantifying the misrecognitions a speech recognizer makes. The analysis of errors in modern speech recognition makes extensive use of the Levenshtein algorithm to find the distance between the labeled target and the recognize ...

IDIAP, 2007

Using Pitch as Prior Knowledge in Template-Based Speech Recognition

Hervé Bourlard, Guillermo Aradilla

In a previous paper on speech recognition, we showed that templates can better capture the dynamics of speech signal compared to parametric models such as hidden Markov models. The key point in template matching approaches is finding the most similar templ ...

2006