Publication

A Unified Framework for Score Normalization Techniques Applied to Text Independent Speaker Verification

Publications associées (32)

Graph Chatbot

Chattez avec Graph Search

Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.

AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.

Connectez-vous pour utiliser Chat avec Graph Search

Multi-pose lipreading and audio-visual speech recognition

Jean-Philippe Thiran, Virginia Estellers Casas

In this article, we study the adaptation of visual and audio-visual speech recognition systems to non-ideal visual conditions. We focus on overcoming the effects of a changing pose of the speaker, a problem encountered in natural situations where the speak ...

2012

Boosting Localized Features for Speaker and Speech Recognition

Anindya Roy

In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...

EPFL2011

Boosting Localized Features for Speaker and Speech Recognition

Anindya Roy

Ecole Polytechnique Federale de Lausanne (EPFL)2011

Verified Speaker Localization Utilizing Voicing Level in Split-bands

Afsaneh Asaei, Mohammadjavad Taghizadeh

This paper proposes a joint verification-localization structure based on split-band analysis of speech signal and the mixed voicing level. To address the problems in reverberant acoustic environments, a new fundamental frequency estimation algorithm is pro ...

2009

Visual feature analysis for audio-visual speech recognition

Ivana Arsic de Heras Ciechomska

Humans perceive their surrounding environment in a multimodal manner by using multi-sensory inputs combined in a coordinated way. Various studies in psychology and cognitive science indicate the multimodal nature of human speech production and perception. ...

EPFL2008

Correcting Confusion Matrices for Phone Recognizers

Modern speech recognition has many ways of quantifying the misrecognitions a speech recognizer makes. The errors in modern speech recognition makes extensive use of the Levenshtein algorithm to find the distance between the labeled target and the recognize ...

IDIAP2007

Using Pitch as Prior Knowledge in Template-Based Speech Recognition

Hervé Bourlard, Guillermo Aradilla

In a previous paper on speech recognition, we showed that templates can better capture the dynamics of speech signal compared to parametric models such as hidden Markov models. The key point in template matching approaches is finding the most similar templ ...

2006