Publication

Template-matching for text-dependent speaker verification

Related publications (174)

Boosting Localized Features for Speaker and Speech Recognition

Anindya Roy

In this thesis, we propose a novel approach for speaker and speech recognition involving localized, binary, data-driven features. The proposed approach is largely inspired by similar localized approaches in the computer vision domain. The success of these ...

Ecole Polytechnique Federale de Lausanne (EPFL), 2011

Investigation of kNN Classifier on Posterior Features Towards Application in Automatic Speech Recognition

Hervé Bourlard, Afsaneh Asaei, Benjamin Picart

Class posterior distributions can be used to classify or as intermediate features, which can be further exploited in different classifiers (e.g., Gaussian Mixture Models, GMM) towards improving speech recognition performance. In this paper we examine the p ...

Idiap, 2010

Multilayer Perceptron Based Hierarchical Acoustic Modeling for Automatic Speech Recognition

Joel Praveen Pinto

In this thesis, we investigate a hierarchical approach for estimating the phonetic class-conditional probabilities using a multilayer perceptron (MLP) neural network. The architecture consists of two MLP classifiers in cascade. The first MLP is trained in ...

EPFL, 2010

Enhanced Phone Posteriors for Improving Speech Recognition Systems

Hervé Bourlard, Hamed Ketabdar

Using phone posterior probabilities has been increasingly explored for improving automatic speech recognition (ASR) systems. In this paper, we propose two approaches for hierarchically enhancing these phone posteriors, by integrating long acoustic context, ...

2010

Analysis of MLP Based Hierarchical Phoneme Posterior Probability Estimator

Hervé Bourlard, Hynek Hermansky, Joel Praveen Pinto

We analyze a simple hierarchical architecture consisting of two multilayer perceptron (MLP) classifiers in tandem to estimate the phonetic class conditional probabilities. In this hierarchical setup, the first MLP classifier is trained using standard acous ...

2010

A Large Margin Algorithm for Forced Alignment

We describe and analyze a discriminative algorithm for learning to align a phoneme sequence of a speech utterance with its acoustical signal counterpart by predicting a timing sequence representing the phoneme start times. In contrast to common HMM-based a ...

John Wiley and Sons, 2009

Robustness of Phase based Features for Speaker Recognition

Sree Hari Krishnan Parthasarathi

This paper demonstrates the robustness of group-delay based features for speech processing. An analysis of group delay functions is presented, which shows that these features retain formant structure even in noise. Furthermore, a speaker verification task pe ...

2009

Robustness of Phase based Features for Speaker Recognition

Sree Hari Krishnan Parthasarathi

Idiap, 2009

Keyword Detection for Spontaneous Speech

Hervé Bourlard, Aude Billard, Weifeng Li

This paper presents a system for keyword detection in spontaneous speech. Keywords are predefined through a set of acoustic examples provided by the users. Keyword detection proceeds in two steps: keyword searching and verification. To address the problem ...

2009

Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition

Andrzej Drygajlo

Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide ...

Springer-Verlag New York, 2009