Publication

End-to-End Acoustic Modeling using Convolutional Neural Networks for HMM-based Automatic Speech Recognition

Related publications (181)

Enhancing posterior based speech recognition systems

Hamed Ketabdar

The use of local phoneme posterior probabilities has been increasingly explored for improving speech recognition systems. Hybrid hidden Markov model / artificial neural network (HMM/ANN) and Tandem are the most successful examples of such systems. In this ...

EPFL, 2008

Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods

Samy Bengio

This is the first book dedicated to uniting research related to speech and speaker recognition based on the recent advances in large margin and kernel methods. The first part of the book presents theoretical and practical foundations of large margin and ke ...

John Wiley & Sons, 2008

Exploiting Contextual Information for Improved Phoneme Recognition

Hynek Hermansky, Joel Praveen Pinto

In this paper, we investigate the significance of contextual information in a phoneme recognition system using the hidden Markov model - artificial neural network paradigm. Contextual information is probed at the feature level as well as at the output of t ...

2008

How does a dictation machine recognize speech?

Hervé Bourlard

There is magic (or is it witchcraft?) in a speech recognizer that transcribes continuous radio speech into text, even with a word accuracy of no more than 50%. The extreme difficulty of this task, though, is usually not perceived by the general public. This ...

Idiap, 2008

Fast keyword detection with sparse time-frequency models

Pascal Frossard, Olivier Verscheure, Effrosyni Kokiopoulou

We address the problem of keyword spotting in continuous speech streams when training and testing conditions can be different. We propose a keyword spotting algorithm based on sparse representation of speech signals in a time-frequency feature space. The t ...

2008

Timbre and Rhythmic TRAP-TANDEM features for music information retrieval

Nicolas Scaringella

The enormous growth of digital music databases has led to a comparable growth in the need for methods that help users organize and access such information. One area in particular that has seen much recent research activity is the use of automated technique ...

IDIAP, 2008

Robust overlapping speech recognition based on neural networks

John David Scott Dines, Weifeng Li

We address issues for improving hands-free speech recognition performance in the presence of multiple simultaneous speakers using multiple distant microphones. In this paper, a log spectral mapping is proposed to estimate the log mel-filterbank outputs of ...

IDIAP, 2007

Unsupervised Speech/Non-speech Detection for Automatic Speech Recognition in Meeting Rooms

Daniel Gatica-Perez, Petr Motlicek

The goal of this work is to provide robust and accurate speech detection for automatic speech recognition (ASR) in meeting room settings. The solution is based on computing the long-term modulation spectrum and examining a specific frequency range for dominant ...

2007