End-to-End Acoustic Modeling using Convolutional Neural Networks for HMM-based Automatic Speech Recognition
Publications associées (181)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
This document describes a new continuous speech decoder, TODE, which is compatible with the Torch machine learning software library. A brief theory of speech recognition is presented followed by a detailed description of the architecture of TODE and the co ...
Multi-band ASR was largely inspired by the extremely high level of redundancy in the spectral signal representation which can be inferred from Fletcher's product-of-errors rule for human speech perception. Indeed, the main aim of the multi-band approach is ...
This paper investigates the use of microphone arrays to acquire and recognise speech in meetings. Meetings pose several interesting problems for speech processing, as they consist of multiple competing speakers within a small space, typically around a tabl ...
This paper describes a complete system for audio-visual recognition of continuous speech including robust lip tracking, visual feature extraction, noise-robust acoustic feature extraction, and sensor integration. An appearance based model of the articulato ...
In this paper, we present a new approach towards high performance speech/music discrimination on realistic tasks related to the automatic transcription of broadcast news. In the approach presented here, the (local) Probability Density Function (PDF) estima ...
As recently introduced, an HMM2 can be considered as a particular case of an HMM mixture in which the HMM emission probabilities (usually estimated through Gaussian mixtures or an artificial neural network) are modeled by state-dependent, feature-based HMM ...
In this paper, we describe a new speaker verification approach, using a hybrid HMM/ANN system, and accommodating user customized passwords. This system is exploiting the high phonetic recognition rates usually achieved by HMM/ANN speaker independent system ...
As recently introduced, an HMM2 can be considered as a particular case of an HMM mixture in which the HMM emission probabilities (usually estimated through Gaussian mixtures or an artificial neural network) are modeled by state-dependent, feature-based HMM ...
Articulatory representations are expected to bring better speech recognition results. This requires to estimate the parameters of a speech production model from the speech sound, problem known as acoustico-articulatory inversion. Known methods to solve thi ...
This paper presents the theoretical basis and preliminary experimental results of a new HMM model, referred to as HMM2, which can be considered as a mixture of HMMs. In this new model, the emission probabilities of the temporal (primary) HMM are estimated ...