On Variable-Scale Piecewise Stationary Spectral Analysis of Speech Signals for ASR
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
State-of-the-art Automatic Speech Recognition (ASR) systems make extensive use of Hidden Markov Models (HMMs), characterized by flexible statistical modeling, powerful optimization (training) techniques and efficient recognition algorithms. When allowed by ...
HMM2 is a particular hidden Markov model where state emission probabilities of the temporal (primary) HMM are modeled through (secondary) state-dependent frequency-based HMMs [12]. As shown in [13], a secondary HMM can also be used to extract robust ASR fe ...
In this paper, we present a new approach towards high performance speech/music discrimination on realistic tasks related to the automatic transcription of broadcast news. In the approach presented here, the (local) Probability Density Function (PDF) estima ...
The principal question we address in this work is the following: "Is Markov modelling appropriate to analyse data networks?". We will conclude that it is. The motivation for this question comes from the work of a research team at Bellcore who argue that th ...
We present a new framework for processing point-sampled objects using spectral methods. By establishing a concept of local frequencies on geometry, we introduce a versatile spectral representation that provides a rich repository of signal processing algori ...
In this report, we present preliminary experiments towards automatic inference and evaluation of pronunciation models based on multiple utterances of each lexicon word and their given baseline pronunciation model (baseform phonetic transcription). In the p ...
Multi-band ASR was largely inspired by the extremely high level of redundancy in the spectral signal representation which can be inferred from Fletcher's product-of-errors rule for human speech perception. Indeed, the main aim of the multi-band approach is ...
HMM2 is a particular hidden Markov model where state emission probabilities of the temporal (primary) HMM are modeled through (secondary) state-dependent frequency-based HMMs [12]. As shown in [13], a secondary HMM can also be used to extract robust ASR fe ...
The purpose of this paper is to investigate the behavior of HMM2 models for the recognition of noisy speech. It has previously been shown that HMM2 is able to model dynamically important structural information inherent in the speech signal, often correspon ...
This thesis presents a learning based approach to speech recognition and person recognition from image sequences. An appearance based model of the articulators is learned from example images and is used to locate, track, and recover visual speech features. ...