Publication

Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Related publications (74)

PLP$^2$: Autoregressive modeling of auditory-like 2-D spectro-temporal patterns

Hynek Hermansky

The temporal trajectories of the spectral energy in auditory critical bands over 250 ms segments are approximated by an all-pole model, the time-domain dual of conventional linear prediction. This quarter-second auditory spectro-temporal pattern is further ...

2004

On Use of Task Independent Training Data in Tandem Feature Extraction

Hynek Hermansky

This paper addresses whether a feature extraction module trained on large amounts of task-independent data can improve the performance of stochastic models. We show that when there is only a small amount of task-specific training dat ...

2004

Some Emerging Concepts in Speech Recognition.

Hervé Bourlard, Hynek Hermansky

The paper presents work in progress on several emerging concepts in Automatic Speech Recognition (ASR) that are currently being studied at IDIAP. This work falls roughly into three categories: 1) data-guided features, 2) features based on m ...

IDIAP, 2003

On Factorizing Spectral Dynamics for Robust Speech Recognition

Hervé Bourlard, Hemant Misra, Vivek Tyagi

In this paper, we introduce new dynamic speech features based on the modulation spectrum. These features, termed Mel-cepstrum Modulation Spectrum (MCMS), map the time trajectories of the spectral dynamics into a series of slow and fast moving orthogonal co ...

IDIAP, 2003

On Factorizing Spectral Dynamics for Robust Speech Recognition

Hervé Bourlard, Hemant Misra, Vivek Tyagi

2003

On Multi-scale Fourier Transform Analysis of Speech Signals

Hervé Bourlard, Vivek Tyagi

In this paper, we introduce a novel algorithm to perform multi-scale Fourier transform analysis of piecewise stationary signals with application to automatic speech recognition. Such signals are composed of quasi-stationary segments of variable lengths. Th ...

IDIAP, 2003

On Use of Task Independent Training Data in Tandem Feature Extraction

Hynek Hermansky

IDIAP, 2003

Evaluation of Formant-Like Features for ASR

Hervé Bourlard, Samy Bengio, Katrin Weber

This paper investigates possibilities to automatically find a low-dimensional, formant-related physical representation of the speech signal, which is suitable for automatic speech recognition (ASR). This aim is motivated by the fact that formants have been ...

IDIAP, 2002

Evaluation of Formant-Like Features for ASR

Hervé Bourlard, Samy Bengio, Katrin Weber

2002

Speech Recognition Using Advanced HMM2 Features

Hervé Bourlard, Samy Bengio, Katrin Weber

HMM2 is a particular hidden Markov model where state emission probabilities of the temporal (primary) HMM are modeled through (secondary) state-dependent frequency-based HMMs [12]. As shown in [13], a secondary HMM can also be used to extract robust ASR fe ...

IDIAP, 2001