Publications related to Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Development of Bilingual ASR System for MediaParl Corpus

Petr Motlicek, Milos Cernak, David Imseng

The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we emp ...

Idiap, 2014

Combining Vocal Tract Length Normalization with Hierarchical Linear Transformations

Philip Neil Garner, John David Scott Dines, Lakshmi Babu Saheer

Recent research has demonstrated the effectiveness of vocal tract length normalization (VTLN) as a rapid adaptation technique for statistical parametric speech synthesis. VTLN produces speech with naturalness preferable to that of MLLR-based adaptation tec ...

2014

A Savitzky-Golay Filtering Perspective of Dynamic Feature Computation

Mathew Magimai Doss, Chandra Sekhar Seelamantula

We address the classical problem of delta feature computation, and interpret the operation involved in terms of Savitzky-Golay (SG) filtering. Features such as the mel-frequency cepstral coefficients (MFCCs), obtained based on short-time spectra of the spe ...

2013

Unified Framework of Feature Based Adaptation for Statistical Speech Synthesis and Recognition

Lakshmi Babu Saheer

The advent of statistical parametric speech synthesis has paved new ways to a unified framework for hidden Markov model (HMM) based text to speech synthesis (TTS) and automatic speech recognition (ASR). The techniques and advancements made in the field of ...

EPFL, 2013

Syllable-based Pitch Encoding for Low Bit Rate Speech Coding with Recognition/Synthesis Architecture

Philip Neil Garner, Milos Cernak

Current HMM-based low bit rate speech coding systems work with phonetic vocoders. Pitch contour coding (on frame or phoneme level) is usually fairly orthogonal to other speech coding parameters. We make an assumption in our work that the speech signal cont ...

Idiap, 2013

Source/Filter Factorial Hidden Markov Model, with Application to Pitch and Formant Tracking

Jean-Philippe Thiran

Tracking vocal tract formant frequencies (f_p) and estimating the fundamental frequency (f_0) are two tracking problems that have been tackled in many speech processing works, often independently, with applications to articulatory parameters estimation ...

Institute of Electrical and Electronics Engineers, 2013

A Simple Continuous Pitch Estimation Algorithm

Petr Motlicek, Philip Neil Garner, Milos Cernak

Recent work in text to speech synthesis has pointed to the benefit of using a continuous pitch estimate; that is, one that records pitch even when voicing is not present. Such an approach typically requires interpolation. The purpose of this paper is to sh ...

2013

Unified Framework Of Feature Based Adaptation For Statistical Speech Synthesis And Recognition

Lakshmi Babu Saheer

The advent of statistical parametric speech synthesis has paved new ways to a unified framework for hidden Markov model (HMM) based text to speech synthesis (TTS) and automatic speech recognition (ASR). The techniques and advancements made in the field of ...

Ecole Polytechnique Federale de Lausanne (EPFL), 2012

Multi-parametric source-filter separation of speech and prosodic voice restoration

Olaf Schleusing

In this thesis, methods and models are developed and presented aiming at the estimation, restoration and transformation of the characteristics of human speech. During a first period of the thesis, a concept was developed that allows restoring prosodic voic ...

EPFL, 2012

Sparse Non-Negative Decomposition of Speech Power Spectra for Formant Tracking

Jean-Philippe Thiran

Many works on speech processing have dealt with auto-regressive (AR) models for spectral envelope and formant frequency estimation, mostly focusing on the estimation of the AR parameters. However, it is also interesting to be able to directly estimate the ...

IEEE Service Center, 445 Hoes Lane, PO Box 1331, Piscataway, NJ 08855-1331, USA, 2011
