Publication

Syllable-based Pitch Encoding for Low Bit Rate Speech Coding with Recognition/Synthesis Architecture

Related publications (52)

On Compressibility of Neural Network Phonological Features for Low Bit Rate Speech Coding

Hervé Bourlard, Milos Cernak, Afsaneh Asaei

Phonological features extracted by neural network have shown interesting potential for low bit rate speech vocoding. The span of phonological features is wider than the span of phonetic features, and thus fewer frames need to be transmitted. Moreover, the ...

2015

Incremental Syllable-Context Phonetic Vocoding

Petr Motlicek, Philip Neil Garner, Milos Cernak

Current very low bit rate speech coders are, due to complexity limitations, designed to work off-line. This paper investigates incremental speech coding that operates real-time and incrementally (i.e., encoded speech depends only on already-uttered speech ...

Idiap, 2015

Incremental Syllable-Context Phonetic Vocoding

Petr Motlicek, Philip Neil Garner, Milos Cernak

2015

A simple continuous excitation model for parametric vocoding

Philip Neil Garner, Milos Cernak

We describe a continuous-pitch parametric vocoder suitable for speech coding and statistical text to speech synthesis. The spectral model is based on linear prediction. We show that glottal modelling techniques from recent literature can be cherry-picked t ...

Idiap, 2015

Syllabic Pitch Tuning for Neutral-to-Emotional Voice Conversion

Milos Cernak, Lakshmi Babu Saheer

Prosody plays an important role in both identification and synthesis of emotionalized speech. Prosodic features like pitch are usually estimated and altered at a segmental level based on short windows of speech (where the signal is expected to be quasi-sta ...

Idiap, 2015

Speech vocoding for laboratory phonology

Milos Cernak

In this paper, we propose a platform based on phonological speech vocoding for examining relations between phonology and speech processing, and in broader terms, between the abstract and physical structures of speech signal. The goal of this paper is to go ...

Idiap, 2015

On Learning Grapheme-to-Phoneme Relationships through the Acoustic Speech Signal

Ramya Rasipuram

Automatic speech recognition (ASR) systems, through use of the phoneme as an intermediary unit representation, split the problem of modeling the relationship between the written form, i.e., the text and the acoustic speech signal into two disjoint processe ...

2014

Joint Phoneme Segmentation Inference and Classification using CRFs

Ronan Collobert, Dimitri Palaz

State-of-the-art phoneme sequence recognition systems are based on hybrid hidden Markov model/artificial neural networks (HMM/ANN) framework. In this framework, the local classifier, ANN, is typically trained using Viterbi expectation-maximization algorith ...

2014

Improving Grapheme-based ASR by Probabilistic Lexical Modeling Approach

Ramya Rasipuram

There is growing interest in using graphemes as subword units, especially in the context of the rapid development of hidden Markov model (HMM) based automatic speech recognition (ASR) system, as it eliminates the need to build a phoneme pronunciation lexic ...

2013

Improving Grapheme-based ASR by Probabilistic Lexical Modeling Approach

Ramya Rasipuram

Idiap, 2013