Publications related to Phoneme

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...

EPFL, 2023

On matching data and model in LF-MMI-based dysarthric speech recognition

Enno Hermann

In light of steady progress in machine learning, automatic speech recognition (ASR) is entering more and more areas of our daily life, but people with dysarthria and other speech pathologies are left behind. Their voices are underrepresented in the trainin ...

EPFL, 2023

Phonetic aware techniques for Speaker Verification

Subhadeep Dey

The goal of this thesis is to improve current state-of-the-art techniques in speaker verification (SV), typically based on "identity-vectors" (i-vectors) and deep neural network (DNN), by exploiting diverse (phonetic) information extracted using variou ...

EPFL, 2018

Perceptual Information Loss due to Impaired Speech Production

Hervé Bourlard, Afsaneh Asaei, Milos Cernak

Phonological classes define articulatory-free and articulatory-bound phone attributes. A deep neural network is used to estimate the probability of phonological classes from the speech signal. In theory, a unique combination of phone attributes forms a phonem ...

2017

On Modeling the Synergy Between Acoustic and Lexical Information for Pronunciation Lexicon Development

Marzieh Razavi

State-of-the-art automatic speech recognition (ASR) and text-to-speech systems require a pronunciation lexicon that maps each word to a sequence of phones. Manual development of lexicons is costly as it needs linguistic knowledge and human expertise. To fa ...

EPFL, 2017

Acoustic data-driven grapheme-to-phoneme conversion in the probabilistic lexical modeling framework

Mathew Magimai Doss, Ramya Rasipuram, Marzieh Razavi

One of the primary steps in building automatic speech recognition (ASR) and text-to-speech systems is the development of a phonemic lexicon that provides a mapping between each word and its pronunciation as a sequence of phonemes. Phoneme lexicons can be d ...

2016

"Can you hear me now?" : Automatic assessment of background noise intrusiveness and speech intelligibility in telecommunications

Raphaël Marc Ullmann

This thesis deals with signal-based methods that predict how listeners perceive speech quality in telecommunications. Such tools, called objective quality measures, are of great interest in the telecommunications industry to evaluate how new or deployed sy ...

EPFL, 2016

On Learning Grapheme-to-Phoneme Relationships through the Acoustic Speech Signal

Mathew Magimai Doss, Ramya Rasipuram

Automatic speech recognition (ASR) systems, through use of the phoneme as an intermediary unit representation, split the problem of modeling the relationship between the written form, i.e., the text and the acoustic speech signal into two disjoint processe ...

2014

Grapheme-based Automatic Speech Recognition using Probabilistic Lexical Modeling

Ramya Rasipuram

Automatic speech recognition (ASR) systems incorporate expert knowledge of language or linguistic expertise through the use of a phone pronunciation lexicon (or dictionary) where each word is associated with a sequence of phones. The creation of phone pr ...

EPFL, 2014

Sparsity Averaging for Compressive Imaging

Jean-Philippe Thiran, Dimitri Nestor Alice Van De Ville, Yves Wiaux, Rafael Eduardo Carrillo Rangel, Jason Douglas McEwen

We discuss a novel sparsity prior for compressive imaging in the context of the theory of compressed sensing with coherent redundant dictionaries, based on the observation that natural images exhibit strong average sparsity over multiple coherent frames. W ...

IEEE Institute of Electrical and Electronics Engineers, 2013

Applying Multi- and Cross-Lingual Stochastic Phone Space Transformations to Non-Native Speech Recognition

Hervé Bourlard, Mathew Magimai Doss, Philip Neil Garner, John David Scott Dines, David Imseng

In the context of hybrid HMM/MLP Automatic Speech Recognition (ASR), this paper describes an investigation into a new type of stochastic phone space transformation, which maps "source" phone (or phone HMM state) posterior probabilities (as obtained at the ...

IEEE-Inst Electrical Electronics Engineers Inc, 2013

Overcoming Asynchrony in Audio-Visual Speech Recognition

Jean-Philippe Thiran, Virginia Estellers Casas

In this paper we propose two alternatives to overcome the natural asynchrony of modalities in Audio-Visual Speech Recognition. We first investigate the use of asynchronous statistical models based on Dynamic Bayesian Networks with different levels of async ...

2010