Publication

Analytic Assessment of Telephone Transmission Impact on ASR Performance Using a Simulation Model

Related publications (33)

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

Julian David Fritsch

Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...

EPFL, 2023

Sparse Autoencoders for Speech Modeling and Recognition

Selen Hande Kabil

Speech recognition-based applications upon the advancements in artificial intelligence play an essential role to transform most aspects of modern life. However, speech recognition in real-life conditions (e.g., in the presence of overlapping speech, varyin ...

EPFL, 2023

On Breathing Pattern Information in Synthetic Speech

Mathew Magimai Doss, Zohreh Mostaani

The respiratory system is an integral part of human speech production. As a consequence, there is a close relation between respiration and speech signal, and the produced speech signal carries breathing pattern related information. Speech can also be gener ...

ISCA-INT SPEECH COMMUNICATION ASSOC, 2022

Novel Methods for Incorporating Prior Knowledge for Automatic Speech Assessment

Subrahmanya Pavankumar Dubagunta

Speech signal conveys several kinds of information such as a message, speaker identity, emotional state of the speaker and social state of the speaker. Automatic speech assessment is a broad area that refers to using automatic methods to predict human judg ...

EPFL, 2021

Deep learning architectures for estimating breathing signal and respiratory parameters from speech recordings

Mathew Magimai Doss, Zohreh Mostaani, Venkata Srikanth Nallanthighal

Respiration is an essential and primary mechanism for speech production. We first inhale and then produce speech while exhaling. When we run out of breath, we stop speaking and inhale. Though this process is involuntary, speech production involves a system ...

PERGAMON-ELSEVIER SCIENCE LTD, 2021

Understanding and Visualizing Raw Waveform-based CNNs

Sébastien Marcel, Hannah Muckenhirn

Modeling directly raw waveforms through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different ...

2019

Gradient-based spectral visualization of CNNs using raw waveforms

Sébastien Marcel, Hannah Muckenhirn

Modeling directly raw waveform through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different t ...

Idiap, 2018

Phonetic aware techniques for Speaker Verification

Subhadeep Dey

The goal of this thesis is to improve current state-of-the-art techniques in speaker verification (SV), typically based on "identity-vectors" (i-vectors) and deep neural network (DNN), by exploiting diverse (phonetic) information extracted using variou ...

EPFL, 2018

Intonation Modelling for Speech Synthesis and Emphasis Preservation

Pierre-Edouard Jean Charles Honnet

Speech-to-speech translation is a framework which recognises speech in an input language, translates it to a target language and synthesises speech in this target language. In such a system, variations in the speech signal which are inherent to natural hum ...

EPFL, 2017

"Can you hear me now?"

Raphaël Marc Ullmann

This thesis deals with signal-based methods that predict how listeners perceive speech quality in telecommunications. Such tools, called objective quality measures, are of great interest in the telecommunications industry to evaluate how new or deployed sy ...

EPFL, 2016