Publication

Recognition Of Reverberant Speech Using Frequency Domain Linear Prediction

Related publications (67)

ROXANNE Research Platform: Automate criminal investigations

Petr Motlicek, Maël Fabien, Aravind Krishnan

Criminal investigations require manual intervention of several investigators and translators. However, the amount and the diversity of the data collected raise many challenges, and cross-border investigations against organized crime can quickly become impossible ...
ISCA-INT SPEECH COMMUNICATION ASSOC, 2021

Audio Feature Extraction with Convolutional Neural Autoencoders with Application to Voice Conversion

Golnooshsadat Elhami

Feature extraction is a key step in many machine learning and signal processing applications. For speech signals in particular, it is important to derive features that contain both the vocal characteristics of the speaker and the content of the speech. In ...
2019

Combining the SNR Spectrum with a Cochlear Model

Philip Neil Garner

The SNR spectrum was previously introduced as a natural consequence of using cepstral normalisation in speech recognition; it is closely related to the articulation index of Fletcher. Motivated initially by a theoretical difficulty in frequency warping, ...
Idiap, 2018

"Can you hear me now?"

Raphaël Marc Ullmann

This thesis deals with signal-based methods that predict how listeners perceive speech quality in telecommunications. Such tools, called objective quality measures, are of great interest in the telecommunications industry to evaluate how new or deployed sy ...
EPFL, 2016

Predicting the intrusiveness of noise through sparse coding with auditory kernels

Hervé Bourlard, Raphaël Marc Ullmann

This paper presents a novel approach to predicting the intrusiveness of background noises in speech signals as it is perceived by human listeners. This problem is of particular interest in telephony, where the recently widened range of transmitted audio fr ...
2016

Sparse Gammatone Signal Model Predicts Perceived Noise Intrusiveness

Hervé Bourlard, Raphaël Marc Ullmann

Is it possible to predict the intrusiveness of background noise in speech signals as perceived by humans? Such a question is important to the automatic evaluation of speech enhancement systems, including those designed for new wideband speech telephony, an ...
Idiap, 2014

Development of Bilingual ASR System for MediaParl Corpus

Petr Motlicek, Milos Cernak, David Imseng

The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we emp ...
ISCA, 2014

Development of Bilingual ASR System for MediaParl Corpus

Petr Motlicek, Milos Cernak, David Imseng

The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we emp ...
Idiap, 2014

Interactive Real-Time Simulation and Auralization for Modifiable Rooms

Dirk Schröder

To study the effects of any changes to a room or setting on the room acoustics, a framework was developed that enables immediate acoustic and visual feedback to the user. This is achieved by running interactive room acoustics simulations and auralizations ...
2014

Robust Log-Energy Estimation and its Dynamic Change Enhancement for In-car Speech Recognition

Hervé Bourlard, Weifeng Li

The log-energy parameter, typically derived from a full-band spectrum, is a critical feature commonly used in automatic speech recognition (ASR) systems. However, log-energy is difficult to estimate reliably in the presence of background noise. In this pap ...
IEEE-Inst Electrical Electronics Engineers Inc, 2013
