Criminal investigations require the manual intervention of several investigators and translators. However, the amount and diversity of the data collected raise many challenges, and cross-border investigations against organized crime can quickly become impossible ...
Feature extraction is a key step in many machine learning and signal processing applications. For speech signals in particular, it is important to derive features that contain both the vocal characteristics of the speaker and the content of the speech. In ...
The SNR spectrum was previously introduced as a natural consequence of using cepstral normalisation in speech recognition; it is closely related to the articulation index of Fletcher. Motivated initially by a theoretical difficulty in frequency warping, ...
This thesis deals with signal-based methods that predict how listeners perceive speech quality in telecommunications. Such tools, called objective quality measures, are of great interest in the telecommunications industry to evaluate how new or deployed sy ...
This paper presents a novel approach to predicting the intrusiveness of background noises in speech signals as it is perceived by human listeners. This problem is of particular interest in telephony, where the recently widened range of transmitted audio fr ...
Is it possible to predict the intrusiveness of background noise in speech signals as perceived by humans? Such a question is important to the automatic evaluation of speech enhancement systems, including those designed for new wideband speech telephony, an ...
The development of an Automatic Speech Recognition (ASR) system for the bilingual MediaParl corpus is challenging for several reasons: (1) reverberant recordings, (2) accented speech, and (3) no prior information about the language. In that context, we emp ...
To study the effects of any changes to a room or setting on the room acoustics, a framework was developed that enables immediate acoustic and visual feedback to the user. This is achieved by running interactive room acoustics simulations and auralizations ...
The log-energy parameter, typically derived from a full-band spectrum, is a critical feature commonly used in automatic speech recognition (ASR) systems. However, log-energy is difficult to estimate reliably in the presence of background noise. In this pap ...
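As a rough illustration of the log-energy parameter mentioned above, the following is a minimal sketch of a per-frame log-energy computed from the full-band power spectrum. It is not the method of the paper: the function name `frame_log_energy`, the frame length, the hop size, and the energy floor are all assumptions chosen for the example.

```python
import numpy as np


def frame_log_energy(signal, frame_len=400, hop=160, floor=1e-10):
    """Per-frame log-energy from the full-band power spectrum.

    frame_len, hop and floor are illustrative defaults (hypothetical),
    not values taken from the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        # Not enough samples for a single full frame.
        return np.empty(0)

    n_frames = 1 + (len(signal) - frame_len) // hop
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        power = np.abs(np.fft.fft(frame)) ** 2
        # By Parseval's theorem, summing the power spectrum over all bins
        # and dividing by the frame length equals the time-domain energy.
        energies[i] = power.sum() / frame_len

    # Floor the energy so that silent frames do not produce log(0).
    return np.log(np.maximum(energies, floor))
```

Because the energy is summed over every frequency bin, any additive background noise contributes directly to the sum, which is one way to see why the estimate degrades in noisy conditions.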