Publication

Multi-parametric source-filter separation of speech and prosodic voice restoration

Related publications (211)

Beamforming with a Maximum Negentropy Criterion

In this paper, we address a beamforming application based on the capture of far-field speech data from a single speaker in a real meeting room. After the position of the speaker is estimated by a speaker tracking system, we construct a subband-domain beamf ...

2009

VTLN Adaptation for Statistical Speech Synthesis

Philip Neil Garner, John David Scott Dines, Hui Liang, Lakshmi Babu Saheer

The advent of statistical speech synthesis has enabled the unification of the basic techniques used in speech synthesis and recognition. Adaptation techniques that have been successfully used in recognition systems can now be applied to synthesis systems t ...

Idiap, 2009

Robust Speaker Diarization for Short Speech Recordings

David Imseng

We investigate a state-of-the-art Speaker Diarization system regarding its behavior on meetings that are much shorter (from 500 seconds down to 100 seconds) than those typically analyzed in Speaker Diarization benchmarks. First, the problems inherent to th ...

2009

Robust Speaker Diarization for Short Speech Recordings

David Imseng

Idiap, 2009

Speech/Non-Speech Detection in Meetings from Automatically Extracted Low Resolution Visual Features

Silèye Oumar Ba

In this paper we address the problem of estimating who is speaking from automatically extracted low resolution visual cues from group meetings. Traditionally, the task of speech/non-speech detection or speaker diarization tries to find who speaks and when ...

Idiap, 2009

How does a dictation machine recognize speech?

Hervé Bourlard

There is magic (or is it witchcraft?) in a speech recognizer that transcribes continuous radio speech into text with a word accuracy of even not more than 50%. The extreme difficulty of this task, though, is usually not perceived by the general public. This ...

Idiap, 2008

Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Hynek Hermansky, Sriram Ganapathy, Samuel Thomas

Frequency Domain Linear Prediction (FDLP) provides an efficient way to represent temporal envelopes of a signal using auto-regressive models. For the input speech signal, we use FDLP to estimate temporal trajectories of sub-band energy by applying linear p ...

2008

Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

Hynek Hermansky, Sriram Ganapathy, Samuel Thomas

IDIAP, 2008

Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods

Samy Bengio

This is the first book dedicated to uniting research related to speech and speaker recognition based on the recent advances in large margin and kernel methods. The first part of the book presents theoretical and practical foundations of large margin and ke ...

John Wiley & Sons, 2008

Filter Bank Design for Subband Adaptive Beamforming and Application to Speech Recognition

Philip Neil Garner, Weifeng Li

We present a new filter bank design method for subband adaptive beamforming. Filter bank design for adaptive filtering poses many problems not encountered in more traditional applications such as subband coding of speech or music. The popu ...

IDIAP, 2008
