Publication

Incremental Syllable-Context Phonetic Vocoding

Related publications (61)

Effective post-processing for single-channel frequency-domain speech enhancement

Weifeng Li

Conventional frequency-domain speech enhancement filters improve signal-to-noise ratio (SNR), but also produce speech distortions. This paper describes a novel post-processing algorithm devised for the improvement of the quality of the speech processed by ...

IDIAP2007

Truncation Confusion Patterns in Onset Consonants

Confusion matrices and truncation experiments have long been a part of psychoacoustic experimentation. However, confusion matrices are seldom used to analyze truncation experiments. A truncation experiment was conducted and the confusion patterns were analy ...

IDIAP2007

Automatic Speech Recognition for Human-Machine Interaction

Pierre-André Farine, Michael Ansorge, Sara Grassi Pauletti

Since the sixties, movies such as “2001: A Space Odyssey” have familiarized us with the idea of computers that can speak and hear just as a human being does. Automatic speech recognition (ASR) is the technology that allows machines to interpret human sp ...

2005

Short-Term Spatio-Temporal Clustering of Sporadic and Concurrent Events

Jean-Marc Odobez, Guillaume Lathoud

Accurate detection and segmentation of spontaneous multi-party speech is crucial for a variety of applications, including speech acquisition and recognition, as well as higher-level event recognition. However, the highly sporadic nature of spontaneous spee ...

IDIAP2004

Unsupervised Location-Based Segmentation of Multi-Party Speech

Jean-Marc Odobez, Guillaume Lathoud

2004

An Online Audio Indexing System

Hervé Bourlard, Jitendra Ajmera

This paper presents an overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives ...

2004

An Online Audio Indexing System

Hervé Bourlard, Jitendra Ajmera

IDIAP2003

Some applications of a priori knowledge in multi-stream HMM and HMM/ANN based ASR

Multi-band ASR was largely inspired by the extremely high level of redundancy in the spectral signal representation which can be inferred from Fletcher's product-of-errors rule for human speech perception. Indeed, the main aim of the multi-band approach is ...

2000

Relating LPC modeling to a factor-based articulatory model

Sacha Krstulovic

This paper proposes a method for recovering the articulatory parameters of a factor-based vocal tract shape model from the speech waveform. This is realized by analytically relating the shape model to a Linear Prediction lattice filter. Results pertaining ...

2000

Extraction of Articulators in X-Ray Image Sequences

We describe a method for tracking tongue, lips, and throat in X-ray films showing the side-view of the vocal tract. The technique uses specialized histogram normalization techniques and a new tracking method that is robust against occlusion, noise, and spo ...

1999