Publication

Towards using slide information to enhance speech transcription of meetings

Sector-Based Detection for Hands-Free Speech Enhancement in Cars

Guillaume Lathoud

Speech-based command interfaces are becoming more and more common in cars. Applications include automatic dialog systems for hands-free phone calls as well as more advanced features such as navigation systems. However, interferences, such as speech from th ...

IDIAP, 2004

Noisy Text Categorization

Alessandro Vinciarelli

This work presents categorization experiments performed over noisy texts. By noisy it is meant any text obtained through an extraction process (affected by errors) from media other than digital texts (e.g. transcriptions of speech recordings extracted with ...

IDIAP, 2004

Effect of Recognition Errors on Information Retrieval Performance

Alessandro Vinciarelli

This work shows experiments on the retrieval of handwritten documents. The performance of the same state-of-the-art Information Retrieval system is compared when dealing with manual (no errors) and automatic (Word Error Rate around 50%) transcriptions of t ...

IDIAP, 2004

Effect of Recognition Errors on Information Retrieval Performance

Alessandro Vinciarelli

2004

Clustering And Segmenting Speakers And Their Locations In Meetings

Guillaume Lathoud, Jitendra Ajmera

This paper presents a new approach toward automatic annotation of meetings in terms of speaker identities and their locations. This is achieved by segmenting the audio recordings using two independent sources of information: magnitude spectrum analysis an ...

2004

An Online Audio Indexing System

Hervé Bourlard, Jitendra Ajmera

This paper presents an overview of an online audio indexing system, which creates a searchable index of speech content embedded in digitized audio files. This system is based on our recently proposed offline audio segmentation techniques. As the data arrives ...

IDIAP, 2003

Information Retrieval on Noisy Text

Hervé Bourlard, David Grangier, Alessandro Vinciarelli

Spoken Document Retrieval (SDR) consists in retrieving segments of a speech database that are relevant to a query. The state-of-the-art approach to the SDR problem consists in transcribing the speech data into digital text before applying common Informatio ...

IDIAP, 2003

Clustering And Segmenting Speakers And Their Locations In Meetings

Guillaume Lathoud, Jitendra Ajmera

IDIAP, 2003

An information theoretic measure of sequence recognition performance

Sequence recognition performance is often summarised first in terms of the number of hits (H), substitutions (S), deletions (D) and insertions (I), and then as a single statistic by the "word error rate" WER = 100(S + D + I)/(H + S + D). While in common use, WER h ...

IDIAP, 2002
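
The word error rate mentioned in the abstract above is 100 times the number of substitutions, deletions and insertions, divided by the reference length (hits plus substitutions plus deletions). A minimal Python sketch of that arithmetic follows; the function name `word_error_rate` and its parameter names are illustrative assumptions, not taken from the paper, and the counts are assumed to come from an alignment computed elsewhere.

```python
def word_error_rate(hits: int, substitutions: int, deletions: int, insertions: int) -> float:
    """Return WER in percent: 100 * (S + D + I) / (H + S + D).

    Hypothetical helper for illustration; the four counts are assumed to
    come from an alignment of the recognised sequence against the reference.
    """
    # The reference length is the number of reference words: H + S + D.
    reference_length = hits + substitutions + deletions
    if reference_length == 0:
        raise ValueError("reference must contain at least one word")
    errors = substitutions + deletions + insertions
    return 100.0 * errors / reference_length


# Example: 6 hits, 2 substitutions, 2 deletions, 1 insertion.
example_wer = word_error_rate(hits=6, substitutions=2, deletions=2, insertions=1)
print(example_wer)  # 50.0
```

Because insertions appear only in the numerator, the value is not bounded by 100: one substitution and two insertions against a one-word reference give 300.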

A Pragmatic View of the Application of HMM2 for ASR

Hervé Bourlard, Samy Bengio, Katrin Weber

This report investigates the HMM2 approach recently introduced in the framework of automatic speech recognition. HMM2 can be seen as a mixture of HMMs, where a conventional primary HMM (processing a time series of speech data) is supported on a lower level ...

IDIAP, 2001