Publication

Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing

Machine Learning for Information Retrieval

David Grangier

In this thesis, we explore the use of machine learning techniques for information retrieval. More specifically, we focus on ad-hoc retrieval, which is concerned with searching large corpora to identify the documents relevant to user queries. This identifica ...

IDIAP, 2008

Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods

Samy Bengio

This is the first book dedicated to uniting research related to speech and speaker recognition based on the recent advances in large margin and kernel methods. The first part of the book presents theoretical and practical foundations of large margin and ke ...

John Wiley & Sons, 2008

Learning sparse generative models of audiovisual signals

Pierre Vandergheynst, Gianluca Monaci

This paper presents a novel framework to learn sparse representations for audiovisual signals. An audiovisual signal is modeled as a sparse sum of audiovisual kernels. The kernels are bimodal functions made of synchronous audio and video components that ...

2008

Discriminative Keyword Spotting

Samy Bengio, David Grangier

This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve, as this quantity is the m ...

2007

Bi-Modal Face and Speech Authentication: a BioLogin Demonstration System

Sébastien Marcel, Yann Rodriguez

This paper presents a bi-modal (face and speech) authentication demonstration system that simulates the login of a user using its face and its voice. This demonstration is called BioLogin. It runs both on Linux and Windows and the Windows version is freely ...

IDIAP, 2006

Robust audio segmentation

Jitendra Ajmera

Audio segmentation, in general, is the task of segmenting a continuous audio stream in terms of acoustically homogenous regions, where the rule of homogeneity depends on the task. This thesis aims at developing and investigating efficient, robust and unsup ...

EPFL, 2005

Multi-stream ASR: Oracle Test and Embedded Training

Hervé Bourlard, Hemant Misra

Multi-stream based automatic speech recognition (ASR) systems outperform their single stream counterparts, especially in the case of noisy speech. However, the main issues in multi-stream systems are to know a) Which streams to be combined, and b) How to c ...

IDIAP, 2005

Semi-supervised Meeting Event Recognition with Adapted HMMs

Daniel Gatica-Perez, Samy Bengio, Dong Zhang

This paper investigates the use of unlabeled data to help labeled data for audio-visual event recognition in meetings. To deal with situations in which it is difficult to collect enough labeled data to capture event characteristics, but collecting a large ...

2005

Semi-supervised Meeting Event Recognition with Adapted HMMs

Daniel Gatica-Perez, Samy Bengio, Dong Zhang

IDIAP, 2005