We present a feature selection method based on information theoretic measures, targeted at multimodal signal processing, showing how we can quantitatively assess the relevance of features from different modalities. We are able to find the features with the ...
Speaker detection is an important component of a speech-based user interface. Audiovisual speaker detection, speech and speaker recognition, or speech synthesis, for example, find multiple applications in human-computer interaction, multimedia content indexin ...
The enormous growth of digital music databases has led to a comparable growth in the need for methods that help users organize and access such information. One area in particular that has seen much recent research activity is the use of automated technique ...
In this paper, we investigate the significance of contextual information in a phoneme recognition system using the hidden Markov model/artificial neural network paradigm. Contextual information is probed at the feature level as well as at the output of t ...
In this paper, a novel statistical generative model to describe a face is presented and applied to the face authentication task. Classical generative models used so far in face recognition, such as Gaussian Mixture Models (GMM) and Hidden Markov Models ...
Most state-of-the-art automatic speech recognition (ASR) systems deal with noise in the environment by extracting noise robust features which are subsequently modelled by a Hidden Markov Model (HMM). A limitation of this feature-based approach is that the ...
Models dealing directly with the raw acoustic speech signal are an alternative to conventional feature-based HMMs. A popular way to model the raw speech signal is by means of an autoregressive (AR) process. Being too simple to cope with the nonlinearity of ...
In this report, we propose a statistical model to deal with discrete-distribution data that varies over time. The proposed model -- HMM+DM -- extends the Dirichlet mixture model to the dynamic case: Hidden Markov Model with Dirichlet mixture output. Both t ...
This paper presents an effective implementation of detection-localization of multiple speech sources with microphone arrays. In particular, the Scaled Conjugate Gradient descent is used for fast and precise localization, within a pre-detected volume of spa ...
This paper presents experiments aiming at improving the automatic transcription of oral presentations through the inclusion of the slides in the recognition process. The experiments are performed over a data set of around three hours of material (~33 kword ...