Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Automatic recognition of gestures using computer vision is important for many real-world applications such as sign language recognition and human-robot interaction (HRI). Our goal is a real-time hand gesture-based HRI interface for mobile robots. We use a ...
In this paper we propose a new technique for robust keyword spotting that uses bidirectional Long Short-Term Memory (BLSTM) recurrent neural nets to incorporate contextual information in speech decoding. Our approach overcomes the drawbacks of generative H ...
In this paper, we investigate employment of discriminatively trained acoustic features modeled by Subspace Gaussian Mixture Models (SGMMs) for Rich Transcription meeting recognition. More specifically, first, we focus on exploiting various types of complex ...
We address the problem of recognizing sequences of human interaction patterns in meetings, with the goal of structuring them in semantic terms. The investigated patterns, are inherently group-based (defined by the individual activities of meeting participa ...
The objective of this thesis is to develop probabilistic graphical models for analyzing human interaction in meetings based on multimodel cues. We use meeting as a study case of human interactions since research shows that high complexity information is mo ...
In this paper, to learn multiple tasks sharing same inputs, a two-layer architecture for a reservoir based recurrent neural network is proposed. The inputs are fed into the general workspace layer where the weights are adapted to provide maximum informatio ...
We address the problem of clustering multimodal group actions in meetings using a two-layer HMM framework. Meetings are structured as sequences of group actions. Our approach aims at creating one cluster for each group action, where the number of group act ...
In this thesis, we investigate a hierarchical approach for estimating the phonetic class-conditional probabilities using a multilayer perceptron (MLP) neural network. The architecture consists of two MLP classifiers in cascade. The first MLP is trained in ...
The objective of this thesis is to develop probabilistic graphical models for analyzing human interaction in meetings based on multimodel cues. We use meeting as a study case of human interactions since research shows that high complexity information is mo ...
Juicer is a decoder for HMM-based large vocabulary speech recognition that uses a weighted finite state transducer (WFST) representation of the search space. The package consists of a number of command line utilities: the Juicer decoder itself, along with ...