Unsupervised graph representation learning aims to learn low-dimensional node embeddings without supervision while preserving graph topological structures and node attributive features. Previous Graph Neural Networks (GNN) require a large number of labeled ...
Training deep neural network based Automatic Speech Recognition (ASR) models often requires thousands of hours of transcribed data, limiting their use to only a few languages. Moreover, current state-of-the-art acoustic models are based on the Transformer ...
While deep neural networks are state-of-the-art models of many parts of the human visual system, here we show that they fail to process global information in a humanlike manner. First, using visual crowding as a probe into global visual information process ...
In communication systems, it is crucial to estimate the perceived quality of audio and speech. The industrial standards for many years have been PESQ, 3QUEST, and POLQA, which are intrusive methods. This restricts the possibilities of using these metrics i ...
State-of-the-art acoustic models for Automatic Speech Recognition (ASR) are based on Hidden Markov Models (HMM) and Deep Neural Networks (DNN) and often require thousands of hours of transcribed speech data during training. Therefore, building multilingual ...
In this paper, we trace the history of neural networks applied to natural language understanding tasks, and identify key contributions which the nature of language has made to the development of neural network architectures. We focus on the importance of v ...
A common pattern of progress in engineering has seen deep neural networks displacing human-designed logic. There are many advantages to this approach, divorcing decisionmaking from human oversight and intuition has costs as well. One is that deep neural ne ...
In this supplementary material, we present the details of the neural network architecture and training settings used in all our experiments. This holds for all experiments presented in the main paper as well as in this supplementary material. We also show ...
This thesis deals with exploiting the low-dimensional multi-subspace structure of speech towards the goal of improving acoustic modeling for automatic speech recognition (ASR). Leveraging the parsimonious hierarchical nature of speech, we hypothesize that ...
Language independent query-by-example spoken term detection (QbE-STD) is the problem of retrieving audio documents from an archive, which contain a spoken query provided by a user. This is usually casted as a hypothesis testing and pattern matching problem ...
EPFL2019
Graph Chatbot
Chat with Graph Search
Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.