Publication

Exploiting Low-dimensional Structures to Enhance DNN based Acoustic Modeling in Speech Recognition

Related publications (52)

Dynamic Model Pruning with Feedback

Martin Jaggi, Sebastian Urban Stich, Luis Felipe Barba Flores, Tao Lin, Daniil Dmitriev

Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only due to high memory requirements but also because of increased latency at inference. We propose a novel model compression method that gener ...

2020

A Graph Signal Processing Framework for the Classification of Temporal Brain Data

Sarah Itani

Graph Signal Processing (GSP) addresses the analysis of data living on an irregular domain which can be modeled with a graph. This capability is of great interest for the study of brain connectomes. In this case, data lying on the nodes of the graph are co ...

IEEE, 2020

Low-rank and sparse subspace modeling of speech for DNN based acoustic modeling

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

Towards the goal of improving acoustic modeling for automatic speech recognition (ASR), this work investigates the modeling of senone subspaces in deep neural network (DNN) posteriors using low-rank and sparse modeling approaches. While DNN posteriors are ...

ELSEVIER SCIENCE BV, 2019

Sparse and Low-rank Modeling for Automatic Speech Recognition

Pranay Dighe

This thesis deals with exploiting the low-dimensional multi-subspace structure of speech towards the goal of improving acoustic modeling for automatic speech recognition (ASR). Leveraging the parsimonious hierarchical nature of speech, we hypothesize that ...

EPFL, 2019

Deep Micro-Dictionary Learning and Coding Network

Yan Yan, Wei Wang, Wei Xiao

In this paper, we propose a novel Deep Micro-Dictionary Learning and Coding Network (DDLCN). DDLCN has most of the standard deep learning layers (pooling, fully connected, input/output, etc.) but the main difference is that the fundamental convolutional l ...

IEEE, 2019

Language Independent Query by Example Spoken Term Detection

Dhananjay Ram

Language independent query-by-example spoken term detection (QbE-STD) is the problem of retrieving audio documents from an archive, which contain a spoken query provided by a user. This is usually cast as a hypothesis testing and pattern matching problem ...

EPFL, 2019

End-to-End Acoustic Modeling using Convolutional Neural Networks for HMM-based Automatic Speech Recognition

Ronan Collobert, Dimitri Palaz

In hidden Markov model (HMM) based automatic speech recognition (ASR) systems, modeling the statistical relationship between the acoustic speech signal and the HMM states that represent linguistically motivated subword units such as phonemes is a crucial st ...

ELSEVIER SCIENCE BV, 2019

Sparse Subspace Modeling for Query by Example Spoken Term Detection

Hervé Bourlard, Afsaneh Asaei, Dhananjay Ram

This paper focuses on the problem of query by example spoken term detection (QbE-STD) in a zero-resource scenario. Current state-of-the-art approaches to tackle this problem rely on dynamic programming based template matching techniques using phone posterior ...

2018

Evolution of Neural Network Architectures for Speech Recognition

Hervé Bourlard

Over these last few years, the use of Artificial Neural Networks (ANNs), now often referred to as deep learning or Deep Neural Networks (DNNs), has significantly reshaped research and development in a variety of signal and information processing tasks. Whi ...

ISCA-INT SPEECH COMMUNICATION ASSOC, 2018

Phonetic aware techniques for Speaker Verification

Subhadeep Dey

The goal of this thesis is to improve current state-of-the-art techniques in speaker verification (SV), typically based on "identity-vectors" (i-vectors) and deep neural network (DNN), by exploiting diverse (phonetic) information extracted using variou ...

EPFL, 2018