Person

Pranay Dighe

This person is no longer with EPFL

Related publications (11)

On quantifying the quality of acoustic models in hybrid DNN-HMM ASR

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

We propose an information-theoretic framework for quantitative assessment of acoustic models used in hidden Markov model (HMM) based automatic speech recognition (ASR). The HMM backend expects that (i) the acoustic model yields accurate state conditional e ...
ELSEVIER 2020

Sparse and Low-rank Modeling for Automatic Speech Recognition

Pranay Dighe

This thesis deals with exploiting the low-dimensional multi-subspace structure of speech towards the goal of improving acoustic modeling for automatic speech recognition (ASR). Leveraging the parsimonious hierarchical nature of speech, we hypothesize that ...
EPFL 2019

Low-rank and sparse subspace modeling of speech for DNN based acoustic modeling

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

Towards the goal of improving acoustic modeling for automatic speech recognition (ASR), this work investigates the modeling of senone subspaces in deep neural network (DNN) posteriors using low-rank and sparse modeling approaches. While DNN posteriors are ...
ELSEVIER SCIENCE BV 2019

Far-field ASR Using Low-rank and Sparse Soft Targets from Parallel Data

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

Far-field automatic speech recognition (ASR) of conversational speech is often considered to be a very challenging task due to the poor quality of alignments available for training the DNN acoustic models. A common way to alleviate this problem is to use c ...
IEEE 2018

Low-rank and Sparse Soft Targets to Learn Better DNN Acoustic Models

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

Conventional deep neural networks (DNN) for speech acoustic modeling rely on Gaussian mixture models (GMM) and hidden Markov models (HMM) to obtain binary class labels as the targets for DNN training. Subword classes in speech recognition systems correspond ...
IEEE 2017

Sparse Modeling of Neural Network Posterior Probabilities for Exemplar-based Speech Recognition

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

In this paper, a compressive sensing (CS) perspective on exemplar-based speech processing is proposed. Relying on an analytical relationship between the CS formulation and statistical speech recognition (hidden Markov models, HMM), the automatic speech recognit ...
2016

Exploiting Low-dimensional Structures to Enhance DNN based Acoustic Modeling in Speech Recognition

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse represen ...
IEEE 2016

Exploiting Low-dimensional Structures to Enhance DNN based Acoustic Modeling in Speech Recognition

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

We propose to model the acoustic space of deep neural network (DNN) class-conditional posterior probabilities as a union of low-dimensional subspaces. To that end, the training posteriors are used for dictionary learning and sparse coding. Sparse representa ...
ICASSP 2016

Sparse Hidden Markov Models for Exemplar-based Speech Recognition Using Deep Neural Network Posterior Features

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

Statistical speech recognition has been cast as a natural realization of the compressive sensing problem in this work. The compressed acoustic observations are sub-word posterior probabilities obtained from a deep neural network. Dictionary learning and sp ...
Idiap 2016

Low-Rank Representation of Nearest Neighbor Phone Posterior Probabilities to Enhance DNN Acoustic Modeling

Hervé Bourlard, Afsaneh Asaei, Pranay Dighe

We hypothesize that optimal deep neural network (DNN) class-conditional posterior probabilities live in a union of low-dimensional subspaces. In real test conditions, DNN posteriors encode uncertainties which can be regarded as a superposition of unstruct ...
Idiap 2016
