Multilayer Perceptron Based Hierarchical Acoustic Modeling for Automatic Speech Recognition
Related publications (76)
The goal of this thesis is to improve current state-of-the-art techniques in speaker verification (SV), typically based on "identity-vectors" (i-vectors) and deep neural network (DNN), by exploiting diverse (phonetic) information extracted using variou ...
In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...
In this paper, we introduce a novel approach for Language Identification (LID). Two commonly used state-of-the-art methods based on UBM/GMM I-vector technique, combined with a back-end classifier, are first evaluated. The differential factor between these ...
Over these last few years, the use of Artificial Neural Networks (ANNs), now often referred to as deep learning or Deep Neural Networks (DNNs), has significantly reshaped research and development in a variety of signal and information processing tasks. Whi ...
Automatic Gender Recognition (AGR) is the task of identifying the gender of a speaker given a speech signal. Standard approaches extract features like fundamental frequency and cepstral features from the speech signal and train a binary classifier. Inspire ...
We propose a head pose estimation framework that leverages a recent keypoint detection model. More specifically, we apply the convolutional pose machines (CPMs) to input images, extract different types of facial keypoint features capturing appearance in ...
Deep neural networks (DNN) have revolutionized the field of machine learning by providing unprecedented human-like performance in solving many real-world problems such as image or speech recognition. Training of large DNNs, however, is a computationally in ...
Air Navigation Service Providers (ANSPs) are replacing paper flight strips with various digital solutions. The commands issued by air traffic controllers (ATCOs) are then available in computer-readable form. However, those systems require manual cont ...
In Deep Neural Network (DNN) i-vector based speaker recognition systems, acoustic models trained for Automatic Speech Recognition are employed to estimate sufficient statistics for i-vector modeling. The DNN based acoustic model is typically trained on a w ...
In a recent work, we have shown that speaker verification systems can be built where both features and classifiers are directly learned from the raw speech signal with convolutional neural networks (CNNs). In this framework, the training phase also decides ...