Publication

Phonetic aware techniques for Speaker Verification

Related publications (140)

Multilingual Training and Adaptation in Speech Recognition

Sibo Tong

State-of-the-art acoustic models for Automatic Speech Recognition (ASR) are based on Hidden Markov Models (HMM) and Deep Neural Networks (DNN) and often require thousands of hours of transcribed speech data during training. Therefore, building multilingual ...

EPFL, 2020

AM-FM DECOMPOSITION OF SPEECH SIGNAL: APPLICATIONS FOR SPEECH PRIVACY AND DIAGNOSIS

Petr Motlicek, Hynek Hermansky, Sriram Ganapathy, Amrutha Prasad

Although current trends in speech processing consider deep learning through data-driven technologies, many potential applications exhibit a lack of training or development data. Therefore, considerably light signal processing techniques are still of interest ...

Idiap, 2020

Dysarthric Speech Recognition with Lattice-Free MMI

Enno Hermann

Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/G ...

2020

Language Independent Query by Example Spoken Term Detection

Dhananjay Ram

Language independent query-by-example spoken term detection (QbE-STD) is the problem of retrieving audio documents from an archive, which contain a spoken query provided by a user. This is usually cast as a hypothesis testing and pattern matching problem ...

EPFL, 2019

Sparse and Low-rank Modeling for Automatic Speech Recognition

Pranay Dighe

This thesis deals with exploiting the low-dimensional multi-subspace structure of speech towards the goal of improving acoustic modeling for automatic speech recognition (ASR). Leveraging the parsimonious hierarchical nature of speech, we hypothesize that ...

EPFL, 2019

Tampered Speaker Inconsistency Detection with Phonetically Aware Audio-visual Features

Sébastien Marcel

The recent increase in social media based propaganda, i.e., ‘fake news’, calls for automated methods to detect tampered content. In this paper, we focus on detecting tampering in a video with a person speaking to a camera. This form of manipulation is easy ...

2019

End-to-End Acoustic Modeling using Convolutional Neural Networks for HMM-based Automatic Speech Recognition

Ronan Collobert, Dimitri Palaz

In a hidden Markov model (HMM) based automatic speech recognition (ASR) system, modeling the statistical relationship between the acoustic speech signal and the HMM states that represent linguistically motivated subword units such as phonemes is a crucial st ...

ELSEVIER SCIENCE BV, 2019

Trustworthy speaker recognition with minimal prior knowledge using neural networks

Hannah Muckenhirn

The performance of speaker recognition systems has considerably improved in the last decade. This is mainly due to the development of Gaussian mixture model-based systems and in particular to the use of i-vectors. These systems handle relatively well noise ...

EPFL, 2019

Neural VTLN for Speaker Adaptation in TTS

Philip Neil Garner, Bastian Schnell

Vocal tract length normalisation (VTLN) is well established as a speaker adaptation technique that can work with very little adaptation data. It is also well known that VTLN can be cast as a linear transform in the cepstral domain. Building on this latter ...

2019

Understanding and Visualizing Raw Waveform-based CNNs

Sébastien Marcel, Hannah Muckenhirn

Modeling raw waveforms directly through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different ...

2019