Publication

VTLN Adaptation for Statistical Speech Synthesis

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.

Sparse Autoencoders for Speech Modeling and Recognition

Selen Hande Kabil

Speech recognition-based applications upon the advancements in artificial intelligence play an essential role to transform most aspects of modern life. However, speech recognition in real-life conditions (e.g., in the presence of overlapping speech, varyin ...

EPFL2023

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

Julian David Fritsch

Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...

EPFL2023

On matching data and model in LF-MMI-based dysarthric speech recognition

Enno Hermann

In light of steady progress in machine learning, automatic speech recognition (ASR) is entering more and more areas of our daily life, but people with dysarthria and other speech pathologies are left behind. Their voices are underrepresented in the trainin ...

EPFL2023

Explanation of Face Recognition via Saliency Maps

Touradj Ebrahimi, Yuhang Lu

Despite the significant progress in recent years, deep face recognition is often treated as a "black box" and has been criticized for lacking explainability. It becomes increasingly important to understand the characteristics and decisions of deep face rec ...

2023

How to Boost Face Recognition with StyleGAN?

Nikita Durasov

State-of-the-art face recognition systems require vast amounts of labeled training data. Given the priority of privacy in face recognition applications, the data is limited to celebrity web crawls, which have issues such as limited numbers of identities. O ...

Ieee Computer Soc2023

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers

Petr Motlicek, Juan Pablo Zuluaga Gomez, Amrutha Prasad

In this paper we propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training by integrating different state-of-the-art artificial intelligence (AI)-based tools. The virtual simulation-pilot engine receives spoken ...

MDPI2023

Can Self-Supervised Neural Networks Pre-Trained on Human Speech distinguish Animal Callers?

Mathew Magimai Doss, Eklavya Sarkar

Self-supervised learning (SSL) models use only the intrinsic structure of a given signal, independent of its acoustic domain, to extract essential information from the input to an embedding space. This implies that the utility of such representations is no ...

ISCA2023

Lausanne Historical Censuses Dataset HTR 35k

Lucas Arnaud André Rappo, Rémi Guillaume Petitpierre, Marion Kramer

This training dataset includes a total of 34,913 manually transcribed text segments. It is dedicated to the handwritten text recognition (HTR) of historical sources, typically tabular records, such as censuses. This dataset is based on a sample of 83 pages ...

Zenodo2023

On Breathing Pattern Information in Synthetic Speech

Mathew Magimai Doss, Zohreh Mostaani

The respiratory system is an integral part of human speech production. As a consequence, there is a close relation between respiration and speech signal, and the produced speech signal carries breathing pattern related information. Speech can also be gener ...

ISCA-INT SPEECH COMMUNICATION ASSOC2022

Automatic pathological speech assessment

Parvaneh Janbakhshi

Many pathologies cause impairments in the speech production mechanism resulting in reduced speech intelligibility and communicative ability. To assist the clinical diagnosis, treatment and management of speech disorders, automatic pathological speech asses ...

EPFL2022