Concept

Speech synthesis

Related publications (611)

Training a Filter-Based Model of the Cochlea in the Context of Pre-Trained Acoustic Models

Philip Neil Garner

Auditory research aims in general to lead to understanding of physiological processes. By contrast, the state of the art in automatic speech processing (notably recognition) is dominated by large pre-trained models that are meant to be used as black-boxes. ...
2024

On matching data and model in LF-MMI-based dysarthric speech recognition

Enno Hermann

In light of steady progress in machine learning, automatic speech recognition (ASR) is entering more and more areas of our daily life, but people with dysarthria and other speech pathologies are left behind. Their voices are underrepresented in the trainin ...
EPFL, 2023

Text as a Richer Source of Supervision in Semantic Segmentation Tasks

Devis Tuia, Valérie Zermatten, Javiera Francisca Castillo Navarro, Lloyd Haydn Hughes

This paper introduces TACOSS, a text-image alignment approach that allows explainable land cover semantic segmentation by directly integrating semantic concepts encoded from texts. TACOSS combines convolutional neural networks for visual feature extraction ...
The Institute of Electrical and Electronics Engineers, Inc., 2023

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

Julian David Fritsch

Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...
EPFL, 2023

Can Self-Supervised Neural Networks Pre-Trained on Human Speech distinguish Animal Callers?

Mathew Magimai Doss, Eklavya Sarkar

Self-supervised learning (SSL) models use only the intrinsic structure of a given signal, independent of its acoustic domain, to extract essential information from the input to an embedding space. This implies that the utility of such representations is no ...
ISCA, 2023

A multimodal measurement of the impact of deepfakes on the ethical reasoning and affective reactions of students

Touradj Ebrahimi, Patrick Jermann, Roland John Tormey, Cécile Hardebolle, Vivek Ramachandran, Nihat Kotluk

Deepfakes - synthetic videos generated by machine learning models - are becoming increasingly sophisticated. While they have several positive use cases, their potential for harm is also high. Deepfake production involves input from multiple engineers, maki ...
2023

Sparse Autoencoders for Speech Modeling and Recognition

Selen Hande Kabil

Speech recognition-based applications upon the advancements in artificial intelligence play an essential role to transform most aspects of modern life. However, speech recognition in real-life conditions (e.g., in the presence of overlapping speech, varyin ...
EPFL, 2023

On the Privacy-Robustness-Utility Trilemma in Distributed Learning

Rachid Guerraoui, Nirupam Gupta, John Stephan, Youssef Allouah, Rafaël Benjamin Pinot

The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied ...
2023

Deep Generative Models for Autonomous Driving: from Motion Forecasting to Realistic Image Synthesis

Saeed Saadatnejad

Forecasting is a capability inherent in humans when navigating. Humans routinely plan their paths, considering the potential future movements of those around them. Similarly, to achieve comparable sophistication and safety, autonomous systems must embrace ...
EPFL, 2023

Automatic Multi-Robot Control Design and Optimization Leveraging Multi-Level Modeling: An Exploration Case Study

Alcherio Martinoli, Cyrill Silvan Baumann, Wakana Endo

More specifically, a behavior-based controller for a multi-robot exploration scenario is automatically synthesized using a predefined set of basic behaviors and conditions. A key feature of the used synthesis approach is the tailored use of two modeling le ...
Elsevier, 2023
