Concept

Speech act

Related publications (15)

Sparse Autoencoders for Speech Modeling and Recognition

Speech recognition-based applications upon the advancements in artificial intelligence play an essential role to transform most aspects of modern life. However, speech recognition in real-life conditions (e.g., in the presence of overlapping speech, varyin ...

EPFL, 2023

Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?

Mathew Magimai Doss, Eklavya Sarkar

Self-supervised learning (SSL) models use only the intrinsic structure of a given signal, independent of its acoustic domain, to extract essential information from the input to an embedding space. This implies that the utility of such representations is no ...

ISCA, 2023

Novel Methods For Detection And Analysis Of Atypical Aspects In Speech

Julian David Fritsch

Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...

EPFL, 2023

On Breathing Pattern Information in Synthetic Speech

Mathew Magimai Doss, Zohreh Mostaani

The respiratory system is an integral part of human speech production. As a consequence, there is a close relation between respiration and speech signal, and the produced speech signal carries breathing pattern related information. Speech can also be gener ...

ISCA-INT SPEECH COMMUNICATION ASSOC, 2022

Automatic pathological speech assessment

Parvaneh Janbakhshi

Many pathologies cause impairments in the speech production mechanism resulting in reduced speech intelligibility and communicative ability. To assist the clinical diagnosis, treatment and management of speech disorders, automatic pathological speech asses ...

EPFL, 2022

Dysarthric Speech Recognition with Lattice-Free MMI

Enno Hermann

Recognising dysarthric speech is a challenging problem as it differs in many aspects from typical speech, such as speaking rate and pronunciation. In the literature the focus so far has largely been on handling these variabilities in the framework of HMM/G ...

IEEE, 2020

Phonetic Subspace Features for Improved Query by Example Spoken Term Detection

Hervé Bourlard, Afsaneh Asaei, Dhananjay Ram

This paper addresses the problem of detecting speech utterances from a large audio archive using a simple spoken query, hence referring to this problem as "Query by Example Spoken Term Detection" (QbE-STD). This still open pattern matching problem has been ...

2018

Intonation Modelling for Speech Synthesis and Emphasis Preservation

Pierre-Edouard Jean Charles Honnet

Speech-to-speech translation is a framework which recognises speech in an input language, translates it to a target language and synthesises speech in this target language. In such a system, variations in the speech signal which are inherent to natural hum ...

EPFL, 2017

Understanding and Decoding Imagined Speech using Electrocorticographic Recordings in Humans

Stéphanie Martin

Certain brain disorders, resulting from brainstem infarcts, traumatic brain injury, stroke and amyotrophic lateral sclerosis, limit verbal communication despite the patient being fully aware. People that cannot communicate due to neurological disorders wou ...

EPFL, 2017

Exploiting sequence information for text-dependent Speaker Verification

Petr Motlicek, Subhadeep Dey

Model-based approaches to Speaker Verification (SV), such as Joint Factor Analysis (JFA), i-vector and relevance Maximum-a-Posteriori (MAP), have shown to provide state-of-the-art performance for text-dependent systems with fixed phrases. The performance o ...

IEEE, 2017

Incremental Syllable-Context Phonetic Vocoding

Petr Motlicek, Philip Neil Garner, Milos Cernak

Current very low bit rate speech coders are, due to complexity limitations, designed to work off-line. This paper investigates incremental speech coding that operates real-time and incrementally (i.e., encoded speech depends only on already-uttered speech ...

2015

Inferior frontal oscillations reveal visuo-motor matching for actions and speech: evidence from human intracranial recordings

Olaf Blanke, Silvio Ionta, Pär Halje

The neural correspondence between the systems responsible for the execution and recognition of actions has been suggested both in humans and non-human primates. Apart from being a key region of this visuo-motor observation-execution matching (OEM) system, ...

Pergamon-Elsevier Science Ltd, 2015

Data-Driven Enhancement of State Mapping-Based Cross-Lingual Speaker Adaptation

Hui Liang

The thesis work was motivated by the goal of developing personalized speech-to-speech translation and focused on one of its key component techniques – cross-lingual speaker adaptation for text-to-speech synthesis. A personalized speech-to-speech translator ...

EPFL, 2012

Current trends in multilingual speech processing

Hervé Bourlard, Mathew Magimai Doss, Petr Motlicek, John David Scott Dines, Philip Neil Garner, David Imseng, Hui Liang, Fabio Valente, Lakshmi Babu Saheer

In this paper, we describe recent work at Idiap Research Institute in the domain of multilingual speech processing and provide some insights into emerging challenges for the research community. Multilingual speech processing has been a topic of ongoing int ...

2011

Perception Studies on the Attributes of Synthetic Clear Speech for the Hard of Hearing

Chandra Sekhar Seelamantula

We make a case for ‘synthetic clear speech’ in the context of the persons with hearing impairment. We study the acoustic attributes of ‘clear speech’ that enable us to understand their importance in speech perception. Our perception experiments are motivat ...

IEEE, 2007