AM-FM DECOMPOSITION OF SPEECH SIGNAL: APPLICATIONS FOR SPEECH PRIVACY AND DIAGNOSIS
Related publications (92)
The goal of this thesis is to improve current state-of-the-art techniques in speaker verification (SV), typically based on "identity-vectors" (i-vectors) and deep neural networks (DNNs), by exploiting diverse (phonetic) information extracted using variou ...
The papers in this special issue are intended to address some of the main research challenges in Graph Signal Processing by presenting a collection of the latest advances in the domain. These papers examine key representation, learning and processing aspec ...
Directly modeling the raw waveform with neural networks for speech processing is gaining increasing attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different t ...
Air Navigation Service Providers (ANSPs) are replacing paper flight strips with different digital solutions. The commands instructed by air traffic controllers (ATCOs) are then available in computer-readable form. However, those systems require manual cont ...
In air traffic control rooms, paper flight strips are increasingly being replaced by digital solutions. The digital systems, however, increase the workload for air traffic controllers: for instance, each voice command must be manually inserted into the system b ...
Automatic speaker verification systems can be spoofed through recorded, synthetic or voice-converted speech of target speakers. To make these systems practically viable, the detection of such attacks, referred to as presentation attacks, is of paramount in ...
This paper introduces a new task termed low-latency speaker spotting (LLSS). Related to security and intelligence applications, the task involves the detection, as soon as possible, of known speakers within multi-speaker audio streams. The paper describes ...
We address the problem of automatically predicting group performance on a task, using multimodal features derived from the group conversation. These include acoustic features extracted from the speech signal, and linguistic features derived from the conver ...
ASSOC COMPUTING MACHINERY, 2018
Automatic speaker verification systems can be spoofed through recorded, synthetic or voice-converted speech of target speakers. To make these systems practically viable, the detection of such attacks, referred to as presentation attacks, is of paramount in ...
2017
The speech signal conveys information on different time scales, from the short (20–40 ms) or segmental time scale, associated with phonological and phonetic information, to the long (150–250 ms) or suprasegmental time scale, associated with syllabic and prosodic info ...