Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.
DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.
Far-field automatic speech recognition (ASR) of conversational speech is often considered to be a very challenging task due to the poor quality of alignments available for training the DNN acoustic models. A common way to alleviate this problem is to use c ...
In air traffic control rooms around the world paper flight strips are replaced through different digital solutions. This enables other systems to access the instructed air traffic controller (ATCo) commands and use them for other purposes. Digital flight s ...
Automatic Speech Recognition (ASR) has recently proved to be a useful tool to reduce the workload of air traffic controllers leading to significant gains in operational efficiency. Air Traffic Control (ATC) systems in operation rooms around the world gener ...
Phonological classes define articulatory-free and articulatory-bound phone attributes. Deep neural network is used to estimate the probability of phonological classes from the speech signal. In theory, a unique combination of phone attributes form a phonem ...
Research in the area of automatic speaker verification (ASV) has advanced enough for the industry to start using ASV systems in practical applications. However, these systems are highly vulnerable to spoofing or presentation attacks (PAs), limiting their w ...
The SNR spectrum was previously introduced as a natural consequence of using cepstral normalisa-
tion in speech recognition; it is closely related to the articulation index of Fletcher. Motivated initially
by a theoretical difficulty in frequency warping, ...
In a recent work, we have shown that speaker verification systems can be built where both features and classifiers are directly learned from the raw speech signal with convolutional neural networks (CNNs). In this framework, the training phase also decides ...
We propose a novel multi-task neural network-based approach for joint sound source localization and speech/non-speech classification in noisy environments. The network takes raw short time Fourier transform as input and outputs the likelihood values for th ...
Certain brain disorders, resulting from brainstem infarcts, traumatic brain injury, stroke and amyotrophic lateral sclerosis, limit verbal communication despite the patient being fully aware. People that cannot communicate due to neurological disorders wou ...
Modeling directly raw waveform through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different t ...