Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Speech is a complex signal produced by a highly constrained articulation machinery. Neuro and psycholinguistic theories assert that speech can be decomposed into molecules of structured atoms. Although characterization of the atoms is controversial, the ex ...
Phonological classes define articulatory-free and articulatory-bound phone attributes. Deep neural network is used to estimate the probability of phonological classes from the speech signal. In theory, a unique combination of phone attributes form a phonem ...
In the last decade, i-vector and Joint Factor Analysis (JFA) approaches to speaker modeling have become ubiquitous in the area of automatic speaker recognition. Both of these techniques involve the computation of posterior probabilities, using either Gauss ...
Speech is the most natural means of communication for humans. Therefore, since the beginning of computers it has been a goal to interact with machines via speech. While there have been gradual improvements in this field over the decades, and with recent dr ...
Modeling directly raw waveform through neural networks for speech processing is gaining more and more attention. Despite its varied success, a question that remains is: what kind of information are such neural networks capturing or learning for different t ...
We present a novel method for the compensation of long duration data loss in audio signals, in particular music. The concealment of such signal defects is based on a graph that encodes signal structure in terms of time-persistent spectral similarity. A sui ...
Speech-to-speech translation is a framework which recognises speech in an input language, translates it to a target language and synthesises speech in this target language. In such a system, variations in the speech signal which are inherent to natural hum ...
In a recent work, we have shown that speaker verification systems can be built where both features and classifiers are directly learned from the raw speech signal with convolutional neural networks (CNNs). In this framework, the training phase also decides ...
In this work, we consider the problem of estimating the coefficients of linear shift-invariant FIR graph filters. We assume hybrid node-varying graph filters where the network is decomposed into clusters of nodes and, within each cluster, all nodes have th ...
The SNR spectrum was previously introduced as a natural consequence of using cepstral normalisa-
tion in speech recognition; it is closely related to the articulation index of Fletcher. Motivated initially
by a theoretical difficulty in frequency warping, ...