Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Atypical aspects in speech concern speech that deviates from what is commonly considered normal or healthy. In this thesis, we propose novel methods for detection and analysis of these aspects, e.g. to monitor the temporary state of a speaker, diseases tha ...
The speech signal conveys information on different time scales from short (20–40 ms) time scale or segmental, associated to phonological and phonetic information to long (150–250 ms) time scale or supra segmental, associated to syllabic and prosodic inform ...
Idiap2016
State-of-the-art automatic speech recognition (ASR) and text-to-speech systems require a pronunciation lexicon that maps each word to a sequence of phones. Manual development of lexicons is costly as it needs linguistic knowledge and human expertise. To fa ...
Prosody in speech is manifested by variations of loudness, exaggeration of pitch, and specific phonetic variations of prosodic segments. For example, in the stressed and unstressed syllables, there are differences in place or manner of articulation, vowels ...
Prosody plays an important role in both identification and synthesis of emotionalized speech. Prosodic features like pitch are usually estimated and altered at a segmental level based on short windows of speech (where the signal is expected to be quasi-sta ...
In light of steady progress in machine learning, automatic speech recognition (ASR) is entering more and more areas of our daily life, but people with dysarthria and other speech pathologies are left behind. Their voices are underrepresented in the trainin ...
Automatic speech recognition (ASR) systems, through use of the phoneme as an intermediary unit representation, split the problem of modeling the relationship between the written form, i.e., the text and the acoustic speech signal into two disjoint processe ...
The speech signal conveys information on different time scales from short (20--40 ms) time scale or segmental, associated to phonological and phonetic information to long (150--250 ms) time scale or supra segmental, associated to syllabic and prosodic info ...
Elsevier2016
, ,
The speech signal conveys information on different time scales from short (20--40 ms) time scale or segmental, associated to phonological and phonetic information to long (150--250 ms) time scale or supra segmental, associated to syllabic and prosodic info ...
2016
In this paper, we investigate pitch contour modelling in speech synthesis based on segmental units. A convolutional pitch target approximation model is proposed. This model allows jointly stochastic modelling of framewise pitch and pitch contour of longer ...