Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
The speech signal conveys information on different time scales from short (20–40 ms) time scale or segmental, associated to phonological and phonetic information to long (150–250 ms) time scale or supra segmental, associated to syllabic and prosodic inform ...
The speech signal conveys information on different time scales from short (20--40 ms) time scale or segmental, associated to phonological and phonetic information to long (150--250 ms) time scale or supra segmental, associated to syllabic and prosodic info ...
For a long time, natural language processing (NLP) has relied on generative models with task specific and manually engineered features. Recently, there has been a resurgence of interest for neural networks in the machine learning community, obtaining state ...
Speaker diarization is the task of identifying ``who spoke when'' in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostical studies on state-of-the-art diarization sys ...
Speaker diarization is the task of identifying “who spoke when” in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostical studies on state-of-the-art diarization syste ...
Decoding speech from intracranial recordings serves two main purposes: understanding the neural correlates of speech processing and decoding speech features for targeting speech neuroprosthetic devices. Intracranial recordings have high spatial and tempora ...
Since the prosody of a spoken utterance carries information about its discourse function, salience, and speaker attitude, prosody mod- els and prosody generation modules have played a crucial part in text-to- speech (TTS) synthesis systems from the beginni ...
This paper provides an overview of the Speaker Anti-spoofing Competition organized by Biometric group at Idiap Research Institute for the IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2016). The competition used AVspo ...
This paper provides an overview of the Speaker Anti-spoofing Competition organized by Biometric group at Idiap Research Institute for the IEEE International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2016). The competition used AVspo ...
Phonological studies suggest that the typical subword units such as phones or phonemes used in automatic speech recognition systems can be decomposed into a set of features based on the articulators used to produce the sound. Most of the current approaches ...