Using Pitch as Prior Knowledge in Template-Based Speech Recognition
Publications associées (145)
Graph Chatbot
Chattez avec Graph Search
Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.
AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.
Phonological studies suggest that the typical subword units such as phones or phonemes used in automatic speech recognition systems can be decomposed into a set of features based on the articulators used to produce the sound. Most of the current approaches ...
Speaker diarization is the task of identifying ``who spoke when'' in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostical studies on state-of-the-art diarization sys ...
Objective assessment of synthetic speech intelligibility can be a useful tool for the development of text-to-speech (TTS) systems, as it provides a reproducible and inexpensive alternative to subjective listening tests. In a recent work, it was shown that ...
This work demonstrates an application of different real-time speech technologies, exploited in an online gaming scenario. The game developed for this purpose is inspired by the famous television based quiz-game show, “Who wants to be a millionaire”, in whi ...
Automatic speech recognition (ASR) systems, through use of the phoneme as an intermediary unit representation, split the problem of modeling the relationship between the written form, i.e., the text and the acoustic speech signal into two disjoint processe ...
The prosody of the speech signal carries both linguistic and paralinguistic information. As such, there is a necessity of its modelling for the purpose of integrating it in speech technology systems. So far, there has been a multitude of proposed models fo ...
Objective assessment of synthetic speech intelligibility can be a useful tool for the development of text-to-speech (TTS) systems, as it provides a reproducible and inexpensive alternative to subjective listening tests. In a recent work, it was shown that ...
Automatic processing of multiparty interactions is a research domain with important applications in content browsing, summarization and information retrieval. In recent years, several works have been devoted to find regular patterns which speakers exhibit ...
Automatic non-native accent assessment has many potential benefits in language learning and speech technologies. The three fundamental challenges in automatic accent assessment are to characterize, model and assess individual variation in speech of the non ...
In this paper, we propose a platform based on phonological speech vocoding for examining relations between phonology and speech processing, and in broader terms, between the abstract and physical structures of speech signal. The goal of this paper is to go ...