Fast keyword detection with sparse time-frequency models

We address the problem of keyword spotting in continuous speech streams when training and testing conditions may differ. We propose a keyword spotting algorithm based on a sparse representation of speech signals in a time-frequency feature space. The training speech elements are jointly represented in a common subspace built on simple basis functions. The subspace is trained to capture the time-frequency structures shared by different occurrences of the keywords to be spotted. The keyword spotting algorithm then slides a window over the speech stream: it computes the contribution of each successive speech segment in the subspace of interest and evaluates its similarity with the training data. Experimental results on the TIMIT database show the effectiveness and noise resilience of this low-complexity spotting algorithm.
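
As a rough illustration of the sliding-window idea described above, the following Python sketch scores each window of a time-frequency matrix by the fraction of its energy that falls inside a subspace learned from keyword examples. It is not the algorithm of the paper: the subspace here comes from a plain truncated SVD, swapped in for the sparse joint representation over simple basis functions, and the names fit_subspace, spot, rank, width and hop are hypothetical.

```python
import numpy as np

def fit_subspace(examples, rank):
    """Return an orthonormal basis (columns) for the dominant directions
    of the flattened training patches. Assumption: truncated SVD stands in
    for the sparse subspace training described in the abstract."""
    X = np.stack([e.ravel() for e in examples], axis=1)  # (features, n_examples)
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]

def spot(stream, basis, width, hop=1):
    """Slide a window of `width` frames over `stream` (freq_bins x frames)
    and score each window by the share of its energy captured by `basis`."""
    scores = []
    for start in range(0, stream.shape[1] - width + 1, hop):
        seg = stream[:, start:start + width].ravel()
        energy = seg @ seg
        coeffs = basis.T @ seg  # contribution of the segment in the subspace
        scores.append((coeffs @ coeffs) / energy if energy > 0 else 0.0)
    return np.array(scores)

# Synthetic usage: a random "keyword" patch hidden in a noisy stream.
rng = np.random.default_rng(0)
keyword = rng.standard_normal((16, 10))
training = [keyword + 0.1 * rng.standard_normal((16, 10)) for _ in range(5)]
basis = fit_subspace(training, rank=2)

stream = 0.3 * rng.standard_normal((16, 100))
stream[:, 40:50] += keyword
scores = spot(stream, basis, width=10)
print("best window starts at frame", int(np.argmax(scores)))
```

Each window costs one matrix-vector product against a small basis, which is in the spirit of the low complexity the abstract emphasizes. A real system would replace the synthetic matrices with actual time-frequency features and pick a detection threshold on held-out data.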

Sparse Autoencoders for Speech Modeling and Recognition

Automatic pathological speech assessment

AM-FM DECOMPOSITION OF SPEECH SIGNAL: APPLICATIONS FOR SPEECH PRIVACY AND DIAGNOSIS
