Recognising dysarthric speech is a challenging problem, as it differs from typical speech in many respects, such as speaking rate and pronunciation. In the literature, the focus so far has largely been on handling this variability within the framework of HMM/GMM and cross-entropy trained HMM/DNN systems. This paper focuses on the use of state-of-the-art sequence-discriminative training, in particular lattice-free maximum mutual information (LF-MMI), for improving dysarthric speech recognition. Through a systematic investigation on the Torgo corpus, we demonstrate that LF-MMI performs well on such atypical data and compensates much better for the low speaking rates of dysarthric speakers than conventionally trained systems. This can be attributed to inherent aspects of current speech recognition training regimes, such as frame subsampling and speed perturbation, which obviate the need for some techniques previously adopted specifically for dysarthric speech.
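To make the two training-regime ingredients named above concrete, the sketch below illustrates resample-based speed perturbation (the common 3-way 0.9/1.0/1.1 recipe) and factor-3 frame subsampling. This is a minimal illustrative implementation using NumPy, not the paper's or Kaldi's actual code; the function names and the linear-interpolation resampler are assumptions for illustration only.

```python
import numpy as np

def speed_perturb(waveform, factor):
    """Resample-based speed perturbation: factor < 1 slows the signal
    down (longer output), factor > 1 speeds it up (shorter output),
    changing both tempo and pitch, as in the usual 3-way recipe."""
    n_out = int(round(len(waveform) / factor))
    # Positions in the input signal at which each output sample is taken
    src = np.linspace(0, len(waveform) - 1, n_out)
    return np.interp(src, np.arange(len(waveform)), waveform)

def subsample_frames(features, factor=3):
    """Frame subsampling as used in LF-MMI 'chain' training: keep every
    `factor`-th feature frame, reducing the output frame rate."""
    return features[::factor]

# Toy example: a 1-second 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
wav = np.sin(2 * np.pi * 440 * t)

for f in (0.9, 1.0, 1.1):
    print(f, len(speed_perturb(wav, f)))
```

A slowed-down copy (factor 0.9) of a typical speaker's utterance resembles a lower speaking rate, which is one intuition for why such augmentation helps with slow dysarthric speech.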