Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
In this paper, we explore how different acoustic modeling techniques can benefit from data in languages other than the target language. We propose an algorithm to perform decision tree state clustering for the recently proposed Kullback-Leibler divergence based hidden Markov models (KL-HMM) and compare it to subspace Gaussian mixture modeling (SGMM). KL-HMM can exploit multilingual information in the form of universal phoneme posterior features and SGMM benefits from a universal background model that can be trained on multilingual data. Taking the Greek SpeechDat(II) data as an example, we show that KL-HMM performs best for small amounts of target language data.
Jan Skaloud, Davide Antonio Cucci, Kenneth Joseph Paul