This paper provides an in-depth analysis of how language mismatch affects the performance of cross-lingual speaker adaptation. Our work confirms that language mismatch between the average voice distributions used for synthesis and those used for transform estimation influences adaptation, and that this mismatch must be eliminated in order to use multiple transforms effectively for cross-lingual speaker adaptation. Specifically, we show that language mismatch introduces unwanted language-specific information when multiple transforms are estimated, which makes these transforms detrimental to adaptation performance. Our analysis demonstrates that speaker characteristics should be separated from language characteristics in order to improve cross-lingual adaptation performance.
Tatiana Pieloni, Nicolas Frank Mounet, Christophe Emmanuel R. Lannoy