Speech sounds can be characterized by articulatory features. Articulatory features are typically estimated using a set of multilayer perceptrons (MLPs), i.e., a separate MLP is trained for each articulatory feature. In this report, we investigate a multitask learning (MTL) approach for the joint estimation of articulatory features, with and without phoneme classification as a subtask. The effect of the number of subtasks in MTL is studied by selecting two different articulatory feature representations. Our studies show that an MTL MLP can estimate articulatory features compactly and efficiently by learning the inter-feature dependencies through a common hidden-layer representation, irrespective of the number of subtasks. Furthermore, adding phoneme classification as a subtask while estimating articulatory features improves both articulatory feature estimation and phoneme recognition. On the TIMIT phoneme recognition task, articulatory feature posterior probabilities obtained by the MTL MLP achieve a phoneme recognition accuracy of 73.8%, while the phoneme posterior probabilities achieve an accuracy of 74.2%.
Subrahmanya Pavankumar Dubagunta
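To illustrate the architecture described in the abstract, below is a minimal sketch of a multitask MLP with a single shared hidden layer and one output head per subtask (each articulatory feature, plus optionally phoneme classification). It assumes a PyTorch-style implementation; the layer sizes, feature names, and class counts are illustrative assumptions, not values taken from the report.

```python
# Minimal sketch: multitask MLP with a shared hidden layer and per-subtask heads.
# Hyperparameters and subtask dimensions below are illustrative, not from the report.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultitaskMLP(nn.Module):
    def __init__(self, input_dim=351, hidden_dim=1000, subtask_dims=None):
        super().__init__()
        # One shared hidden layer learns a representation common to all subtasks,
        # capturing inter-feature dependencies.
        self.shared = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.Sigmoid())
        # One linear output head per subtask (hypothetical feature sets shown here).
        subtask_dims = subtask_dims or {
            "manner": 6, "place": 10, "voicing": 3, "phoneme": 39,
        }
        self.heads = nn.ModuleDict(
            {name: nn.Linear(hidden_dim, dim) for name, dim in subtask_dims.items()}
        )

    def forward(self, x):
        h = self.shared(x)
        # Return logits per subtask; softmax over these gives posterior probabilities.
        return {name: head(h) for name, head in self.heads.items()}


if __name__ == "__main__":
    model = MultitaskMLP()
    x = torch.randn(8, 351)  # a batch of acoustic feature vectors
    targets = {
        "manner": torch.randint(0, 6, (8,)),
        "place": torch.randint(0, 10, (8,)),
        "voicing": torch.randint(0, 3, (8,)),
        "phoneme": torch.randint(0, 39, (8,)),
    }
    logits = model(x)
    # Training sums the cross-entropy losses of all subtasks, so the shared
    # hidden layer is updated jointly by every task.
    loss = sum(F.cross_entropy(logits[k], targets[k]) for k in targets)
    loss.backward()
    posteriors = {k: F.softmax(v, dim=-1) for k, v in logits.items()}
    print(float(loss), posteriors["phoneme"].shape)
```

In this sketch, dropping the "phoneme" entry from `subtask_dims` yields the variant without phoneme classification as a subtask, while keeping it corresponds to the configuration that improved both articulatory feature estimation and phoneme recognition in the report.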