This paper investigates the use of prosody to improve keyword spotting, focusing on Hungarian, a highly agglutinating language for which keyword spotting cannot be performed effectively with large-vocabulary continuous speech recognition (LVCSR): such systems are either unavailable or hard to operate because of high out-of-vocabulary (OOV) rates and poor N-gram language modelling capabilities. The keyword spotting system applied here is therefore based on confidence scores computed as a ratio of acoustic scores obtained in two ways: first, by decoding with a universal background model; and second, by decoding with a keyword model embedded into filler models. Prosody is used to perform an automatic phonological phrase alignment of the speech, which has proven useful for automatic partial word boundary detection in fixed-stress languages. Several features derived from the phonological phrase alignment are investigated for rescoring the baseline confidence scores, both in a rule-based and in a data-driven manner. Results show that, at relevant operating points of the system, false alarms can be reduced by 10% to 40% at the same miss probability.
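As a rough illustration of the ratio-based confidence score described above, the sketch below works in the log domain, where a ratio of acoustic scores becomes a difference of log-likelihoods. The function names, the per-frame normalization, and the threshold are assumptions made for illustration; the abstract does not specify them, and the paper's actual scoring may differ.

```python
def confidence_score(keyword_loglik: float,
                     background_loglik: float,
                     num_frames: int) -> float:
    """Log-domain ratio of two acoustic scores for one keyword hypothesis.

    keyword_loglik:    log-likelihood from decoding with the keyword model
                       embedded into filler models.
    background_loglik: log-likelihood of the same segment from decoding
                       with the universal background model.
    num_frames:        segment length in frames; dividing by it is an
                       assumption, used so that long and short hypotheses
                       are comparable.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return (keyword_loglik - background_loglik) / num_frames


def accept(score: float, threshold: float) -> bool:
    """Accept the hypothesis when its score reaches the threshold.

    The threshold is hypothetical; moving it trades false alarms
    against misses (the operating point).
    """
    return score >= threshold


if __name__ == "__main__":
    score = confidence_score(-120.0, -150.0, 60)
    print(score, accept(score, threshold=0.3))
```

In a setup like this, prosody-based rescoring would adjust `score` before it is thresholded, for example when a hypothesized keyword boundary coincides with a detected phonological phrase boundary.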
Petr Motlicek, Philip Neil Garner, Milos Cernak
Hervé Bourlard, Philip Neil Garner, Milos Cernak, Afsaneh Asaei, Pierre-Edouard Jean Charles Honnet