Advances in Automatic Speech Recognition (ASR) over the last decade have opened new areas of speech-based automation, such as Air-Traffic Control (ATC) environments. Currently, voice communication and Controller Pilot Data Link Communications are the only means of contact between pilots and Air-Traffic Controllers (ATCo). The former is the most widely used, while the latter is a non-speech method that is mandatory for oceanic messages and limited to some domestic issues. ASR systems in ATCo environments face added complexity from the accents of non-native English speakers, cockpit noise, speaker-dependent biases and small in-domain ATC databases for training. In this paper, we review the latest advances related to ASR on ATCo communication. Then, we introduce CleanSky EC H2020 ATCO2, a project that aims to develop a platform to collect, organize and automatically pre-process ATCo data from the airspace. We apply transfer learning from an out-of-domain corpus coupled with adaptation on seven command-related corpora. The acoustic modelling is based on conventional TDNN-HMMs trained with the lattice-free MMI objective function. The developed ASR system achieves a relative improvement in word error rate of 29% when using transfer learning and an additional 36% when adapting the model with seven command-related databases. These results were obtained on the Vienna database of the EC H2020 SESAR project MALORCA.
Petr Motlicek, Juan Pablo Zuluaga Gomez, Amrutha Prasad
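The abstract reports its gains as relative improvements in word error rate (WER). As a minimal sketch of how such figures are usually computed, the Python below derives WER from a word-level edit distance and then the relative improvement between two systems. The function names, the sample utterance and the WER values in the usage example are illustrative assumptions, not the authors' scoring tool or their measured results.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical ATC-style utterance, not taken from any corpus.
    ref = "lufthansa one two three descend flight level eight zero"
    hyp = "lufthansa one two tree descend level eight zero"
    print(f"WER: {word_error_rate(ref, hyp):.3f}")

    # Hypothetical WER values chosen only to illustrate the arithmetic.
    print(f"Relative improvement: {relative_improvement(0.10, 0.071):.0%}")
```

Note that a relative improvement depends on the baseline it is measured against, so two successive relative gains do not simply add up to a single overall figure.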