This document describes a neural method for clustering words and its use in language modeling for speech recognizers. The method clusters words that appear in similar local contexts and estimates the language-model parameters from these clusters. The resulting language model is similar to traditional n-gram models.
Vinitra Swamy, Paola Mejia Domenzain, Julian Thomas Blackwell, Isadora Alves de Salles
Mathias Josef Payer, Zhiyao Feng, Chunmin Zhang, Ji Shi
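
To illustrate how cluster-based parameter estimation can resemble a traditional n-gram, the following is a minimal sketch of a class-based bigram model. It assumes the common factorization P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), which the abstract does not state explicitly. The hand-written `word2class` map stands in for the output of the clustering step, and the function names, toy corpus, and unsmoothed maximum-likelihood estimates are all illustrative assumptions rather than the method described here.

```python
from collections import Counter

def train_class_bigram(sentences, word2class):
    """Collect the counts needed for maximum-likelihood class-bigram estimates."""
    class_bigrams = Counter()   # count of (previous class, current class) pairs
    class_context = Counter()   # count of each class used as a bigram history
    word_counts = Counter()     # count of each word
    class_counts = Counter()    # count of each class over all tokens
    for sent in sentences:
        classes = [word2class[w] for w in sent]
        for w, c in zip(sent, classes):
            word_counts[w] += 1
            class_counts[c] += 1
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
    return class_bigrams, class_context, word_counts, class_counts

def prob(word, prev_word, word2class, model):
    """Return P(class | previous class) * P(word | class), without smoothing."""
    class_bigrams, class_context, word_counts, class_counts = model
    c_prev, c = word2class[prev_word], word2class[word]
    if class_context[c_prev] == 0 or class_counts[c] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c)] / class_context[c_prev]
    p_word = word_counts[word] / class_counts[c]
    return p_class * p_word

# Toy corpus and a hypothetical clustering (assumed, not learned here).
sentences = [
    ["the", "cat", "sleeps"],
    ["a", "dog", "sleeps"],
    ["the", "dog", "runs"],
]
word2class = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "dog": "NOUN",
    "sleeps": "VERB", "runs": "VERB",
}
model = train_class_bigram(sentences, word2class)

# "a cat" never occurs in the corpus, yet it receives nonzero probability
# because "a" and "cat" share classes with words seen in that order.
p_a_cat = prob("cat", "a", word2class, model)
print(p_a_cat)
```

In this sketch the word pair "a cat" is unseen, but the model still assigns it probability through the shared classes, which is the usual motivation for estimating parameters over clusters instead of individual words when training data is sparse.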