The full-combination multi-band approach has been proposed in the literature and performs well for band-limited noise, but it fails to deliver in the case of wide-band noise. To overcome this, multi-stream approaches have been proposed in the literature, with varying degrees of success. We observe that, for a classifier trained on clean speech, the entropy at the classifier output increases when noise is present at its input. We therefore use entropy as a confidence measure for weighting each classifier's output. In this paper, we propose a new entropy-based combination strategy for the full-combination multi-stream approach, in which each stream is weighted in inverse proportion to the output entropy of its classifier. A few variations of this basic approach are also suggested. The proposed combination methods yield better word error rates (WERs) across different noise types and signal-to-noise ratios (SNRs). We also observe an interesting relationship between the WER performance of the different combination methods and their respective entropies.
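The inverse-entropy weighting described above can be illustrated with a minimal sketch. It assumes that each stream's classifier emits a posterior distribution over classes for a given frame, that the weights are normalized to sum to one, and that the streams are combined by a weighted linear sum of posteriors. The abstract does not specify these details, so the function names, the `eps` guard, and the example values are hypothetical.

```python
import math


def entropy(posteriors):
    """Shannon entropy (in nats) of one posterior distribution."""
    return -sum(p * math.log(p) for p in posteriors if p > 0.0)


def inverse_entropy_weights(stream_posteriors, eps=1e-12):
    """Weight each stream in inverse proportion to its output entropy.

    eps is an assumed guard against division by zero when a
    classifier output has zero entropy (a one-hot posterior).
    """
    inverse = [1.0 / (entropy(p) + eps) for p in stream_posteriors]
    total = sum(inverse)
    # Assumption: weights are normalized so that they sum to one.
    return [value / total for value in inverse]


def combine_streams(stream_posteriors):
    """Assumed combination rule: weighted linear sum of posteriors."""
    weights = inverse_entropy_weights(stream_posteriors)
    n_classes = len(stream_posteriors[0])
    return [
        sum(w * p[k] for w, p in zip(weights, stream_posteriors))
        for k in range(n_classes)
    ]


# Hypothetical posteriors for one frame from two streams.
confident_stream = [0.90, 0.05, 0.05]  # low entropy
uncertain_stream = [0.40, 0.30, 0.30]  # high entropy, e.g. a noisy input

weights = inverse_entropy_weights([confident_stream, uncertain_stream])
combined = combine_streams([confident_stream, uncertain_stream])
```

In this sketch the low-entropy stream receives the larger weight, so the combined posterior lies closer to the output of the more confident classifier.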
Sabine Süsstrunk, Radhakrishna Achanta, Mahmut Sami Arpa, Martin Nicolas Everaert, Athanasios Fitsios