Jean-Marc Odobez, Petr Motlicek, Weipeng He
This paper introduces a novel approach for extracting speaker embeddings from audio mixtures of multiple overlapping voices. This approach is based on a multi-task neural network. The network first extracts a latent feature for each direction. This feature ...
ISCA-INT SPEECH COMMUNICATION ASSOC, 2021