Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Speaker verification (SV) on portable devices like smartphones is gradually becoming popular. In this context, two issues need to be considered: 1) such devices have relatively limited computation resources, and 2) they are liable to be used everywhere, possibly in very noisy, uncontrolled environments. This work aims to address both these issues by proposing a computationally efficient yet robust SV system. This novel parts-based system draws inspiration from face and object detection systems in the computer vision domain. The system involves boosted ensembles of simple threshold-based classifiers. It uses a novel set of features extracted from speech spectra, called "slice features." The performance of the proposed system was evaluated through extensive studies involving a wide range of experimental conditions using the TIMIT, HTIMIT, and MOBIO corpus, against standard cepstral features and Gaussian Mixture Model-based SV systems.
Edoardo Charbon, Paul Mos, Mohit Gupta