This paper describes a multimodal person verification system using speech and frontal face images. We consider two different speaker verification algorithms, a text-independent method using a second-order statistical measure and a text-dependent method based on hidden Markov modelling, as well as a face verification technique using a robust form of correlation. Fusion of the different recognition modules is performed by a Support Vector Machine classifier. Experimental results obtained on the audio-visual database XM2VTS for individual modalities and their combinations show that multimodal systems yield better performance than individual modules in all cases.
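The fusion step can be illustrated with a minimal sketch. It assumes that each module outputs one scalar score per identity claim and that the fusion classifier is a linear SVM trained by batch subgradient descent on the hinge loss; the abstract specifies neither the kernel nor the training procedure. The score values, the function names `train_linear_svm` and `fused_decision`, and the hyperparameters are all hypothetical and are not taken from XM2VTS.

```python
# Toy score-level fusion with a linear SVM (hinge loss, L2 penalty),
# trained by plain batch subgradient descent. All numbers are made up.

# Each row: (text-independent speaker score, text-dependent speaker score,
# face score); a higher value means "more like the claimed identity".
CLIENT_SCORES = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.4), (0.4, 0.7, 0.9), (0.7, 0.3, 0.8)]
IMPOSTOR_SCORES = [(0.2, 0.1, 0.3), (0.1, 0.3, 0.2), (0.6, 0.2, 0.1), (0.3, 0.2, 0.6)]


def train_linear_svm(samples, labels, lam=0.001, lr=0.05, epochs=20000):
    """Minimise lam/2 * ||w||^2 + mean(max(0, 1 - y * (w.x + b)))."""
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # sample violates the margin
                for i in range(dim):
                    grad_w[i] -= y * x[i] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fused_decision(w, b, scores):
    """Accept the identity claim when the fused score is positive."""
    return sum(wi * si for wi, si in zip(w, scores)) + b > 0.0


samples = CLIENT_SCORES + IMPOSTOR_SCORES
labels = [1] * len(CLIENT_SCORES) + [-1] * len(IMPOSTOR_SCORES)
w, b = train_linear_svm(samples, labels)
```

In this toy data, none of the three scores alone separates clients from impostors with a single threshold, while a weighted combination does. That is only meant to show why a learned fusion rule can help; it says nothing about the reported results.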
Christophe René Joseph Ecabert
Touradj Ebrahimi, Lin Yuan, Xiao Pu, Yao Zhang, Hongbo Li