In this paper we consider the problem of automatic extraction of geometric lip features for the purposes of multi-modal speaker identification. The use of visual information from the mouth region can be of great importance for improving speaker identification performance in noisy conditions. We propose a novel method for automated lip feature extraction that utilizes a color space transformation and a fuzzy c-means clustering technique. Using the obtained visual cues, closed-set audio-visual speaker identification experiments are performed on the CUAVE database [1], showing promising results.
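The abstract names the core pipeline only briefly: transform mouth-region pixels into a color space that separates lips from skin, then cluster the pixels with fuzzy c-means. The sketch below is an illustrative reading of that idea, not the authors' exact method; the pseudo-hue transform, the two-cluster setting, and all parameter values are assumptions made for the example.

```python
# A minimal sketch (not the authors' exact pipeline): fuzzy c-means clustering
# of mouth-region pixels after a color space transformation. The pseudo-hue
# chrominance cue, cluster count, and fuzzifier are illustrative assumptions.
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means on feature vectors X of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)                     # initial membership matrix
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]    # membership-weighted means
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-10
        U_new = 1.0 / (dist ** (2.0 / (m - 1.0)))         # standard FCM update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            U = U_new
            break
        U = U_new
    return centers, U

def segment_lips(rgb_roi):
    """Separate lip pixels from skin in an HxWx3 mouth-region image (assumed input)."""
    rgb = rgb_roi.astype(np.float64) + 1e-6
    # Pseudo-hue emphasizes the redder lip pixels over the surrounding skin.
    pseudo_hue = rgb[..., 0] / (rgb[..., 0] + rgb[..., 1])
    X = pseudo_hue.reshape(-1, 1)
    _, U = fuzzy_c_means(X, c=2)
    labels = U.argmax(axis=1).reshape(rgb_roi.shape[:2])
    # Take the cluster with the higher mean pseudo-hue as the lip class.
    lip_cluster = int(np.argmax([pseudo_hue[labels == k].mean() for k in range(2)]))
    return labels == lip_cluster

# Usage (hypothetical): lip_mask = segment_lips(mouth_roi)  # mouth_roi: HxWx3 uint8 array
```

Geometric lip features (e.g. mouth width, height, or contour area) could then be measured from the resulting binary mask; how the paper derives its specific feature set is not detailed in this abstract.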