The extremely high recognition accuracy achieved by modern face recognition (FR) systems based on convolutional neural networks (CNNs) has contributed significantly to their adoption in a variety of applications, from mundane activities like unlocking phones to high-security applications such as border control. Nonetheless, these systems have been shown to be highly vulnerable to presentation attacks (PAs), also known as spoofing attacks.
A face PA is said to have occurred when a face biometric sample is presented to the camera of an FR system with the intention of interfering with the operation of biometric recognition. An example PA is when someone tries to illicitly access an FR system by presenting a printed face photo of an authorized person to the camera. State-of-the-art face presentation attack detection (PAD) systems, which are also based on CNNs, offer countermeasures to PAs.
Over the past decade, different research groups have collected and publicly shared several datasets for face PAD experiments. It has been shown that most face PAD systems do not generalize well. That is, PAD systems show satisfactory classification performance when they are trained and evaluated on disjoint subsets of the same dataset (known as an intra-dataset evaluation). However, their performance degrades significantly when they are trained using data from one dataset and evaluated using data from another dataset (a cross-dataset evaluation). This poor generalization precludes the deployment of FR systems in many real-world applications.
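The difference between the two protocols can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "PAD system" is a toy one-dimensional threshold classifier, the datasets `A` and `B` are hand-written numbers with a deliberate shift between them, and the helper names (`fit_threshold`, `error_rate`, `evaluation_matrix`) are hypothetical. None of this is taken from the thesis or from any PAD library.

```python
from statistics import mean

def fit_threshold(samples):
    """Toy 'PAD system': place a threshold halfway between the class means.

    samples: list of (feature, label) pairs, label 1 = bona fide, 0 = attack.
    """
    bona_fide = [x for x, y in samples if y == 1]
    attacks = [x for x, y in samples if y == 0]
    return (mean(bona_fide) + mean(attacks)) / 2.0

def error_rate(threshold, samples):
    """Fraction of samples misclassified when 'feature > threshold' means bona fide."""
    errors = sum((x > threshold) != (y == 1) for x, y in samples)
    return errors / len(samples)

def evaluation_matrix(datasets):
    """Train on each dataset's train split and test on every dataset's test split.

    Returns a dict mapping (train_name, test_name) to an error rate.
    Diagonal entries are intra-dataset, off-diagonal entries are cross-dataset.
    """
    results = {}
    for train_name, train_data in datasets.items():
        threshold = fit_threshold(train_data["train"])
        for test_name, test_data in datasets.items():
            results[(train_name, test_name)] = error_rate(threshold, test_data["test"])
    return results

# Hypothetical toy data: dataset B has the same structure as A, shifted by +3.
datasets = {
    "A": {
        "train": [(0.0, 0), (0.5, 0), (1.5, 1), (2.0, 1)],
        "test": [(0.2, 0), (0.4, 0), (1.6, 1), (1.8, 1)],
    },
    "B": {
        "train": [(3.0, 0), (3.5, 0), (4.5, 1), (5.0, 1)],
        "test": [(3.2, 0), (3.4, 0), (4.6, 1), (4.8, 1)],
    },
}

results = evaluation_matrix(datasets)
for (train_name, test_name), err in sorted(results.items()):
    kind = "intra" if train_name == test_name else "cross"
    print(f"train={train_name} test={test_name} ({kind}): error={err:.2f}")
```

In this toy setting the intra-dataset entries have zero error, while the cross-dataset entries misclassify half of the test samples, because the threshold learned on one dataset sits on the wrong side of the other dataset's data. Real PAD evaluations report dedicated metrics rather than a plain error rate; the sketch only shows how the train and test splits are paired.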
In this thesis, I address generalization issues in face PAD systems in three ways:
First, although many CNN architectures have been proposed for face PAD, their classification performance has not previously been evaluated systematically. Here, I evaluate six different CNN architectures on four face PAD datasets in terms of both intra-dataset and cross-dataset performance, and show that patch-based CNN architectures generalize better. Moreover, I propose a novel CNN that analyzes face images at different scales. This multi-scale analysis allows the proposed CNN to generalize better than the baseline CNNs.
Second, I formulate the low cross-dataset performance of PAD as a domain shift problem and investigate domain adaptation methods as a solution. I propose a novel domain adaptation method based on the hypothesis that some learned filters in CNNs are domain specific and do not generalize to other datasets. Pruning these filters leads to higher performance in both intra-dataset and cross-dataset evaluations.
Third, I hypothesize that the variability of face images in an FR dataset is a nuisance factor for face PAD systems. Based on that, I propose to model the variability of face images in an FR dataset explicitly and to induce invariance to this variability in the PAD system. The proposed method shows improvements over the baselines in terms of cross-dataset performance.
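The idea of pruning domain-specific filters can be sketched in a few lines. The abstract does not state how domain-specific filters are identified, so the criterion below is purely an assumption for illustration: a filter is treated as domain specific when its mean activation differs strongly between two datasets. The function names (`domain_gap`, `select_filters_to_prune`, `prune_mask`) and the activation values are hypothetical and are not the method proposed in the thesis.

```python
from statistics import mean

def domain_gap(acts_a, acts_b):
    """Assumed criterion: absolute difference of a filter's mean activation
    on samples from domain A versus samples from domain B."""
    return abs(mean(acts_a) - mean(acts_b))

def select_filters_to_prune(acts_a, acts_b, num_prune):
    """Return the indices of the num_prune filters with the largest domain gap.

    acts_a, acts_b: one list of per-sample activations for each filter.
    """
    gaps = [domain_gap(a, b) for a, b in zip(acts_a, acts_b)]
    ranked = sorted(range(len(gaps)), key=lambda i: gaps[i], reverse=True)
    return sorted(ranked[:num_prune])

def prune_mask(num_filters, pruned):
    """Build a 0/1 mask that would zero out the pruned filters' outputs."""
    return [0.0 if i in pruned else 1.0 for i in range(num_filters)]

# Hypothetical activations of four filters on two samples per domain.
acts_a = [[1.0, 1.2], [0.1, 0.3], [2.0, 2.2], [0.5, 0.5]]
acts_b = [[1.1, 1.1], [3.0, 3.2], [2.1, 2.1], [0.5, 0.7]]

pruned = select_filters_to_prune(acts_a, acts_b, num_prune=1)
mask = prune_mask(len(acts_a), pruned)
print("pruned filters:", pruned)
print("mask:", mask)
```

Here only filter 1 responds very differently on the two domains, so it is the one selected for pruning. In an actual CNN the mask would be applied to the corresponding convolutional channels, and the network would typically be re-evaluated or fine-tuned afterwards.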
Extensive experiments on four recent PAD datasets (Replay-Mobile, OULU-NPU, SWAN, and WMCA) are conducted to support these claims. Overall, generalization in face PAD systems remains a challenge, and more research effort is needed to address this problem. Finally, this thesis is reproducible: complete implementations of the baselines and the proposed methods are made freely available via the machine-learning library Bob.