Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
The present thesis is concerned with the development and evaluation (in terms of accuracy and utility) of systems using hand postures and hand gestures for enhanced Human-Computer Interaction (HCI). In our case, these systems are based on vision techniques, thus only requiring cameras, and no other specific sensors or devices. When dealing with hand movements, it is necessary to distinguish two aspects of these hand movements: the static aspect and the dynamic aspect. The static aspect is characterized by a pose or configuration of the hand in an image and is related to the Hand Posture Recognition (HPR) problem. The dynamic aspect is defined either by the trajectory of the hand, or by a series of hand postures in a sequence of images. This second aspect is related to the Hand Gesture Recognition (HGR) task. Given the recognized lack of common evaluation databases in the HGR field, a first contribution of this thesis was the collection and public distribution of two databases, containing both one- and two-handed gestures, which part of the results reported here will be based upon. On these databases, we compare two state-of-the-art models for the task of HGR. As a second contribution, we propose a HPR technique based on a new feature extraction. This method has the advantage of being faster than conventional methods while yielding good performances. In addition, we provide comparison results of this method with other state-of-the-art technique. Finally, the most important contribution of this thesis lies in the thorough study of the state-of-the-art not only in HGR and HPR but also more generally in the field of HCI. The first chapter of the thesis provides an extended study of the state-of-the-art. The second chapter of this thesis contributes to HPR. We propose to apply for HPR a technique employed with success for face detection. This method is based on the Modified Census Transform (MCT) to extract relevant features in images. We evaluate this technique on an existing benchmark database and provide comparison results with other state-of-the-art approaches. The third chapter is related to HGR. In this chapter we describe the first recorded database, containing both one- and two-handed gestures in the 3D space. We propose to compare two models used with success in HGR, namely Hidden Markov Models (HMM) and Input-Output Hidden Markov Model (IOHMM). The fourth chapter is also focused on HGR but more precisely on two-handed gesture recognition. For that purpose, a second database has been recorded using two cameras. The goal of these gestures is to manipulate virtual objects on a screen. We propose to invesitigate on this second database the state-of-the-art sequence processing techniques we used in the previous chapter. We then discuss the results obtained using different features, and using images of one or two cameras. In conclusion, we propose a method for HPR based on new feature extraction. For HGR, we provide two databases and comparison results of two major sequence processing techniques. Finally, we present a complete survey on recent state-of-the-art techniques for both HPR and HGR. We also present some possible applications of these techniques, applied to two-handed gesture interaction. We hope this research will open new directions in the field of hand posture and gesture recognition.
Pasquale Davide Schiavone, Elisa Donati