Statistical pattern recognition occupies a central place among machine learning techniques, as it provides both the theoretical insights and the practical means for solving a variety of problems, ranging from character recognition and face recognition to malignant nodule classification and speaker identification, to name just a few application domains. Within this general framework, the work presented in this thesis focuses on two theoretical aspects, feature extraction and pattern classification, and a more practical one, human face class modeling.

We first propose a new framework for using autocorrelation-based features in pattern classification. In this context, we extend their use to higher orders by exploiting a property of the inner product of two autocorrelation feature vectors. Since a large number of classification algorithms can be cast into a form in which the data appear only through inner products between vectors, this property provides a very convenient way of avoiding the explicit computation of the feature vectors, an operation that is normally very costly. Using this approach, we show that, in practice, there is no theoretical upper limit on the order of the autocorrelation functions used; the practical limit is dictated by the application. By defining different topologies over which the autocorrelation feature vectors are applied, we obtain extended feature vectors for which the inner product property still holds. We illustrate the use of autocorrelation feature vectors by first showing how Principal Component Analysis can be performed in the autocorrelation space, and we apply it to the classification of one-dimensional signals. A second application uses Support Vector Machines (SVM) to detect human faces described in terms of extended autocorrelation feature vectors.

Next, we develop a family of algorithms that generate sparse representations of the discriminant functions.
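To make the inner-product shortcut concrete, the sketch below shows one common form it takes for one-dimensional signals with circular shifts: the inner product of two order-n autocorrelation feature vectors collapses to a sum of powers of the circular cross-correlation, so the exponentially large feature vectors are never built. This is a minimal illustration under those assumptions (circular boundary, full shift range); the function name and exact normalization are illustrative, not the thesis's own implementation.

```python
import numpy as np

def autocorr_kernel(x, y, order):
    """Inner product of the order-`order` circular autocorrelation
    feature vectors of signals x and y, computed WITHOUT building
    the feature vectors themselves.

    For r_x(t1,...,tn) = sum_i x(i) x(i+t1) ... x(i+tn) (indices mod N),
    one can show  <r_x, r_y> = sum_d C(d)**(order + 1),
    where C(d) = sum_s x(s) y((s + d) mod N) is the circular
    cross-correlation of x and y.
    """
    n = len(x)
    # np.roll(y, -d) aligns y((s + d) mod N) with x(s)
    c = np.array([np.dot(x, np.roll(y, -d)) for d in range(n)])
    return float(np.sum(c ** (order + 1)))
```

The cost is O(N^2) here (or O(N log N) with an FFT-based cross-correlation), independent of the order, whereas the explicit order-n feature vector has O(N^n) entries.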
This family is based on a recently proposed algorithm, the Kernel Matching Pursuit (KMP) [176], which we considerably extend and enhance. Taking the theory of n-term greedy function approximation as a starting point, we justify the introduction of several variants of KMP. First, we introduce Weak KMP, which uses a learning-rate-like parameter to control the weight given to the decisions taken at each iteration. We then propose another variant, Stochastic KMP, designed to make the application of KMP to large datasets feasible. Finally, two approaches for building adaptive versions of KMP are presented and compared. All of these algorithms are tested on various standard datasets, and their performance is compared to that of SVM.

On the application side, we discuss various aspects of human face detection. In this context, we introduce a new method for estimating the performance of face detection and localization algorithms, designed to make the results comparable in an objective way. At the core of the technique is the precise
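For orientation, the basic (non-back-fitting) matching-pursuit loop underlying KMP can be sketched as follows: at each iteration, pick the dictionary function (here, a column of the kernel matrix evaluated on the training set) most correlated with the current residual, and add it with the optimal least-squares coefficient. This is a simplified sketch, not the thesis's implementation; the Weak KMP variant mentioned above would scale each step by a learning-rate-like factor.

```python
import numpy as np

def kernel_matching_pursuit(K, y, n_terms):
    """Greedy sparse approximation of targets y by columns of K.

    K : (n_samples, n_dict) matrix of dictionary functions evaluated
        on the training set (e.g. kernel values k(x_i, x_j)).
    Returns the selected column indices and their coefficients.
    """
    residual = y.astype(float).copy()
    norms = np.linalg.norm(K, axis=0)
    norms = np.maximum(norms, 1e-12)          # guard against zero columns
    selected, coefs = [], []
    for _ in range(n_terms):
        # normalized correlation of each dictionary column with the residual
        scores = (K.T @ residual) / norms
        i = int(np.argmax(np.abs(scores)))
        # optimal least-squares coefficient for the chosen column
        alpha = float(K[:, i] @ residual) / float(norms[i] ** 2)
        residual -= alpha * K[:, i]           # Weak KMP would use lam * alpha
        selected.append(i)
        coefs.append(alpha)
    return selected, coefs
```

Stopping after `n_terms` iterations directly controls the sparsity of the resulting discriminant function, in contrast to SVM, where the number of support vectors is a by-product of training.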