Concept: Statistical learning theory

Summary

Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. It deals with the statistical inference problem of finding a predictive function based on data, and it has led to successful applications in fields such as computer vision, speech recognition, and bioinformatics.
The goals of learning are understanding and prediction. Learning falls into many categories, including supervised learning, unsupervised learning, online learning, and reinforcement learning. From the perspective of statistical learning theory, supervised learning is best understood. Supervised learning involves learning from a training set of data. Every point in the training set is an input-output pair, where the input maps to an output. The learning problem consists of inferring the function that maps between the input and the output, such that the learned function can be used to predict the output from future input.
Depending on the type of output, supervised learning problems are either regression problems or classification problems. If the output takes a continuous range of values, it is a regression problem. Using Ohm's law as an example, a regression could be performed with voltage as input and current as output. The regression would find the functional relationship between voltage and current to be linear with slope $1/R$, such that $I = \frac{1}{R} V$, where $R$ is the resistance.
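The Ohm's law regression can be illustrated with a minimal sketch. It assumes a linear model through the origin (current = conductance × voltage) fitted by ordinary least squares; the measurements and the names `fit_through_origin` and `predict_current` are hypothetical, chosen only for illustration.

```python
# Hypothetical measurements: voltages (in volts) applied across a resistor
# of roughly 2 ohms, and the measured currents (in amperes) with made-up noise.
voltages = [1.0, 2.0, 3.0, 4.0, 5.0]
currents = [0.52, 0.98, 1.51, 2.03, 2.47]


def fit_through_origin(xs, ys):
    """Least-squares slope for the model y = slope * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx


# The fitted slope estimates the conductance 1/R.
conductance = fit_through_origin(voltages, currents)
resistance = 1.0 / conductance


def predict_current(voltage):
    """Predict the current for a new input voltage using the learned function."""
    return conductance * voltage


print(f"estimated resistance: {resistance:.3f} ohm")
print(f"predicted current at 6 V: {predict_current(6.0):.3f} A")
```

Dropping the intercept is a modelling assumption suggested by Ohm's law itself (zero voltage gives zero current); a general regression would also fit an intercept.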
Classification problems are those for which the output will be an element from a discrete set of labels. Classification is very common for machine learning applications. In facial recognition, for instance, a picture of a person's face would be the input, and the output label would be that person's name. The input would be represented by a large multidimensional vector whose elements represent pixels in the picture.
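As a toy illustration of classification, the sketch below uses a 1-nearest-neighbour rule, which is only one of many possible classifiers and is not named in the text. The four-element vectors stand in for the large pixel vectors described above, and the labels and the function name `classify` are hypothetical.

```python
import math

# Hypothetical training set: tiny 4-element "pixel" vectors, each paired
# with the name of the person it is assumed to depict.
training_set = [
    ([0.9, 0.8, 0.1, 0.2], "alice"),
    ([0.8, 0.9, 0.2, 0.1], "alice"),
    ([0.1, 0.2, 0.9, 0.8], "bob"),
    ([0.2, 0.1, 0.8, 0.9], "bob"),
]


def classify(x, examples):
    """Return the label of the training example closest to x (Euclidean distance)."""
    nearest = min(examples, key=lambda example: math.dist(x, example[0]))
    return nearest[1]


# The output is an element of the discrete label set {"alice", "bob"}.
print(classify([0.85, 0.85, 0.15, 0.15], training_set))
```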
After a function has been learned from the training set, it is validated on a test set, that is, on data that did not appear in the training set.

Neuroscience Reconstructed: Cell Biology

This course will provide the fundamental knowledge in neuroscience required to
understand how the brain is organised and how function at multiple scales is
integrated to give rise to cognition and beh

Simulation Neuroscience

Learn how to digitally reconstruct a single neuron to better study the biological mechanisms of brain function, behaviour and disease.

In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. These input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and test sets.
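The three-way division into training, validation, and test sets can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name `three_way_split` are assumptions made for illustration, not values taken from the text.

```python
import random


def three_way_split(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of data and cut it into training, validation, and test sets.

    Whatever remains after the training and validation portions becomes
    the test set.
    """
    items = list(data)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_train = round(train_frac * len(items))
    n_val = round(val_frac * len(items))
    train = items[:n_train]
    validation = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, validation, test


# Hypothetical data set of 100 example identifiers.
train, validation, test = three_way_split(range(100))
print(len(train), len(validation), len(test))
```

Shuffling before cutting avoids a split that merely reflects the order in which the data were collected; for time-ordered data a chronological split would usually be more appropriate.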

In mathematical modeling, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit to additional data or predict future observations reliably". An overfitted model is a mathematical model that contains more parameters than can be justified by the data. In a mathematical sense, these parameters represent the degree of a polynomial. The essence of overfitting is to have unknowingly extracted some of the residual variation (i.
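A small sketch can make the effect concrete. It assumes hypothetical data generated from y = 2x with made-up alternating noise on the training targets, and compares a two-parameter line with a degree-5 polynomial that passes through all six training points (built here by Lagrange interpolation, a choice made only to keep the example dependency-free).

```python
# Hypothetical data: the true relationship is y = 2x. Training targets carry
# made-up alternating noise of +/-0.5; test targets are noise-free.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.5, 1.5, 4.5, 5.5, 8.5, 9.5]
test_x = [0.5, 4.5]
test_y = [1.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least-squares line (2 parameters)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Polynomial of degree len(xs) - 1 through every training point (Lagrange form)."""
    def predict(x):
        total = 0.0
        for k, (xk, yk) in enumerate(zip(xs, ys)):
            basis = 1.0
            for j, xj in enumerate(xs):
                if j != k:
                    basis *= (x - xj) / (xk - xj)
            total += yk * basis
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared error of model on the points (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_x, train_y)
poly = fit_interpolant(train_x, train_y)

print("line: train MSE", mse(line, train_x, train_y), "test MSE", mse(line, test_x, test_y))
print("poly: train MSE", mse(poly, train_x, train_y), "test MSE", mse(poly, test_x, test_y))
```

On this data the six-parameter polynomial reproduces the training targets, noise included, yet predicts the held-out points worse than the simple line, which is the pattern the definition above describes.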

FIN-525: Financial big data

The course introduces modern methods to acquire, clean, and analyze large quantities of financial data efficiently. The second part expands on how to apply these techniques and robust statistics to fi

DH-406: Machine learning for DH

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and imple

MATH-412: Statistical machine learning

A course on statistical machine learning for supervised and unsupervised learning

Mathematics of Data: Models and Estimators

Covers the Mathematics of Data, focusing on models, estimators, and practical issues in data analysis.

Machine Learning Fundamentals: Overfitting and Regularization

Covers overfitting, regularization, and cross-validation in machine learning, exploring polynomial curve fitting, feature expansion, kernel functions, and model selection.

Neural Networks: Regularization & Optimization

Explores neural network regularization, optimization, and practical implementation tips.

Laser Powder Bed Fusion (LPBF) is an Additive Manufacturing (AM) process consolidating parts layer by layer, from a metallic powder bed. It allows no limitation in terms of geometry and is therefore of particular interest to various industries. Metallic LP ...

Andrea Wulzer, Alfredo Glioti, Siyu Chen

Extracting maximal information from experimental data requires access to the likelihood function, which however is never directly available for complex experiments like those performed at high energy colliders. Theoretical predictions are obtained in this ...

Control systems operating in real-world environments often face disturbances arising from measurement noise and model mismatch. These factors can significantly impact the performance and safety of the system. In this thesis, we aim to leverage data to de ...