
# Sample complexity

## Summary

The sample complexity of a machine learning algorithm represents the number of training samples that it needs in order to successfully learn a target function.
More precisely, the sample complexity is the number of training samples that we must supply to the algorithm so that the function it returns is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1.
There are two variants of sample complexity:

- The weak variant fixes a particular input-output distribution;
- The strong variant takes the worst-case sample complexity over all input-output distributions.
The No free lunch theorem, discussed below, proves that, in general, the strong sample complexity is infinite, i.e. that there is no algorithm that can learn the globally-optimal target function using a finite number of training samples.
However, if we are only interested in a particular class of target functions (e.g., only linear functions), then the sample complexity is finite, and it depends linearly on the VC dimension of the class of target functions.
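For concreteness, the standard $(\epsilon, \delta)$-PAC bounds (a well-known result that this page only alludes to; exact constants vary across sources) make the linear dependence on the VC dimension $d$ explicit:

```latex
% Realizable case (some hypothesis in the class attains zero risk):
n(\epsilon, \delta) = O\!\left(\frac{d \log(1/\epsilon) + \log(1/\delta)}{\epsilon}\right)

% Agnostic case (no such assumption):
n(\epsilon, \delta) = O\!\left(\frac{d + \log(1/\delta)}{\epsilon^2}\right)
```

Here $\epsilon$ is the excess-risk tolerance and $1 - \delta$ the required confidence; both bounds are linear in $d$.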
## Definitions

Let $X$ be a space which we call the input space, and $Y$ be a space which we call the output space, and let $Z$ denote the product $X \times Y$. For example, in the setting of binary classification, $X$ is typically a finite-dimensional vector space and $Y$ is the set $\{-1, 1\}$.
Fix a hypothesis space $\mathcal{H}$ of functions $h \colon X \to Y$. A learning algorithm over $\mathcal{H}$ is a computable map from $Z^*$ to $\mathcal{H}$. In other words, it is an algorithm that takes as input a finite sequence of training samples and outputs a function from $X$ to $Y$. Typical learning algorithms include empirical risk minimization, without or with Tikhonov regularization.
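As an illustrative sketch (the function names and the toy threshold class below are this note's own choices, not part of the definition), empirical risk minimization over a finite hypothesis class simply returns the hypothesis with the smallest average loss on the sample:

```python
def empirical_risk(h, samples, loss):
    """Average loss of hypothesis h on a finite sample of (x, y) pairs."""
    return sum(loss(h(x), y) for x, y in samples) / len(samples)

def erm(hypotheses, samples, loss):
    """Empirical risk minimization over a finite hypothesis class."""
    return min(hypotheses, key=lambda h: empirical_risk(h, samples, loss))

# Toy setup: threshold classifiers on the real line, square loss.
thresholds = [t / 10 for t in range(-10, 11)]
hypotheses = [lambda x, t=t: 1 if x >= t else -1 for t in thresholds]
samples = [(-0.8, -1), (-0.3, -1), (0.2, 1), (0.7, 1)]
square_loss = lambda y_pred, y: (y_pred - y) ** 2

h_hat = erm(hypotheses, samples, square_loss)
```

Since the toy sample is separable by a threshold, the returned `h_hat` attains zero empirical risk; with Tikhonov regularization one would instead minimize the empirical risk plus a penalty term.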
Fix a loss function $L \colon Y \times Y \to \mathbb{R}_{\geq 0}$, for example, the square loss $L(y, y') = (y - y')^2$, where $y' = h(x)$. For a given distribution $\rho$ on $X \times Y$, the expected risk of a hypothesis (a function $h \colon X \to Y$) is

$$\mathcal{E}(h) := \mathbb{E}_\rho[L(h(x), y)] = \int_{X \times Y} L(h(x), y) \, d\rho(x, y).$$
In our setting, we have $h = \mathcal{A}(S_n)$, where $\mathcal{A}$ is a learning algorithm and $S_n = ((x_1, y_1), \ldots, (x_n, y_n))$ is a sequence of vectors which are all drawn independently from $\rho$. Define the optimal risk

$$\mathcal{E}^*_{\mathcal{H}} := \inf_{h \in \mathcal{H}} \mathcal{E}(h).$$

Set $h_n = \mathcal{A}(S_n)$ for each $n$. Note that $h_n$ is a random variable and depends on the random variable $S_n$, which is drawn from the distribution $\rho^n$.
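A small simulation makes the randomness of the risk of $h_n$ concrete. Everything below (the toy distribution, the threshold hypothesis class, and the closed-form risk) is this sketch's own construction, not something specified by the definitions above: redrawing $S_n$ changes the learned hypothesis and hence its expected risk.

```python
import random

rng = random.Random(0)

def draw_sample(n):
    """n i.i.d. draws from a toy rho: x ~ Uniform(-1, 1),
    y = sign(x), with the label flipped with probability 0.1."""
    out = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        y = 1 if x >= 0 else -1
        if rng.random() < 0.1:
            y = -y
        out.append((x, y))
    return out

def erm_threshold(sample):
    """The learning algorithm A(S_n): ERM over threshold
    classifiers h_t(x) = sign(x - t) on a finite grid."""
    grid = [t / 20 for t in range(-20, 21)]
    def emp_risk(t):
        return sum((1 if x >= t else -1) != y for x, y in sample) / len(sample)
    return min(grid, key=emp_risk)

def true_risk(t):
    """Exact expected 0-1 risk of h_t under this toy rho.
    The clean-label mass misclassified by h_t is |t|/2; accounting for
    the 10% label noise gives 0.1 + 0.4 * |t| (minimum 0.1 at t = 0)."""
    return 0.1 + 0.4 * abs(t)

# The risk of h_n is itself random: each repetition draws a fresh S_n,
# so the learned threshold, and hence its expected risk, varies.
risks = [true_risk(erm_threshold(draw_sample(50))) for _ in range(5)]
```

The optimal risk over this class is $0.1$ (the Bayes risk, attained at $t = 0$); with $n = 50$ samples the learned threshold typically lands close to it but fluctuates from draw to draw.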

