The sample complexity of a machine learning algorithm represents the number of training-samples that it needs in order to successfully learn a target function. More precisely, the sample complexity is the number of training-samples that we need to supply to the algorithm, so that the function returned by the algorithm is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1. There are two variants of sample complexity: The weak variant fixes a particular input-output distribution; The strong variant takes the worst-case sample complexity over all input-output distributions. The No free lunch theorem, discussed below, proves that, in general, the strong sample complexity is infinite, i.e. that there is no algorithm that can learn the globally-optimal target function using a finite number of training samples. However, if we are only interested in a particular class of target functions (e.g, only linear functions) then the sample complexity is finite, and it depends linearly on the VC dimension on the class of target functions. Let be a space which we call the input space, and be a space which we call the output space, and let denote the product . For example, in the setting of binary classification, is typically a finite-dimensional vector space and is the set . Fix a hypothesis space of functions . A learning algorithm over is a computable map from to . In other words, it is an algorithm that takes as input a finite sequence of training samples and outputs a function from to . Typical learning algorithms include empirical risk minimization, without or with Tikhonov regularization. Fix a loss function , for example, the square loss , where . For a given distribution on , the expected risk of a hypothesis (a function) is In our setting, we have , where is a learning algorithm and is a sequence of vectors which are all drawn independently from . Define the optimal riskSet , for each . Note that is a random variable and depends on the random variable , which is drawn from the distribution .

À propos de ce résultat
Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Graph Chatbot

Chattez avec Graph Search

Posez n’importe quelle question sur les cours, conférences, exercices, recherches, actualités, etc. de l’EPFL ou essayez les exemples de questions ci-dessous.

AVERTISSEMENT : Le chatbot Graph n'est pas programmé pour fournir des réponses explicites ou catégoriques à vos questions. Il transforme plutôt vos questions en demandes API qui sont distribuées aux différents services informatiques officiellement administrés par l'EPFL. Son but est uniquement de collecter et de recommander des références pertinentes à des contenus que vous pouvez explorer pour vous aider à répondre à vos questions.