Summary
In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation should belong to. Probabilistic classifiers provide classification that can be useful in its own right or when combining classifiers into ensembles.

Formally, an "ordinary" classifier is some rule, or function, that assigns to a sample x a class label ŷ = f(x). The samples come from some set X (e.g., the set of all documents, or the set of all images), while the class labels form a finite set Y defined prior to training. Probabilistic classifiers generalize this notion of classifiers: instead of functions, they are conditional distributions Pr(Y | X), meaning that for a given x ∈ X, they assign probabilities to all y ∈ Y (and these probabilities sum to one). "Hard" classification can then be done using the optimal decision rule ŷ = arg max_y Pr(Y = y | X) or, in English, the predicted class is the one with the highest probability.

Binary probabilistic classifiers are also called binary regression models in statistics. In econometrics, probabilistic classification in general is called discrete choice.

Some classification models, such as naive Bayes, logistic regression and multilayer perceptrons (when trained under an appropriate loss function), are naturally probabilistic. Other models, such as support vector machines, are not, but methods exist to turn them into probabilistic classifiers.

Some models, such as logistic regression, are conditionally trained: they optimize the conditional probability Pr(Y | X) directly on a training set (see empirical risk minimization). Other classifiers, such as naive Bayes, are trained generatively: at training time, the class-conditional distribution Pr(X | Y) and the class prior Pr(Y) are found, and the conditional distribution Pr(Y | X) is derived using Bayes' rule. Not all classification models are naturally probabilistic, and some that are, notably naive Bayes classifiers, decision trees and boosting methods, produce distorted class probability distributions.
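To make the distinction concrete, the sketch below contrasts the probabilistic output Pr(Y | X) with hard classification via the arg max decision rule. It uses scikit-learn's LogisticRegression (a conditionally trained, naturally probabilistic model) on a synthetic dataset; the data and the choice of model are illustrative assumptions, not prescribed by the text above.

```python
# A minimal sketch of probabilistic vs. "hard" classification.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy data (an assumption for this example): samples x from X = R^20,
# labels from a finite set Y = {0, 1, 2} fixed before training.
X, y = make_classification(
    n_samples=200, n_features=20, n_informative=5,
    n_classes=3, random_state=0,
)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Probabilistic output: an estimate of Pr(Y | X) for each class;
# each row is a distribution over Y and sums to one.
proba = clf.predict_proba(X[:5])
print(proba)              # shape (5, 3)
print(proba.sum(axis=1))  # each row sums to 1.0

# "Hard" classification via the optimal decision rule:
# y_hat = arg max_y Pr(Y = y | X).
y_hat = clf.classes_[np.argmax(proba, axis=1)]
print(y_hat)              # agrees with clf.predict(X[:5])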
Related courses (20)
MGT-448: Statistical inference and machine learning
This course aims to provide graduate students with a thorough grounding in the methods, theory, mathematics and algorithms needed to do research and applications in machine learning. The course covers topics…
EE-566: Adaptation and learning
In this course, students learn to design and master algorithms and core concepts related to inference and learning from data and the foundations of adaptation and learning theories with applications.
MGT-302: Data driven business analytics
This course focuses on methods and algorithms needed to apply machine learning, with an emphasis on applications in business analytics.
Related lectures (48)
Statistical Modeling
Covers exercises on statistical modeling, including Gibbs Ising, GCM pruning, fairness, and simplification in teacher-student models.
SVM and Multiclass Classification
Covers SVM and multiclass classification using one-vs-all and one-vs-one approaches.
Document Classification: Overview
Explores document classification methods, including k-Nearest-Neighbors, Naïve Bayes Classifier, transformer models, and multi-head attention.
Related concepts (14)
Multiclass classification
In machine learning and statistical classification, multiclass classification or multinomial classification is the problem of classifying instances into one of three or more classes (classifying instances into one of two classes is called binary classification). While many classification algorithms (notably multinomial logistic regression) naturally permit the use of more than two classes, some are by nature binary algorithms; these can, however, be turned into multinomial classifiers by a variety of strategies.
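As an illustration of one such strategy, the sketch below implements a simple one-vs-rest reduction by hand: one binary logistic regression per class, with the per-class scores renormalized into a distribution over classes. The dataset and the choice of base classifier are assumptions for the example; scikit-learn's OneVsRestClassifier provides the same strategy as a ready-made wrapper.

```python
# A minimal sketch of the one-vs-rest strategy: turn a binary
# probabilistic classifier into a multiclass one.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=5, n_classes=3,
                           random_state=1)

classes = np.unique(y)
models = {}
for c in classes:
    # Binary subproblem: class c vs. everything else.
    models[c] = LogisticRegression(max_iter=1000).fit(
        X, (y == c).astype(int))

# Stack each model's "positive" probability and renormalize so the
# rows form an (approximate) distribution over the three classes.
scores = np.column_stack(
    [models[c].predict_proba(X)[:, 1] for c in classes])
proba = scores / scores.sum(axis=1, keepdims=True)
y_hat = classes[np.argmax(proba, axis=1)]
print((y_hat == y).mean())  # training accuracy of the combined model
```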
Bias–variance tradeoff
In statistics and machine learning, the bias–variance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters. The bias–variance dilemma or bias–variance problem is the conflict in trying to simultaneously minimize these two sources of error that prevent supervised learning algorithms from generalizing beyond their training set. The bias error is an error from erroneous assumptions in the learning algorithm; the variance is an error from sensitivity to small fluctuations in the training set.
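A small simulation can make the tradeoff visible. The setup below (a sine-plus-noise regression problem, polynomial fits of two degrees) is an assumed toy example: across many resampled training sets, the low-degree model shows higher squared bias but lower variance at a fixed test point than the high-degree model.

```python
# An illustrative simulation of the bias–variance tradeoff
# (assumed toy setup, not taken from the text above).
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x)  # true function
x_test = 0.25                        # point at which we evaluate

def fit_predict(degree):
    """Fit a polynomial of the given degree to one noisy sample."""
    x = rng.uniform(0, 1, 25)
    y = f(x) + rng.normal(0, 0.3, 25)
    coeffs = np.polyfit(x, y, degree)
    return np.polyval(coeffs, x_test)

# A degree-1 fit is heavily constrained (high bias, low variance);
# a degree-9 fit is flexible (low bias, high variance).
for degree in (1, 9):
    preds = np.array([fit_predict(degree) for _ in range(2000)])
    bias = preds.mean() - f(x_test)
    print(f"degree {degree}: bias^2 = {bias**2:.4f}, "
          f"variance = {preds.var():.4f}")
```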
Platt scaling
In machine learning, Platt scaling or Platt calibration is a way of transforming the outputs of a classification model into a probability distribution over classes. The method was invented by John Platt in the context of support vector machines, replacing an earlier method by Vapnik, but can be applied to other classification models. Platt scaling works by fitting a logistic regression model to a classifier's scores. Consider the problem of binary classification: for inputs x, we want to determine whether they belong to one of two classes, arbitrarily labeled +1 and −1.
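A minimal sketch of the idea, under assumed toy data: train a non-probabilistic LinearSVC, then fit a one-feature logistic regression to its held-out decision scores, which estimates the sigmoid Pr(y = 1 | f(x)) = 1 / (1 + exp(A·f(x) + B)) that Platt scaling uses. In practice, scikit-learn's CalibratedClassifierCV with method="sigmoid" packages this procedure together with cross-validation.

```python
# A minimal sketch of Platt scaling for a non-probabilistic classifier.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, random_state=2)
# Hold out a calibration set, as Platt recommends, so the sigmoid is
# not fit on the same data that trained the SVM.
X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, random_state=2)

svm = LinearSVC().fit(X_fit, y_fit)  # outputs scores, not probabilities

# Platt scaling: a one-dimensional logistic regression on the raw
# decision scores, i.e. Pr(y = 1 | s) = 1 / (1 + exp(A*s + B)).
scores = svm.decision_function(X_cal).reshape(-1, 1)
platt = LogisticRegression().fit(scores, y_cal)

# Calibrated probability estimates for a few calibration points.
print(platt.predict_proba(scores)[:5, 1])
```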