
Probabilistic classification

Summary

In machine learning, a probabilistic classifier is a classifier that is able to predict, given an observation of an input, a probability distribution over a set of classes, rather than only outputting the most likely class that the observation should belong to. Probabilistic classifiers provide classification that can be useful in its own right or when combining classifiers into ensembles.

Types of classification

Formally, an "ordinary" classifier is some rule, or function, that assigns to a sample x a class label ŷ:
\hat{y} = f(x)
The samples come from some set X (e.g., the set of all documents, or the set of all images), while the class labels form a finite set Y defined prior to training.
Probabilistic classifiers generalize this notion of classifiers: instead of functions, they are conditional distributions \Pr(Y \vert X), meaning that for a given x \in X, they assign probabilities to all y \in Y (and these probabilities sum to one).
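The distinction can be sketched in a few lines of Python: an "ordinary" classifier returns a single label ŷ = f(x), while a probabilistic classifier returns a distribution over Y. The weights and class names below are toy assumptions for illustration, not a trained model:

```python
import math

# A toy probabilistic classifier over the classes Y = {"spam", "ham"}:
# it returns the conditional distribution Pr(Y | X = x).
def predict_proba(x):
    # illustrative linear scores (made-up weights, not learned from data)
    scores = {"spam": 2.0 * x - 1.0, "ham": -2.0 * x + 1.0}
    z = max(scores.values())
    exps = {y: math.exp(s - z) for y, s in scores.items()}  # numerically stable softmax
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}

# The corresponding "ordinary" classifier ŷ = f(x) just takes the mode
# of that distribution, discarding the rest of the probability mass.
def predict(x):
    probs = predict_proba(x)
    return max(probs, key=probs.get)
```

The probabilities returned by `predict_proba` always sum to one, so they form a valid distribution over Y; `predict` then recovers the hard classifier by taking the most likely class.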

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.


Related concepts (13)

Supervised learning

Supervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as a human-labeled supervisory signal) train a model.

Support vector machine

In machine learning, support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis.

Machine learning

Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines discover their own algorithms.

Related courses (6)

COM-406: Foundations of Data Science

We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas and techniques that come from probability, information theory as well as signal processing.

DH-406: Machine learning for DH

This course aims to introduce the basic principles of machine learning in the context of the digital humanities. We will cover both supervised and unsupervised learning techniques, and study and implement methods to analyze diverse data types, such as images, music and social network data.

MATH-467: Probabilistic methods in combinatorics

We develop a sophisticated framework for solving problems in discrete mathematics through the use of randomness (i.e., coin flipping). This includes constructing mathematical structures with unexpected (and sometimes paradoxical) properties for which no other methods of construction are known.

Related publications

Graphs offer a simple yet meaningful representation of relationships between data. This representation is often used in machine learning algorithms in order to incorporate structural or geometric information about data. However, it can also be used in an inverted fashion: instead of modelling data through graphs, we model graphs through data distributions. In this thesis, we explore several applications of this new modelling framework.

Starting with the graph learning problem, we exploit the probabilistic model of data given through graphs to propose a multi-graph learning method for structured data mixtures. We explore various relations that data can have with the underlying graphs through the notion of graph filters. We propose an algorithm to jointly cluster a set of data and learn a graph for each of the clusters. Experiments demonstrate promising performance in data clustering and multiple graph inference, and show desirable properties in terms of interpretability and proper handling of high dimensionality on synthetic and real data. The model has further been applied to fMRI data, where the method is used to successfully identify different functional brain networks and their activation times.

This probabilistic model of data defined through graphs can be very meaningful even when no data is available. Thus, in the second part of this thesis, we use such models to represent each graph through the probabilistic distribution of data, which varies smoothly on the graph. Optimal transport allows for a comparison of two such distributions, which in turn gives a structurally meaningful measure for graph comparison.

We follow by using this distance to formulate a new graph alignment problem based on the optimal transport framework, and propose an efficient stochastic algorithm based on Bayesian exploration to accommodate the non-convexity of the graph alignment problem. We demonstrate the performance of our novel framework on different tasks like graph alignment, graph classification and graph signal prediction, and we show that our method leads to significant improvement with respect to state-of-the-art algorithms.

Furthermore, we cast a new formulation for the one-to-many graph alignment problem, allowing for comparison of graphs of different sizes. The resulting alignment problem is solved with stochastic gradient descent, where a novel Dykstra operator ensures that the solution is a one-to-many (soft) assignment matrix. Experiments on graph alignment and graph classification problems show that our method for one-to-many alignment leads to meaningful improvements with respect to the state-of-the-art algorithms for each of these tasks.

Finally, we explore a family of probabilistic distributions for data based on graph filters. Distances defined through a graph filter give a high level of flexibility in choosing which graph properties we want to emphasize. In addition, in order to make the above graph alignment problem more scalable, we formulate an approximation to our filter Wasserstein graph distance that allows for the exploitation of faster algorithms, without grossly sacrificing the performance. We propose two algorithms, a simple one based on mirror gradient descent and another one built on its stochastic version, which offers a trade-off between speed and accuracy. Our experiments show the performance benefits of our novel stochastic algorithm, as well as the strong value of flexibility offered by filter-based distances.

Giovanni Chierchia, Mireille El Gheche, Pascal Frossard, Hermina Petric Maretic

We present a novel framework based on optimal transport for the challenging problem of comparing graphs. Specifically, we exploit the probabilistic distribution of smooth graph signals defined with respect to the graph topology. This allows us to derive an explicit expression of the Wasserstein distance between graph signal distributions in terms of the graph Laplacian matrices. This leads to a structurally meaningful measure for comparing graphs, which is able to take into account the global structure of graphs, while most other measures merely observe local changes independently. Our measure is then used for formulating a new graph alignment problem, whose objective is to estimate the permutation that minimizes the distance between two graphs. We further propose an efficient stochastic algorithm based on Bayesian exploration to accommodate the non-convexity of the graph alignment problem. We finally demonstrate the performance of our novel framework on different tasks like graph alignment, graph classification and graph signal prediction, and we show that our method leads to significant improvement with respect to state-of-the-art algorithms.
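Under the smooth-signal model, each graph is represented by a zero-mean Gaussian whose covariance is the pseudo-inverse of its Laplacian, and the Wasserstein distance between two such Gaussians has a closed form in the Laplacians. A minimal numpy sketch of that expression follows; the helper names are ours and this is an illustration of the closed form under the stated model, not the authors' released code:

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a symmetric PSD matrix via eigendecomposition."""
    w, v = np.linalg.eigh(m)
    w = np.clip(w, 0.0, None)  # clip tiny negative eigenvalues from round-off
    return (v * np.sqrt(w)) @ v.T

def laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def graph_w2(adj1, adj2):
    """2-Wasserstein distance between N(0, L1^+) and N(0, L2^+):
    W2^2 = tr(S1) + tr(S2) - 2 tr((S1^{1/2} S2 S1^{1/2})^{1/2})."""
    s1 = np.linalg.pinv(laplacian(adj1))
    s2 = np.linalg.pinv(laplacian(adj2))
    r = psd_sqrt(s1)
    cross = psd_sqrt(r @ s2 @ r)
    w2_sq = np.trace(s1) + np.trace(s2) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(w2_sq, 0.0)))  # guard against round-off
```

For example, comparing a 3-node path graph with a triangle yields a strictly positive distance, while the distance from any graph to itself is zero; because the measure depends on the full Laplacian spectrum, it reflects global structure rather than only local edge differences.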

Rachid Guerraoui, David Kozhaya, Yvonne Anne Pignolet-Oswald

To increase their dependability, distributed control systems (DCSs) need to agree in real time about which hosts have crashed, i.e., they need a real-time membership service. In this paper, we prove that such a service cannot be implemented deterministically if, besides host crashes, communication can also fail. We define implementable probabilistic variants of membership properties, which constitute what we call a synchronous membership service (SYMS). We present an algorithm, ViewSnoop, that implements SYMS with high probability. We implement, deploy and evaluate ViewSnoop analytically as well as experimentally, within an industrial DCS framework. We show that ViewSnoop significantly improves the dependability of DCSs compared to membership schemes based on classic heartbeats, at low additional cost. Moreover, ViewSnoop distinguishes, with high probability, host crashes from message losses, enabling DCSs to counteract losses better than existing approaches.