Publication: Wasserstein Distributionally Robust Learning

Abstract

Many decision problems in science, engineering, and economics are affected by uncertainty, which is typically modeled by a random variable governed by an unknown probability distribution. In many practical applications, this distribution is observable only through a set of training samples. In data-driven decision-making, the goal is to find a decision from the training samples that will perform equally well on unseen test samples. In this thesis, we leverage techniques from distributionally robust optimization to address decision-making problems in statistical learning, behavioral economics, and estimation. In particular, we study Wasserstein distributionally robust optimization, in which the decision-maker learns decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples.
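One way to write down the problem described above is sketched here. The notation is assumed for illustration and does not come from the abstract: $\ell(\theta, \xi)$ is the loss of decision $\theta$ under realization $\xi$, $\hat{\mathbb{P}}_N$ is the nominal distribution (taken here to be the empirical distribution on the $N$ training samples), $\varepsilon \ge 0$ is the radius of the ambiguity set, and $W$ is the type-1 Wasserstein distance with respect to some norm $\|\cdot\|$.

$$
\min_{\theta \in \Theta} \; \sup_{\mathbb{Q} \in \mathbb{B}_\varepsilon(\hat{\mathbb{P}}_N)} \; \mathbb{E}_{\mathbb{Q}}\big[\ell(\theta, \xi)\big],
\qquad
\mathbb{B}_\varepsilon(\hat{\mathbb{P}}_N) = \big\{ \mathbb{Q} : W(\mathbb{Q}, \hat{\mathbb{P}}_N) \le \varepsilon \big\}
$$

$$
W(\mathbb{Q}, \mathbb{P}) = \inf_{\pi \in \Pi(\mathbb{Q}, \mathbb{P})} \int \|\xi - \xi'\| \, \pi(\mathrm{d}\xi, \mathrm{d}\xi')
$$

Here $\Pi(\mathbb{Q}, \mathbb{P})$ denotes the set of joint distributions with marginals $\mathbb{Q}$ and $\mathbb{P}$. Setting $\varepsilon = 0$ shrinks the ambiguity set to the nominal distribution alone, which recovers the minimization of the expected loss under $\hat{\mathbb{P}}_N$.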

In the first part of the thesis we study regression and classification methods in supervised learning from the distributionally robust perspective. In the classical setting the goal is to minimize the empirical risk, that is, the expectation of some loss function quantifying the prediction error under the empirical distribution. When facing scarce training data, overfitting is typically mitigated by adding regularization terms to the objective that penalize hypothesis complexity. We introduce new regularization techniques using ideas from distributionally robust optimization, and we give new probabilistic interpretations to existing techniques.
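The link between robustness and regularization can be illustrated with a small sketch. It relies on a commonly cited special case, stated here as an assumption rather than as the thesis's exact result: for a linear classifier with hinge loss, when only the features may be perturbed (measured in an $\ell_p$ norm), the labels are left untouched, and the feature space is unrestricted, the worst-case expected loss over a Wasserstein ball of radius `radius` equals the empirical hinge loss plus `radius` times the dual norm of the weight vector. The function names below are hypothetical.

```python
import numpy as np


def dual_norm(w, p):
    """Norm dual to the l_p norm assumed to measure feature perturbations."""
    if p == 1:
        return np.max(np.abs(w))   # dual of l_1 is l_inf
    if p == np.inf:
        return np.sum(np.abs(w))   # dual of l_inf is l_1
    q = p / (p - 1.0)              # conjugate exponent, 1/p + 1/q = 1
    return np.sum(np.abs(w) ** q) ** (1.0 / q)


def robust_hinge_objective(w, X, y, radius, p=2):
    """Empirical hinge loss plus a dual-norm penalty.

    Under the assumptions stated in the lead-in, this is the worst-case
    expected hinge loss over the Wasserstein ball of the given radius.
    """
    margins = y * (X @ w)
    empirical_risk = np.mean(np.maximum(0.0, 1.0 - margins))
    return empirical_risk + radius * dual_norm(w, p)


# Toy data: two points that w classifies with margin exactly 1.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, -1.0])
print(robust_hinge_objective(w, X, y, radius=0.5, p=2))
```

In this reading, the radius of the ambiguity set plays the role of the regularization weight, and the choice of norm on the feature space determines which penalty appears (for example, $\ell_\infty$ perturbations lead to an $\ell_1$ penalty).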

In the second part of the thesis we consider data-driven inverse optimization problems where an observer aims to learn the preferences of an agent who solves a parametric optimization problem depending on an exogenous signal. Thus, the observer seeks the agent's objective function that best explains a historical sequence of signals and corresponding optimal actions. We focus here on situations where the observer has imperfect information, that is, where the agent's true objective function is not contained in the search space of candidate objectives, where the agent suffers from bounded rationality or implementation errors, or where the observed signal-response pairs are corrupted by measurement noise. We formalize this inverse optimization problem as a distributionally robust program minimizing the worst-case risk that the predicted decision (i.e., the decision implied by a particular candidate objective) differs from the agent's actual response to a random signal.
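A possible formalization of this setup is sketched below. All symbols are assumptions introduced for illustration: $s$ is the exogenous signal, $x$ the agent's observed response, $\mathbb{X}(s)$ the agent's feasible set, $F_\theta$ a candidate objective indexed by $\theta \in \Theta$, and $\ell_\theta$ a loss measuring how poorly $F_\theta$ explains the pair $(s, x)$. The suboptimality loss shown is one natural choice, not necessarily the one used in the thesis, and the expectation could be replaced by another risk measure.

$$
\min_{\theta \in \Theta} \; \sup_{\mathbb{Q} \in \mathbb{B}_\varepsilon(\hat{\mathbb{P}}_N)} \; \mathbb{E}_{\mathbb{Q}}\big[\ell_\theta(s, x)\big],
\qquad
\ell_\theta(s, x) = F_\theta(s, x) - \min_{z \in \mathbb{X}(s)} F_\theta(s, z)
$$

Here $\hat{\mathbb{P}}_N$ is the empirical distribution of the $N$ observed signal-response pairs and $\mathbb{B}_\varepsilon(\hat{\mathbb{P}}_N)$ is the Wasserstein ball of radius $\varepsilon$ around it. The loss is zero exactly when the observed response is optimal for the candidate objective, so a small worst-case value indicates that $F_\theta$ explains the observations even under perturbed data.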

In the final part of the thesis we study a distributionally robust mean square error estimation problem over a nonconvex Wasserstein ambiguity set containing only normal distributions. We show that the optimal estimator and the least favorable distribution form a Nash equilibrium. Despite the nonconvex nature of the ambiguity set, we prove that the estimation problem is equivalent to a tractable convex program. We further devise a Frank-Wolfe algorithm for this convex program whose direction-searching subproblem can be solved in a quasi-closed form. Using these ingredients, we introduce a distributionally robust Kalman filter that hedges against model risk.
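To make the role of the direction-searching subproblem concrete, here is a generic Frank-Wolfe sketch on a toy problem. It is not the thesis's algorithm or feasible set: the example minimizes $\tfrac{1}{2}\|x - b\|^2$ over the probability simplex, where the linear subproblem has a closed-form solution (the vertex with the smallest gradient entry). The function name, the step-size rule $2/(k+2)$, and the data are assumptions chosen for illustration.

```python
import numpy as np


def frank_wolfe_simplex(grad, dim, num_iters=5000):
    """Generic Frank-Wolfe over the probability simplex (toy illustration)."""
    x = np.zeros(dim)
    x[0] = 1.0                                # start at a vertex of the simplex
    for k in range(num_iters):
        g = grad(x)
        # Direction-searching subproblem: minimize <g, s> over the simplex.
        # Its solution is the vertex at the smallest gradient coordinate.
        s = np.zeros(dim)
        s[np.argmin(g)] = 1.0
        gamma = 2.0 / (k + 2.0)               # standard diminishing step size
        x = (1.0 - gamma) * x + gamma * s     # convex combination stays feasible
    return x


# Toy objective 0.5 * ||x - b||^2, whose gradient is x - b.
b = np.array([0.2, 0.3, 0.5])
x_star = frank_wolfe_simplex(lambda x: x - b, dim=3)
print(x_star)  # approaches b, which already lies in the simplex
```

Each iterate is a convex combination of feasible points, so no projection is needed. This is why the method is attractive whenever the linear subproblem is cheap to solve.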

Related concepts (46)

Related publications (368)

Related MOOCs (32)

Normal distribution

In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)$. The parameter $\mu$ is the mean or expectation of the distribution (and also its median and mode), while the parameter $\sigma$ is its standard deviation. The variance of the distribution is $\sigma^2$. A random variable with a Gaussian distribution is said to be normally distributed, and is called a normal deviate.

Decision-making

In psychology, decision-making (also spelled decision making and decisionmaking) is regarded as the cognitive process resulting in the selection of a belief or a course of action among several possible alternative options. It could be either rational or irrational. The decision-making process is a reasoning process based on assumptions of values, preferences and beliefs of the decision-maker. Every decision-making process produces a final choice, which may or may not prompt action.

Probability distribution

In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon in terms of its sample space and the probabilities of events (subsets of the sample space). For instance, if X is used to denote the outcome of a coin toss ("the experiment"), then the probability distribution of X would take the value 0.5 (1 in 2 or 1/2) for X = heads, and 0.5 for X = tails (assuming a fair coin).

Selected Topics on Discrete Choice

Discrete choice models are used extensively in many disciplines where it is important to predict human behavior at a disaggregate level. This course is a follow up of the online course “Introduction t

Neuronal Dynamics - Computational Neuroscience of Single Neurons

The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.

We propose a novel approach to evaluating the ionic Seebeck coefficient in electrolytes from relatively short equilibrium molecular dynamics simulations, based on the Green-Kubo theory of linear response and Bayesian regression analysis. By exploiting the ...

Decision-making permeates every aspect of human and societal development, from individuals' daily choices to the complex decisions made by communities and institutions. Central to effective decision-making is the discipline of optimization, which seeks the ...

As large, data-driven artificial intelligence models become ubiquitous, guaranteeing high data quality is imperative for constructing models. Crowdsourcing, community sensing, and data filtering have long been the standard approaches to guaranteeing or imp ...