**Are you an EPFL student looking for a semester project?**

Work with us on data science and visualisation projects, and deploy your project as an app on top of GraphSearch.

Publication# Metrizing Fairness

Abstract

We study supervised learning problems for predicting properties of individuals who belong to one of two demographic groups, and we seek predictors that are fair according to statistical parity. This means that the distributions of the predictions within the two groups should be close with respect to the Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators if unfairness is measured by the squared L2-distance or by a squared maximum mean discrepancy. In this case the fair learning problem is susceptible to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs—sometimes orders of magnitude faster. Finally, we identify conditions under which statistical parity can improve prediction accuracy.

Official source

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.

Related concepts

Loading

Related publications

Loading

Related concepts (17)

Algorithm

In mathematics and computer science, an algorithm (ˈælɡərɪðəm) is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algo

Stochastic gradient descent

Stochastic gradient descent (often abbreviated SGD) is an iterative method for optimizing an objective function with suitable smoothness properties (e.g. differentiable or subdifferentiable). It can b

Loss function

In mathematical optimization and decision theory, a loss function or cost function (sometimes also called an error function) is a function that maps an event or values of one or more variables onto

Related publications (33)

Loading

Loading

Loading

Finding convergence rates for numerical optimization algorithms is an important task, because it gives a justification to their use in solving practical problems, while also providing a way to compare their efficiency. This is especially useful in an asynchronous environment, because the algorithms are often proven to be more efficient than their synchronous counterparts by experience, but they lack the theory that justifies this property. Furthermore, analyzing the various issues that can arise when inconsistency is taken into consideration, it is possible to obtain a clearer picture of the downsides of inexact implementations one should be aware of. This work tries to address the problem of finding an expected convergence rate for the asynchronous version of the widely popular stochastic gradient descent algorithm, applied to the common class of problems that present a cost function with a sum structure. It follows a similar approach to the one suggested by R. Leblond, F. Pedregosa and S. Lacoste-Julien in "ASAGA: Asynchronous Parallel SAGA" (2016) [RLLJ16], also borrowing their formalization of asynchronicity. The main achievement of this work is a bound on the constant step size that guarantees convergence in expectation of the algorithm. The relative convergence rate is also obtained. The result is also partially validated by sequential models of an asynchronous environment. We hope that this can be a basis for future applications of the same approach to more specific algorithms and that numerical experiments on real multiprocessor architecture can be performed in the future to further validate the convergence rates.

2017We have developed a new derivative-free algorithm based on Radial Basis Functions (RBFs). Derivative-free optimization is an active field of research and several algorithms have been proposed recently. Problems of this nature in the industrial setting are quite frequent. The reason is that in a number of applications the optimization process contains simulation packages which are treated as black boxes. The development of our own algorithm was originally motivated by an application in biomedical imaging: the medical image registration problem. The particular characteristics of this problem have incited us to develop a new optimization algorithm based on trust-region methods. However it has been designed to be generic and to be applied to a wide range of problems. The main originality of our approach is the use of RBFs to build the models. In particular we have adapted the existing theory based on quadratic models to our own models and developed new procedures especially designed for models based on RBFs. We have tested our algorithm called BOOSTERS against state-of-the-art methods (UOBYQA, NEWUOA, DFO). On the medical image registration problem, BOOSTERS appears to be the method of choice. The tests on problems from the CUTEr collection show that BOOSTERS is comparable to, but not better than other methods on small problems (size 2-20). It is performing very well for medium size problems (20-80). Moreover, it is able to solve problems of dimension 200, which is considered very large in derivative-free optimization. We have also developed a new class of algorithms combining the robustness of derivative-free algorithms with the faster rate of convergence characterizing Newtonlike-methods. In fact, they define a new class of algorithms lying between derivative-free optimization and quasi-Newton methods. These algorithms are built on the skeleton of our derivative-free algorithm but they can incorporate the gradient when it is available. They can be interpreted as a way of doping derivative-free algorithms with derivatives. If the derivatives are available at each iteration, then our method can be seen as an alternative to quasi-Newton methods. At the opposite, if the derivatives are never evaluated, then the algorithm is totally similar to BOOSTERS. It is a very interesting alternative to existing methods for problems whose objective function is expensive to evaluate and when the derivatives are not available. In this situation, the gradient can be approximated by finite differences and its costs corresponds to n additional function evaluations assuming that Rn is the domain of definition of the objective function. We have compared our method with CFSQP and BTRA, two gradient-based algorithms, and the results show that our doped method performs best. We have also a theoretical analysis of the medical image registration problem based on maximization of mutual information. Most of the current research in this field is concentrated on registration based on nonlinear image transformation. However, little attention has been paid to the theoretical properties of the optimization problem. In our analysis, we focus on the continuity and the differentiability of the objective function. We show in particular that performing a registration without extension of the reference image may lead to discontinuities in the objective function. But we demonstrate that, under some mild assumptions, the function is differentiable almost everywhere. Our analysis is important from an optimization point of view and conditions the choice of a solver. The usual practice is to use generic optimization packages without worrying about the differentiability of the objective function. But the use of gradient-based methods when the objective function is not differentiable may result in poor performance or even in absence of convergence. One of our objectives with this analysis is also that practitioners become aware of these problems and to propose them new algorithms having a potential interest for their applications.

Physically based rendering methods can create photorealistic images by simulating the propagation and interaction of light in a virtual scene. Given a scene description including the shape of objects, participating media, material properties, etc., the simulation computes an image representing the radiance reaching the sensor.This thesis, however, pursues the corresponding inverse problem: given observations such as pictures of a scene, we want to recover a plausible description of its components. Unfortunately, the rendering process is typically far too complex to invert analytically. We therefore turn to iterative gradient-based approximations of the inverse, that require efficiently estimating gradients of an objective function with respect to the scene parameters of interest.The first part of this thesis is dedicated to algorithms for gradient estimation. We consider the usage of automatic differentiation and examine the associated tradeoffs. Next, we introduce radiative backpropagation, an adjoint method that casts the gradient estimation problem into a modified light transport problem, unlocking vastly more efficient implementations. For the case of participating media, we propose differential ratio tracking, a sampling technique that addresses the bias and high variance found in existing gradient estimators.In the second part, we focus on the design of systems to support effective differentiable rendering research, including the efficient implementation of the algorithms above. We describe the architecture and features of Mitsuba 2, an open-source retargetable physically based renderer. Mitsuba 2 supports different representations of colors (RGB, spectral, polarized), computing platforms (scalar, CPU vectorized, GPU), and numerical precision, within a single codebase. Importantly, automatic differentiation can be applied throughout the system. Then, we extend this design by applying an automatic conversion from wavefront-style rendering to a megakernel-based approach, leveraging just-in-time compilation. We obtain a fast, flexible and memory-efficient framework for primal and differentiable physically based rendering.Finally, the third part showcases three applications of differentiable physically based rendering: caustic design, inverse volume rendering, and material & lighting estimation in real indoor scenes. In all cases, special care is taken to avoid sub-optimal local minima due to the ambiguous and non-convex nature of the reconstruction problems.