Metrizing Fairness

Daniel Kuhn, Yves Rychener, Bahar Taskesen
2022
Journal paper

Abstract

We study supervised learning problems for predicting properties of individuals who belong to one of two demographic groups, and we seek predictors that are fair according to statistical parity. This means that the distributions of the predictions within the two groups should be close with respect to the Kolmogorov distance, and fairness is achieved by penalizing the dissimilarity of these two distributions in the objective function of the learning problem. In this paper, we showcase conceptual and computational benefits of measuring unfairness with integral probability metrics (IPMs) other than the Kolmogorov distance. Conceptually, we show that the generator of any IPM can be interpreted as a family of utility functions and that unfairness with respect to this IPM arises if individuals in the two demographic groups have diverging expected utilities. We also prove that the unfairness-regularized prediction loss admits unbiased gradient estimators if unfairness is measured by the squared L2-distance or by a squared maximum mean discrepancy. In this case the fair learning problem is amenable to efficient stochastic gradient descent (SGD) algorithms. Numerical experiments on real data show that these SGD algorithms outperform state-of-the-art methods for fair learning in that they achieve superior accuracy-unfairness trade-offs, sometimes orders of magnitude faster. Finally, we identify conditions under which statistical parity can improve prediction accuracy.
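
To make the squared maximum mean discrepancy mentioned in the abstract concrete, the sketch below computes a standard unbiased (U-statistic) estimate of it between the predictions for two groups. It is only an illustration under stated assumptions: the Gaussian kernel, the `bandwidth` parameter, and the function names `gaussian_kernel` and `mmd2_unbiased` are choices made here and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between 1-D arrays a (n,) and b (m,).

    The kernel and its bandwidth are illustrative assumptions.
    """
    diff = a[:, None] - b[None, :]
    return np.exp(-diff ** 2 / (2.0 * bandwidth ** 2))


def mmd2_unbiased(pred_a, pred_b, bandwidth=1.0):
    """Unbiased U-statistic estimate of the squared MMD between the
    predictions of group A and group B (hypothetical helper)."""
    pred_a = np.asarray(pred_a, dtype=float)
    pred_b = np.asarray(pred_b, dtype=float)
    n, m = len(pred_a), len(pred_b)
    if n < 2 or m < 2:
        raise ValueError("each group needs at least two predictions")

    k_aa = gaussian_kernel(pred_a, pred_a, bandwidth)
    k_bb = gaussian_kernel(pred_b, pred_b, bandwidth)
    k_ab = gaussian_kernel(pred_a, pred_b, bandwidth)

    # Within-group terms exclude the diagonal (pairs i == j),
    # which is what removes the bias of the plug-in estimate.
    term_aa = (k_aa.sum() - np.trace(k_aa)) / (n * (n - 1))
    term_bb = (k_bb.sum() - np.trace(k_bb)) / (m * (m - 1))
    term_ab = k_ab.mean()
    return term_aa + term_bb - 2.0 * term_ab


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    same = mmd2_unbiased(rng.normal(0, 1, 500), rng.normal(0, 1, 500))
    shifted = mmd2_unbiased(rng.normal(0, 1, 500), rng.normal(2, 1, 500))
    print(f"same distribution: {same:.4f}, shifted distribution: {shifted:.4f}")
```

Because the estimate is unbiased, it can be slightly negative when the two groups have nearly identical prediction distributions. In a minibatch setting, one plausible use is to add this quantity, scaled by a regularization weight, to the prediction loss computed on the same batch.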

Official source

https://infoscience.epfl.ch/record/294294?ln=en

Topics in statistical physics of high-dimensional machine learning

Understanding generalization and robustness in modern deep learning

Communication-efficient distributed training of machine learning models