Distribution Inference Risks: Identifying and Mitigating Sources of Leakage

Robert West, Maxime Jean Julien Peyrard, Valentin Hartmann, Léo Nicolas René Meynent
2023
Article de conférence

Résumé

A large body of work shows that machine learning (ML) models can leak sensitive or confidential information about their training data. Recently, leakage due to distribution inference (or property inference) attacks is gaining attention. In this attack, the goal of an adversary is to infer distributional information about the training data. So far, research on distribution inference has focused on demonstrating successful attacks, with little attention given to identifying the potential causes of the leakage and to proposing mitigations. To bridge this gap, as our main contribution, we theoretically and empirically analyze the sources of information leakage that allows an adversary to perpetrate distribution inference attacks. We identify three sources of leakage: (1) memorizing specific information about the E[Y | X] (expected label given the feature values) of interest to the adversary, (2) wrong inductive bias of the model, and (3) finiteness of the training data. Next, based on our analysis, we propose principled mitigation techniques against distribution inference attacks. Specifically, we demonstrate that causal learning techniques are more resilient to a particular type of distribution inference risk termed distributional membership inference than associative learning methods. And lastly, we present a formalization of distribution inference that allows for reasoning about more general adversaries than was previously possible.

Source officielle

https://infoscience.epfl.ch/record/303905?ln=fr

À propos de ce résultat

Cette page est générée automatiquement et peut contenir des informations qui ne sont pas correctes, complètes, à jour ou pertinentes par rapport à votre recherche. Il en va de même pour toutes les autres pages de ce site. Veillez à vérifier les informations auprès des sources officielles de l'EPFL.

Distribution Inference Risks: Identifying and Mitigating Sources of Leakage

Graph Chatbot

Chattez avec Graph Search

Topics in statistical physics of high-dimensional machine learning

Robust machine learning for neuroscientific inference

Few-shot Learning for Efficient and Effective Machine Learning Model Adaptation

Topics in statistical physics of high-dimensional machine learning

Robust machine learning for neuroscientific inference

Few-shot Learning for Efficient and Effective Machine Learning Model Adaptation