Publication

Synthetic Data - Anonymisation Groundhog Day

Abstract

Synthetic data has been advertised as a silver-bullet solution to privacy-preserving data publishing that addresses the shortcomings of traditional anonymisation techniques. The promise is that synthetic data drawn from generative models preserves the statistical properties of the original dataset but, at the same time, provides perfect protection against privacy attacks. In this work, we present the first quantitative evaluation of the privacy gain of synthetic data publishing and compare it to that of previous anonymisation techniques. Our evaluation of a wide range of state-of-the-art generative models demonstrates that synthetic data either does not prevent inference attacks or does not retain data utility. In other words, we empirically show that synthetic data does not provide a better tradeoff between privacy and utility than traditional anonymisation techniques. Furthermore, in contrast to traditional anonymisation, the privacy-utility tradeoff of synthetic data publishing is hard to predict. Because it is impossible to predict what signals a synthetic dataset will preserve and what information will be lost, synthetic data leads to a highly variable privacy gain and unpredictable utility loss. In summary, we find that synthetic data is far from the holy grail of privacy-preserving data publishing.

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
Related concepts (12)
Privacy law
Privacy law is the body of law that deals with the regulating, storing, and using of personally identifiable information, personal healthcare information, and financial information of individuals, which can be collected by governments, public or private organisations, or other individuals. It also applies in the commercial sector to things like trade secrets and the liability that directors, officers, and employees have when handing sensitive information.
Right to privacy
The right to privacy is an element of various legal traditions that intends to restrain governmental and private actions that threaten the privacy of individuals. Over 150 national constitutions mention the right to privacy. On 10 December 1948, the United Nations General Assembly adopted the Universal Declaration of Human Rights (UDHR), originally written to guarantee individual rights of everyone everywhere; while right to privacy does not appear in the document, many interpret this through Article 12, which states: "No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation.
Privacy
Privacy (UK, US) is the ability of an individual or group to seclude themselves or information about themselves, and thereby express themselves selectively. The domain of privacy partially overlaps with security, which can include the concepts of appropriate use and protection of information. Privacy may also take the form of bodily integrity. There have been many different conceptions of privacy throughout history. Most cultures recognize the right of an individual to withhold aspects of their personal lives from public record.
Show more
Related publications (32)

The Privacy Power of Correlated Noise in Decentralized Learning

Rachid Guerraoui, Martin Jaggi, Anastasiia Koloskova, Youssef Allouah, Aymane El Firdoussi

Decentralized learning is appealing as it enables the scalable usage of large amounts of distributed data and resources (without resorting to any central entity), while promoting privacy since every user minimizes the direct exposure of their data. Yet, wi ...
PMLR2024

A Privacy-Preserving Querying Mechanism with High Utility for Electric Vehicles

Sayan Biswas

Electric vehicles (EVs) are becoming more popular due to environmental consciousness. The limited availability of charging stations (CSs), compared to the number of EVs on the road, has led to increased range anxiety and a higher frequency of CS queries du ...
Piscataway2024

Bridging the gap between theoretical and practical privacy technologies for at-risk populations

Kasra Edalatnejadkhamene

With the pervasive digitalization of modern life, we benefit from efficient access to information and services. Yet, this digitalization poses severe privacy challenges, especially for special-needs individuals. Beyond being a fundamental human right, priv ...
EPFL2023
Show more

Graph Chatbot

Chat with Graph Search

Ask any question about EPFL courses, lectures, exercises, research, news, etc. or try the example questions below.

DISCLAIMER: The Graph Chatbot is not programmed to provide explicit or categorical answers to your questions. Rather, it transforms your questions into API requests that are distributed across the various IT services officially administered by EPFL. Its purpose is solely to collect and recommend relevant references to content that you can explore to help you answer your questions.