Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
The impressive technological advances in genomic analysis and the significant drop in the cost of genome sequencing are paving the way to a variety of revolutionary applications in modern healthcare. In particular, the increasing understanding of the human genome, and of its relation to diseases, health and to responses to treatments brings promise of improvements in better preventive and personalized medicine. Unfortunately, the impact on privacy and security is unprecedented. The genome is our ultimate identifier and, if leaked, it can unveil sensitive and personal information such as our genetic diseases, our propensity to develop certain conditions (e.g., cancer or Alzheimer's) or the health issues of our family. Even though legislation, such as the EU General Data Protection Regulation (GDPR) or the US Health Insurance Portability and Accountability Act (HIPAA), aims at mitigating abuses based on genomic and medical data, it is clear that this information also needs to be protected by technical means.
In this thesis, we investigate the problem of developing new and practical privacy-enhancing technologies (PETs) for the protection of medical and genomic data. Our goal is to accelerate the adoption of PETs in the medical field in order to address the privacy and security concerns that prevent personalized medicine from reaching its full potential. We focus on two main areas of personalized medicine: clinical care and medical research. For clinical care, we first propose a system for securely storing and selectively retrieving raw genomic data that is indispensable for in-depth diagnoses and treatments of complex genetic diseases such as cancer. Then, we focus on genetic variants and devise a new model based on additively-homomorphic encryption for privacy-preserving genetic testing in clinics. Our model, implemented in the context of HIV treatment, is the first to be tested and evaluated by practitioners in a real operational setting. For medical research, we first propose a method that combines somewhat-homomorphic encryption with differential privacy to enable secure feasibility studies on genetic data stored at an untrusted central repository. Second, we address the problem of sharing genomic and medical data when the data is distributed across multiple mistrustful institutions. We begin by analyzing the risks that threaten patientsâ privacy in systems for the discovery of genetic variants, and we propose practical mitigations to the re-identification risk. Then, for clinical sites to be able to share the data without worrying about the risk of data breaches, we develop a new system based on collective homomorphic encryption: it achieves trust decentralization and enables researchers to securely find eligible patients for clinical studies. Finally, we design a new framework, complementary to the previous ones, for quantifying the risk of unintended disclosure caused by potential inference attacks that are jointly combined by a malicious adversary, when exact genomic data is shared.
In summary, in this thesis we demonstrate that PETs, still believed unpractical and immature, can be made practical and can become real enablers for overcoming the privacy and security concerns blocking the advancement of personalized medicine. Addressing privacy issues in healthcare remains a great challenge that will increasingly require long-term collaboration among geneticists, healthcare providers, ethicists, lawmakers, and computer scientists.
Carmela González Troncoso, Bogdan Kulynych
, , ,