Are you an EPFL student looking for a semester project?
Work with us on data science and visualisation projects, and deploy your project as an app on top of Graph Search.
Recent developments in information and communication technologies have been profound and life-changing. Most people are now equipped with smart phones with high computation power and communication capabilities. These devices can efficiently run multiple software applications in parallel, store a non-negligible amount of (personal) user data, process various sophisticated sensors and actuators, and communicate over multiple wireless media. Furthermore, they are commonly equipped with high-precision localization capabilities based, for example, on a GPS receiver or on triangulation with nearby base stations or access points. Mobile applications take advantage of this feature to provide location-based services to users. The ever-increasing usage of these personal communication devices and mobile applications, although providing convenience to their owners, comes at a very high cost to their privacy. Interacting with location-based services (LBSs) leaves an almost indelible digital trace of users’ whereabouts. Moreover, the contextual information attached to these traces can reveal users’ personal habits, interests, activities, and relationships. Consequently, exposure of this private information to third parties (such as service providers) escalates their power on individuals, and opens the door to various misuses of users’ personal data. Individuals have the right, and should also have the means to control the amount of their private (location) information that is disclosed to others. In the context of location-based services, various privacy enhancing mechanisms, such as location obfuscation and user anonymization, are proposed in the literature. However, the existing design methodologies for location-privacy preserving mechanisms do not consistently model users’ (privacy and service quality) requirements together with the adversary’s knowledge and objectives. Protection mechanisms are instead designed in an ad hoc manner and irrespective of the adversary model. Consequently, there is a mismatch between the goals and results of these protection mechanisms. Furthermore, the evaluation of privacy preserving mechanisms and their comparison remain problematic because of the absence of a systematic method to quantify them. In particular, the assumptions about the adversary model tend to be incomplete, with the risk of a possibly wrong estimation of the users’ location privacy. Arguably, the lack of a generic analytical framework for specifying protection mechanisms and for evaluating location privacy is evident. The absence of such a framework makes the design of effective protection mechanisms and the objective comparisons between them impossible. In this thesis, we address these issues and provide solutions for a systematic quantification and protection of location privacy. To this end, we construct an analytic framework for location privacy. We formalize users’ mobility model, their access pattern to location-based services, and their privacy and service quality requirements. We also model location-privacy preserving mechanisms as probabilistic functions that obfuscate users’ (location and identity) information before being shared with location-based services. Moreover, in order to quantify users’ location privacy, we propose inference mechanisms that measure users’ information leakage to third parties. They combine various pieces of information about users and estimate (by establishing a probabilistic belief on) users’ private information (e.g., their location at a given time). Therefore, we propose the adversary’s expected estimation error as, arguably, the right metric for location privacy. In our inference framework, we formalize the adversary’s prior knowledge on users, his observation (on the users’ accesses to LBS), and his inference objectives (e.g., re-identifying or localizing users). We assume that adversary constructs a (mobility) profile for each user, to be used in his inference attacks. We make use of statistical tools to construct these profiles, given users’ partial traces. Moreover, we model the inference attacks as the estimation of users’ actual locations, given their profiles and their LBS accesses (observed by the adversary). We mainly use Bayesian inference to perform the estimation. In particular, we use known inference algorithms for hidden Markov models to design de-anonymization, localization, and tracking attacks. To cover more adversary’s objectives, we propose an algorithm for generic location inference attacks, based on Markov-chain Monte-Carlo methods. We also provide a software tool: the Location-Privacy and Mobility Meter (LPM). It is designed based on our formal framework for evaluating the effectiveness of various location-privacy preserving mechanisms and quantifying users’ location privacy. As an example, using LPM, we validate the efficacy of existing location obfuscation and anonymization mechanisms on real location traces. We show that users’ location privacy measured by existing popular metrics, k-anonymity and entropy, is not correlated with the adversary’s success (in learning users’ private information), thus these metrics are inappropriate as privacy metrics. Our results also confirm that anonymization alone is a weak location-privacy preserving mechanism. Moreover, our results show how the resilience of a protection mechanism varies with respect to different inference attacks. Hence, it is a necessity for privacy protection mechanisms to be designed with concrete attack objectives in mind. Relying on these findings, we design optimal location obfuscation techniques tailored against localization attacks. A user needs a protection mechanism that maximizes her location privacy. This is at odds with the objectives of the adversary who designs inference attacks that minimize his estimation error. We propose a game-theoretic methodology that models the conflicting objectives of user and adversary simultaneously. More precisely, we model the problem as a Bayesian Stackelberg game and solve it by using linear programming. In the optimization problem, users constrain the protection mechanism to respect their service quality requirements. This enables us to find the optimal point in the tradeoff curve between privacy and service quality that satisfies both user privacy and service quality requirements. Our results indicate that anticipating for the inference attacks and considering the adversary’s knowledge lead to the design of more effective protection mechanisms. This thesis is a step towards a more systematic modeling, analysis, and design of (location) privacy enhancing technologies. We believe that our analytical approach can be used to quantify and protect privacy in scenarios and domains that are not covered in this thesis.
Boi Faltings, Sujit Prakash Gujar, Aleksei Triastcyn, Sankarshan Damle
, , , ,