In our daily lives, our mobile phones sense our movements and interactions via a rich set of embedded sensors such as GPS, Bluetooth, accelerometers, and microphones. This enables us to use mobile phones as agents for collecting spatio-temporal data. Mining these spatio-temporal data is currently being explored for many applications, including environmental pollution monitoring, health care, and social networking. As sensing devices, mobile phones differ from static sensors in one particular respect: they are mobile. Furthermore, despite its useful applications, collecting data from mobile phones introduces privacy concerns, as the collected data might reveal sensitive information about the users, especially if the collector has access to auxiliary information.

In the first part of this thesis, we use spatio-temporal data collected by mobile phones to evaluate different features of a population related to its mobility patterns. In particular, we consider the problems of population-size and population-density estimation, which have applications in, among others, crowd monitoring, activity-hotspot detection, and urban analysis. We first conduct an experiment where ten attendees of an open-air music festival act as Bluetooth probes. Next, we construct parametric statistical models to estimate the total number of visible Bluetooth devices and their density in the festival area. We further test our proposed models against Wi-Fi traces obtained on the EPFL campus. We conclude that mobile phones can be effectively used as sensing devices to evaluate mobility-related parameters of a population. For the specific problem of population-density estimation, we investigate the mobility aspect of sensing: we quantitatively analyze the performance of mobile sensors compared to static sensors.
Under an independent and identically distributed mobility model for the population, we derive the optimal random-movement strategy for mobile sensors, that is, the strategy that yields the best estimate of population density in the mean-squared-error sense. This enables us to plan an adaptive trajectory for the mobile sensors. In particular, we demonstrate that mobility adds value to the sensors: mobile sensors outperform static sensors for long observation intervals.

In the second part of this thesis, we analyze the vulnerability of anonymized mobility statistics stored in the form of histograms. We consider an attacker who has access to an anonymized set of histograms of a set of users’ mobility traces and to an independent set of non-anonymized histograms of traces belonging to the same users. We study the hypothesis-testing problem of identifying the correct matching between the anonymized histograms and the non-anonymized histograms. We show that the solution can be obtained by using a minimum-weight matching algorithm on a complete weighted bipartite graph. By applying the algorithm to Wi-Fi traces obtained on the EPFL campu
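
The matching step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the edge weight between an anonymized histogram and a non-anonymized histogram is the L1 distance between their normalized forms; the abstract does not state the weight function used in the thesis, so this choice is hypothetical. The function names (`normalize`, `l1_distance`, `min_weight_matching`) and the toy histograms are likewise made up for this example. The search enumerates all permutations, which is only feasible for a handful of users; a polynomial-time assignment algorithm would be needed at realistic scale.

```python
from itertools import permutations


def normalize(hist):
    """Turn raw visit counts into an empirical distribution."""
    total = sum(hist)
    return [h / total for h in hist]


def l1_distance(p, q):
    """Illustrative edge weight (an assumption, not the thesis's weight)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def min_weight_matching(anon, known):
    """Brute-force minimum-weight perfect matching on a complete bipartite graph.

    anon[i] is the i-th anonymized histogram, known[j] the j-th labeled one.
    Returns (matching, total_weight), where matching[i] = j means that
    anonymized histogram i is attributed to known user j.
    """
    n = len(anon)
    # Weight of every edge in the complete bipartite graph.
    cost = [[l1_distance(normalize(a), normalize(k)) for k in known]
            for a in anon]
    best_perm, best_cost = None, float("inf")
    # Every permutation is one perfect matching; keep the cheapest.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Toy data: three users, three locations (made-up counts).
anon = [[8, 1, 1], [1, 7, 2], [2, 2, 6]]
known = [[1, 8, 1], [3, 1, 6], [9, 1, 0]]
matching, total = min_weight_matching(anon, known)
```

In this toy example, each anonymized histogram is paired with the labeled histogram whose shape it most resembles, and the pairing is chosen jointly rather than greedily, which is what distinguishes a matching from a per-user nearest-neighbor lookup.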