Deanonymization Exercise | EPFL Graph Search

This lecture walks through a deanonymization exercise using two public datasets: one anonymized and published by Netflix, and one that is not anonymized. The datasets, which contain random names and ratings, are loaded and displayed. The exercise involves matching users between the two datasets, sorting by rating, and finding missing films. The lecture then moves on to larger datasets, evaluating the quality of user matches, and the challenges posed by real-world databases. It discusses techniques such as frequency evaluation and probabilistic correlation, emphasizing how hard accurate matching is and why probabilistic approaches are needed.
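
The matching step can be illustrated with a minimal Python sketch. It is not the lecture's own code: the row layout `(user, movie, rating)`, the function names, and the toy data are all assumptions made for illustration. Each anonymized user is paired with the named user who shares the most identical ratings, and the films present only in the anonymized profile are then listed as the "missing" ones.

```python
from collections import defaultdict

def build_profiles(rows):
    """Group (user, movie, rating) rows into {user: {movie: rating}}."""
    profiles = defaultdict(dict)
    for user, movie, rating in rows:
        profiles[user][movie] = rating
    return dict(profiles)

def match_score(anon_profile, public_profile):
    """Count the movies that carry the same rating in both profiles."""
    return sum(
        1
        for movie, rating in anon_profile.items()
        if public_profile.get(movie) == rating
    )

def best_matches(anon_rows, public_rows):
    """Map each anonymized id to (best-matching public name, score)."""
    anon = build_profiles(anon_rows)
    public = build_profiles(public_rows)
    result = {}
    for anon_id, anon_profile in anon.items():
        # Pick the public user with the highest number of identical ratings.
        name, profile = max(
            public.items(),
            key=lambda item: match_score(anon_profile, item[1]),
        )
        result[anon_id] = (name, match_score(anon_profile, profile))
    return result

def hidden_movies(anon_profile, public_profile):
    """Movies rated in the anonymized profile but absent from the public one."""
    return sorted(m for m in anon_profile if m not in public_profile)

# Hypothetical toy data, invented for this sketch.
anon_rows = [
    ("u1", "Alien", 5), ("u1", "Heat", 3), ("u1", "Brazil", 4),
    ("u2", "Alien", 2), ("u2", "Heat", 4),
]
public_rows = [
    ("alice", "Alien", 5), ("alice", "Heat", 3),
    ("bob", "Alien", 2), ("bob", "Heat", 4),
]

matches = best_matches(anon_rows, public_rows)
```

On this toy data, `u1` pairs with `alice` and `u2` with `bob`, and `hidden_movies` applied to the `u1` and `alice` profiles returns the one film that only the anonymized dataset reveals. Exact rating equality is the simplest possible criterion; it breaks as soon as ratings differ slightly between the two sources, which is one reason real-world matching calls for probabilistic scoring.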