This lecture covers Jaccard similarity, minhashing, and locality-sensitive hashing (LSH) as techniques for data summarization. It explains how to find similar items using Jaccard similarity and bit vectors, and how to reduce false positives and false negatives in similarity detection. The lecture also covers the construction of hash functions and the use of cosine distance for document similarity.
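
As a rough illustration of the first two ideas, the sketch below computes the exact Jaccard similarity of two sets and then estimates it from minhash signatures. It is not taken from the lecture: the function names, the choice of linear hash functions of the form (a*x + b) mod p, the prime, and the example sets are all assumptions made for the example, and items are assumed to be non-negative integers smaller than the prime.

```python
import random

# Assumption: a Mersenne prime (2^31 - 1) used as the modulus for the hash family.
PRIME = 2_147_483_647


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def make_hash_funcs(k: int, seed: int = 0) -> list:
    """Draw k random (a, b) pairs defining h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def minhash_signature(items: set, hash_funcs: list) -> list:
    """Signature = minimum hash value of the set under each hash function.

    Requires a non-empty set of integers in [0, PRIME).
    """
    return [min((a * x + b) % PRIME for x in items) for a, b in hash_funcs]


def estimate_similarity(sig_a: list, sig_b: list) -> float:
    """Fraction of signature positions that agree; approximates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


if __name__ == "__main__":
    # Hypothetical example sets: 40 shared items out of 120 distinct items.
    set_a = set(range(0, 80))
    set_b = set(range(40, 120))

    funcs = make_hash_funcs(k=200, seed=42)
    sig_a = minhash_signature(set_a, funcs)
    sig_b = minhash_signature(set_b, funcs)

    print("exact Jaccard:   ", jaccard(set_a, set_b))
    print("minhash estimate:", estimate_similarity(sig_a, sig_b))
```

With more hash functions the estimate tends to move closer to the exact value, at the cost of longer signatures. Locality-sensitive hashing would then typically group such signatures into bands so that only likely-similar pairs are compared, which is where the trade-off between false positives and false negatives arises.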