Lecture

Entity Resolution: Techniques and Applications

Description

This lecture covers entity resolution (ER): identifying and aggregating the different entity profiles that refer to the same real-world entity across datasets. Topics include duplicate elimination, record linkage, similarity metrics, data deduplication, and possible repairs. The instructor also discusses what makes duplicate entities hard to handle, such as ambiguous names or attributes and data-entry errors. Techniques including clustering, blocking, q-gram set joins, and the ClusterJoin algorithm are explained in detail as ways to make duplicate detection and entity clustering efficient.
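
To make two of the techniques named above concrete, here is a minimal Python sketch of q-gram similarity combined with a simple form of blocking. It is an illustration under stated assumptions, not the lecture's own implementation: the function names, the `#` padding character, the choice of Jaccard similarity, and the 0.5 threshold are all hypothetical, and the ClusterJoin algorithm itself is not shown.

```python
from collections import defaultdict
from itertools import combinations


def qgrams(s, q=2):
    """Return the set of q-grams of s, padded with '#' (an assumed convention)."""
    s = s.lower().strip()
    padded = "#" * (q - 1) + s + "#" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def candidate_pairs(records, q=2):
    """Blocking step: only records sharing at least one q-gram become candidates."""
    index = defaultdict(set)  # q-gram -> ids of records containing it
    for rid, name in records.items():
        for gram in qgrams(name, q):
            index[gram].add(rid)
    pairs = set()
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


def find_duplicates(records, threshold=0.5, q=2):
    """Compare candidate pairs only and keep those at or above the threshold."""
    matches = []
    for a, b in sorted(candidate_pairs(records, q)):
        sim = jaccard(qgrams(records[a], q), qgrams(records[b], q))
        if sim >= threshold:
            matches.append((a, b))
    return matches


if __name__ == "__main__":
    # Hypothetical sample records, not taken from the lecture.
    records = {1: "Jon Smith", 2: "John Smith", 3: "Alice Wong"}
    print(find_duplicates(records))
```

The inverted index avoids comparing every pair of records: pairs with no q-gram in common are never scored. Real q-gram set joins typically add further filters (for example on prefix or length) so that fewer candidates reach the similarity computation.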
