Data Wrangling with Hadoop: Covers data wrangling techniques using Hadoop, focusing on row-oriented versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Entity Resolution Techniques: Explores entity resolution techniques, data deduplication, similarity metrics, computational cost, blocking techniques, and scaling out similarity joins.