Data Wrangling with Hadoop: Covers data wrangling techniques using Hadoop, focusing on row-oriented versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Water Consumption in Geneva: Explores water consumption data in Geneva, including charts on consumption and losses, available datasets, and data processing phases.
Cache Memory: Explores cache memory design, hits, misses, and eviction policies in computer systems, emphasizing spatial and temporal locality.
Data Issues in Research: Explores data-related challenges in research, such as assumptions and biases, incomplete write-ups, and the frustrations of newcomers.
Introduction to Data Stream Processing: Covers the fundamentals of data stream processing, including tools like Apache Storm and Kafka, key concepts like event time and window operations, and the challenges of stream processing.