Explores the design of a general-purpose distributed execution system, covering challenges, specialized frameworks, decentralized control logic, and high-performance shuffle.
Explores storage management challenges in transitioning to data lakes, addressing software and hardware heterogeneity, unified storage design, and performance optimization.
Covers advanced Spark optimizations, memory management, shuffle operations, and data partitioning strategies to improve big data processing efficiency.
Covers a homework assignment on wrangling and analyzing real-world datasets with Python's pandas library.