Spark Storage Layer | EPFL Graph Search

This lecture covers the Spark ecosystem, focusing on its architectural choices and the Spark SQL interface. It discusses the limitations of MapReduce, introduces Resilient Distributed Datasets (RDDs), and compares RDDs with Hadoop HDFS. The lecture also explains Spark's storage layer, emphasizing the abstraction that RDDs provide and their use of distributed RAM.
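
To make the RDD abstraction a little more concrete, here is a minimal single-process sketch in plain Python. It is not Spark's API: the class `ToyRDD` and its attributes are hypothetical names chosen for illustration, and the "partitions" are ordinary in-memory lists rather than blocks spread across a cluster. It only illustrates three ideas associated with RDDs: data split into partitions, lazily recorded transformations (lineage), and optional in-memory caching of a computed result.

```python
class ToyRDD:
    """Hypothetical, single-process stand-in for an RDD (not Spark's API)."""

    def __init__(self, partitions=None, parent=None, fn=None):
        self._source = partitions     # list of lists, only set on a root dataset
        self._parent = parent         # lineage: the dataset this one derives from
        self._fn = fn                 # per-partition function applied to the parent
        self._cache_enabled = False
        self._cache = None            # filled on first computation if caching is on
        self.compute_count = 0        # how many times this dataset was materialized

    def map(self, f):
        # Lazy: only records the transformation, computes nothing yet.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def cache(self):
        # Ask for the computed partitions to be kept in memory for reuse.
        self._cache_enabled = True
        return self

    def _compute(self):
        if self._cache is not None:
            return self._cache
        self.compute_count += 1
        if self._parent is None:
            parts = [list(p) for p in self._source]
        else:
            # Recompute from lineage by walking back to the parent dataset.
            parts = [self._fn(p) for p in self._parent._compute()]
        if self._cache_enabled:
            self._cache = parts
        return parts

    def collect(self):
        # Action: triggers computation and flattens all partitions.
        return [x for part in self._compute() for x in part]


base = ToyRDD(partitions=[[1, 2, 3], [4, 5, 6]])
squares = base.map(lambda x: x * x).cache()
evens = squares.filter(lambda x: x % 2 == 0)

first = evens.collect()     # computes base, squares (now cached), then evens
second = squares.collect()  # served from the in-memory cache
```

In this toy version, the second `collect()` does not touch `base` again because `squares` was cached after its first computation. Without the `cache()` call, it would be rebuilt from its lineage each time.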