Distributed Computing Execution Models: Spark Ecosystem

Description

This lecture discusses the evolution of execution models for distributed computing, focusing on the Spark ecosystem and its architectural choices, the Spark SQL interface, and the problems with skew in the Spark ecosystem. It also addresses the limitations of MapReduce, such as extensive I/O requirements, a limited programming model, and an implementation that database experts consider suboptimal.

This video is available exclusively on MediaSpace for a restricted audience. Please log in to MediaSpace to access it if you have the necessary permissions.

Instructor

Anastasia Ailamaki

Official source

https://mediaspace.epfl.ch/media/0_ltubck9h

In course

CS-422: Database systems

This course is intended for students who want to understand modern large-scale data analysis systems and database systems. It covers a wide range of topics and technologies, and will prepare students