Lecture

Advanced Spark Optimization

Related lectures (112)
Spark Data Frames
Covers Spark DataFrames, which are distributed collections of data organized into named columns, and the benefits of using them over RDDs.
General Introduction to Big Data
Covers data science tools, Hadoop, Spark, data lake ecosystems, CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and MapReduce architecture.
Data Science Visualization with Pandas
Covers data manipulation and exploration using Python with a focus on visualization techniques.
Introduction to Data Science
Introduces the basics of data science, covering decision trees, machine learning advancements, and deep reinforcement learning.
Data Wrangling with Hadoop
Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Rhythmic Generation Techniques
Covers rhythm generation techniques, including Markov models and hierarchical rhythm generation, with a focus on Nancarrow's Study 14.
Advanced Pandas Functions
Focuses on advanced pandas functions for data manipulation, exploration, and visualization with Python, emphasizing the importance of understanding and preparing data.
Decision Tree Classification
Covers decision tree classification using KNIME Analytics Platform for data preprocessing and model creation.
General Introduction to Data Science
Offers a comprehensive introduction to Data Science, covering Python, Numpy, Pandas, Matplotlib, and Scikit-learn, with a focus on practical exercises and collaborative work.
Collaborative Data Science: Tools and Git Workflow
Explores tools like Git and Docker for collaborative data science projects.
