Big Data Best Practices and Guidelines

Related lectures (151)
General-Purpose Distributed Execution System
Explores the design of a general-purpose distributed execution system, covering challenges, specialized frameworks, decentralized control logic, and high-performance shuffle.
Model Finetuning for NLI
Covers the second assignment of the CS-552 (Modern NLP) course, focusing on transfer learning and data augmentation.
Storage Management in SmartDataLake
Explores storage management challenges in transitioning to data lakes, addressing software and hardware heterogeneity, unified storage design, and performance optimization.
Real-time Intelligence: Data Challenges and Hardware Evolution
Explores data challenges and hardware evolution for real-time intelligence in the era of big data.
Spark DataFrames: Basics and Optimization
Covers the basics of Spark DataFrames, their advantages, performance comparison with RDDs, and practical demos (see the sketch after this list).
Enterprise and Service-Oriented Architecture
Discusses solutions for business/IT alignment and the importance of keeping business-to-IT models up to date.
Apache Spark Ecosystem: Basics and Operations
Provides an overview of the Apache Spark ecosystem, covering basics, operations, and key components.
Advanced Spark Optimizations and Partitioning
Covers advanced Spark optimizations, memory management, shuffle operations, and data partitioning strategies to improve big data processing efficiency.
Data Virtualization Demo: SmartDataLake
Showcases a demo on adaptive data virtualization in SmartDataLake, focusing on assembling company profiles and executing join queries across datasets.
Data Wrangling and Analysis
Covers a homework assignment on data wrangling and analysis using Python's pandas library for real-world datasets.
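
The Spark DataFrames entry above mentions a performance comparison with RDDs. As a small, hedged illustration (not material from the lecture itself), the sketch below expresses the same toy query first as an RDD pipeline and then as a DataFrame query. It assumes only a local PySpark installation; the application name, column names, and sample rows are invented for this example.

```python
from pyspark.sql import SparkSession

# Local session for the sketch; the app name is arbitrary.
spark = SparkSession.builder.appName("dataframe-vs-rdd-sketch").getOrCreate()

# The same toy data, used by both variants.
rows = [("alice", 34), ("bob", 45), ("carol", 29)]

# RDD version: the logic lives inside opaque Python lambdas,
# so Spark cannot inspect or optimize it.
rdd = spark.sparkContext.parallelize(rows)
adults_rdd = rdd.filter(lambda r: r[1] >= 30).map(lambda r: r[0])
print(adults_rdd.collect())

# DataFrame version: the same logic as declarative expressions,
# which Spark's Catalyst optimizer can analyze and rearrange.
df = spark.createDataFrame(rows, ["name", "age"])
adults_df = df.filter(df.age >= 30).select("name")
adults_df.show()

spark.stop()
```

The DataFrame variant hands Spark a declarative plan that the Catalyst optimizer can analyze, reorder, and push down, which is broadly the advantage over raw RDDs that the lecture description points to.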
