Fault Tolerance and Recovery: Data Safety in Distributed Computing
Related lectures (23)
Big Data Best Practices and Guidelines
Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.
General-Purpose Distributed Execution System
Explores the design of a general-purpose distributed execution system, covering challenges, specialized frameworks, decentralized control logic, and high-performance shuffle.
Introduction to Spark Runtime Architecture
Covers the Spark runtime architecture, including RDDs, transformations, actions, and caching for performance optimization.
Data Wrangling with Hadoop
Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Big Data Challenges: Scaling to Massive Data
Explores the challenges of handling massive datasets, discussing solutions like MapReduce and Spark.
Hadoop Ecosystem: Architectural Choices & MapReduce Programming
Explores the Hadoop ecosystem's architecture and MapReduce programming model, emphasizing strengths and limitations.
Execution Models for Distributed Computing - 2nd generation
Explores the 2nd generation of execution models for distributed computing, focusing on Spark and Resilient Distributed Datasets (RDDs).
Introduction to Spark runtime architecture
Introduces Apache Spark, covering its key features, history, RDDs, architecture, and distributed computing framework.
Data formats and data wrangling with Hadoop
Explores Apache Hive for data warehousing, data formats, and partitioning, with practical exercises in querying and connecting to Hive.
Data Wrangling with Hadoop: Storage Formats and Hive
Explores data wrangling with Hadoop, emphasizing storage formats and Hive for big data processing.