Lecture

Big Data: Best Practices and Guidelines

Description

This lecture gives a general introduction to big data, with an emphasis on best practices and guidelines. It covers the concept of data lakes, a typical big data architecture, and the challenges of working with big data. The instructor emphasizes that data must be ingested, cleaned, and integrated before it can be analyzed. The lecture then discusses the CAP theorem for distributed data stores, the tension between batch and stream processing, and the technologies used to address big data challenges, including the Hadoop Distributed File System (HDFS), MapReduce, and popular HDFS storage formats. It closes by introducing the upcoming topic, Hive (the Hadoop data warehouse), and by discussing a graded assignment on CO2 time series modeling and data visualization.
