Lecture

Introduction to Data Stream Processing

Related lectures (49)
Spark Data Frames
Covers Spark Data Frames, distributed collections of data organized into named columns, and the benefits of using them over RDDs.
General Introduction to Big Data
Covers data science tools, Hadoop, Spark, data lake ecosystems, CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and MapReduce architecture.
Introduction to Data Science
Introduces the basics of data science, covering decision trees, machine learning advancements, and deep reinforcement learning.
Elements of Collaborative Data Science
Introduces collaborative data science tools like Jupyter notebooks, Docker, and Git, emphasizing data versioning and containerization.
Data Science Visualization with Pandas
Covers data manipulation and exploration using Python with a focus on visualization techniques.
Data Wrangling with Hadoop
Covers data wrangling techniques using Hadoop, focusing on row versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Analytics on Data at Rest and Data in Motion
Explores combining data at rest with data in motion, emphasizing the complexities of the Lambda architecture and the quality assessment of streams and batches.
Big Data Best Practices and Guidelines
Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.
General Introduction to Data Science
Offers a comprehensive introduction to Data Science, covering Python, NumPy, Pandas, Matplotlib, and Scikit-learn, with a focus on practical exercises and collaborative work.
Decision Tree Classification
Covers decision tree classification using KNIME Analytics Platform for data preprocessing and model creation.
