Covers Spark DataFrames, distributed collections of data organized into named columns, and the benefits of using them over RDDs.
Covers data manipulation and exploration using Python with a focus on visualization techniques.
Introduces the basics of data science, covering decision trees, machine learning advancements, and deep reinforcement learning.
Covers data wrangling techniques using Hadoop, focusing on row-oriented versus column-oriented databases, popular storage formats, and HBase-Hive integration.
Covers best practices and guidelines for big data, including data lakes, architecture, challenges, and technologies like Hadoop and Hive.
Covers the setup of a GitLab agent for Kubernetes, focusing on installation, version control, and troubleshooting.
Covers decision tree classification using KNIME Analytics Platform for data preprocessing and model creation.
Introduces collaborative data science tools like Jupyter notebooks, Docker, and Git, emphasizing data versioning and containerization.
Covers collaborative data science tools, big data concepts, Spark, and data stream processing, with tips for the final project.
Covers the setup of a development environment for WordPress, including configuring Docker and managing databases.