Covers data science tools, Hadoop, Spark, data lake ecosystems, the CAP theorem, batch vs. stream processing, HDFS, Hive, Parquet, ORC, and the MapReduce architecture.
Covers the fundamentals of data stream processing, including real-time insights and industry applications, with practical exercises on Kafka and Spark Streaming.