Covers data science tools and data lake ecosystems: Hadoop (HDFS and the MapReduce architecture), Spark, Hive, the Parquet and ORC file formats, the CAP theorem, and batch vs. stream processing.
Introduces machine learning basics: data segmentation, clustering, classification, and practical applications such as image classification and face similarity.