This lecture examines the challenges of distributed computing, emphasizing the need for parallelism to handle ever-increasing data sizes. The instructor discusses execution models for data analytics platforms, the exponential growth of data, the types of data sources, and the three Vs of big data (Volume, Velocity, Variety). The lecture also explores the complexities of handling structured and unstructured data, the importance of data harmonization, and the trade-offs between data integration effort and query time.