This lecture introduces data streams, focusing on computing statistics and estimating quantities with memory that is sublinear in the length of the stream. It covers counting distinct elements approximately, using algorithms such as Flajolet-Martin, and finding heavy hitters. The lecture also explores document similarity, discussing shingles, sketches, and methods for comparing sketches. It then turns to distances and nearest-neighbor queries in high-dimensional data, presenting randomized dimension reduction through random projection and the Johnson-Lindenstrauss Lemma. The instructor provides practical examples and applications, emphasizing the role of these algorithms in handling 'Big Data' challenges.
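To make the count-distinct idea concrete, here is a minimal sketch of a Flajolet-Martin-style estimator. It is the common textbook variant (track the largest number of trailing zero bits seen in any hash value, estimate 2^R, then combine several hash functions by taking a median of group means), not necessarily the version presented in the lecture. The names `fm_estimate`, `trailing_zeros`, `num_hashes`, and `group_size` are illustrative assumptions, as is the choice of BLAKE2b as a stand-in for a family of random hash functions.

```python
import hashlib
import statistics


def _hash64(item: str, seed: int) -> int:
    """64-bit hash of item; a different seed acts as a different hash function (assumption)."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def trailing_zeros(x: int, width: int = 64) -> int:
    """Number of trailing zero bits in x; by convention `width` when x is 0."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def fm_estimate(stream, num_hashes: int = 64, group_size: int = 8) -> float:
    """Estimate the number of distinct items in `stream`.

    Memory is one small counter per hash function, independent of the
    stream length. Each counter holds the maximum number of trailing
    zeros R seen so far, and 2**R is a rough estimate of the distinct count.
    """
    max_zeros = [0] * num_hashes
    for item in stream:
        key = str(item)
        for seed in range(num_hashes):
            r = trailing_zeros(_hash64(key, seed))
            if r > max_zeros[seed]:
                max_zeros[seed] = r

    # Single estimates 2**R are noisy and heavy-tailed, so average within
    # small groups and take the median across groups.
    estimates = [2 ** r for r in max_zeros]
    groups = [estimates[i:i + group_size] for i in range(0, num_hashes, group_size)]
    means = [sum(g) / len(g) for g in groups]
    return statistics.median(means)


if __name__ == "__main__":
    # 1000 distinct values, each repeated three times in the stream.
    stream = [i % 1000 for i in range(3000)]
    print("estimated distinct count:", fm_estimate(stream))
```

Repeated items never change the counters, which is why the estimate depends only on the set of distinct values. The result is a coarse approximation; this simple 2^R form is known to be biased, and practical systems usually apply a correction factor or use refinements such as HyperLogLog.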