This lecture introduces classical language models, focusing on their foundational concepts and applications. It begins with an overview of language models, explaining their role in assigning probabilities to sequences of tokens and their use in applications such as machine translation, text generation, and speech recognition. The instructor discusses count-based language models, emphasizing maximum likelihood estimation (MLE), which estimates probabilities from token-occurrence counts in a corpus. The lecture also covers the Markov assumption and introduces n-gram models, which simplify sequence modeling by conditioning each token only on the preceding n−1 tokens. Evaluation is addressed next, highlighting perplexity as the standard metric and the expectation that a good model assigns higher probability to well-formed sentences than to ungrammatical ones. The discussion then turns to sparsity: most long token sequences never occur even in a large corpus, so MLE assigns them zero probability, and smoothing techniques are needed to handle these unseen sequences. The lecture closes by situating these models in their historical context and evolution, setting the stage for more advanced topics in subsequent sessions.
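
To make the summarized formulas concrete, a standard bigram formulation of the Markov assumption, the MLE estimate, add-one smoothing, and perplexity can be written as follows (the notation here is a common textbook convention, not necessarily the lecture's own):

```latex
% Markov (bigram) factorization of a sequence probability
P(w_1, \dots, w_T) \approx \prod_{t=1}^{T} P(w_t \mid w_{t-1})

% Maximum likelihood estimate from corpus counts
P_{\mathrm{MLE}}(w_t \mid w_{t-1}) = \frac{\mathrm{count}(w_{t-1}, w_t)}{\mathrm{count}(w_{t-1})}

% Add-one (Laplace) smoothing over a vocabulary of size V,
% so unseen bigrams receive nonzero probability
P_{\mathrm{add\text{-}1}}(w_t \mid w_{t-1}) = \frac{\mathrm{count}(w_{t-1}, w_t) + 1}{\mathrm{count}(w_{t-1}) + V}

% Perplexity of a held-out sequence (lower is better)
\mathrm{PPL} = \exp\!\left(-\frac{1}{T} \sum_{t=1}^{T} \log P(w_t \mid w_{t-1})\right)
```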
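
As a minimal sketch of these ideas (not code from the lecture), the following Python snippet trains a count-based bigram model on an assumed toy corpus with `<s>`/`</s>` boundary markers, applies add-k smoothing to handle unseen bigrams, and scores a test sentence with perplexity:

```python
from collections import Counter
from math import exp, log

# Toy training corpus; a real model would use a much larger corpus.
corpus = [
    ["<s>", "the", "cat", "sat", "on", "the", "mat", "</s>"],
    ["<s>", "the", "dog", "sat", "on", "the", "rug", "</s>"],
]

# Collect the counts that MLE needs: P(w | prev) = count(prev, w) / count(prev).
unigrams = Counter(w for sent in corpus for w in sent)
bigrams = Counter(
    (sent[i], sent[i + 1]) for sent in corpus for i in range(len(sent) - 1)
)
vocab_size = len(unigrams)

def bigram_prob(prev, word, k=1.0):
    """Add-k smoothed bigram probability (k=1 is Laplace smoothing).

    Smoothing reserves probability mass for bigrams never seen in
    training, so unseen sequences are not assigned zero probability.
    """
    return (bigrams[(prev, word)] + k) / (unigrams[prev] + k * vocab_size)

def perplexity(sentence, k=1.0):
    """Perplexity: exp of the average negative log-probability per token."""
    log_prob = sum(
        log(bigram_prob(prev, word, k))
        for prev, word in zip(sentence, sentence[1:])
    )
    return exp(-log_prob / (len(sentence) - 1))

# A fluent test sentence should receive lower perplexity than a scrambled one.
print(perplexity(["<s>", "the", "cat", "sat", "on", "the", "rug", "</s>"]))
print(perplexity(["<s>", "rug", "the", "on", "sat", "cat", "the", "</s>"]))
```

Without the smoothing term, the scrambled sentence would contain unseen bigrams with zero MLE probability, making its log-probability undefined; this is the sparsity problem the lecture raises.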