This lecture covers the pre-processing steps for Natural Language Processing tasks, focusing on tokenization, stop-word removal, and lemmatization. The instructor walks through preparing text data for sentiment analysis using Python libraries such as NLTK and spaCy. The lecture includes practical examples of tokenizing text, removing stop words, and reducing words to their base form. Students will learn how to implement these techniques step by step and understand why they matter in text analysis tasks.
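The three steps can be sketched in plain Python as follows. This is a minimal illustration and not the code used in the lecture: the stop-word set and the lemma lookup table are tiny, hypothetical stand-ins for the resources that NLTK or spaCy would normally supply, and the function names are chosen here for clarity only.

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
# A real pipeline would take these from a library such as NLTK or spaCy.
STOP_WORDS = {"the", "a", "an", "i", "is", "are", "was", "were",
              "and", "or", "of", "to", "in", "this", "that", "it"}
LEMMAS = {"actors": "actor", "movies": "movie", "loved": "love"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens):
    """Drop very frequent words that carry little sentiment on their own."""
    return [t for t in tokens if t not in STOP_WORDS]


def lemmatize(tokens):
    """Map each token to its base form; unknown tokens are left unchanged."""
    return [LEMMAS.get(t, t) for t in tokens]


def preprocess(text):
    """Run the three steps in order: tokenize, filter stop words, lemmatize."""
    return lemmatize(remove_stop_words(tokenize(text)))


print(preprocess("The actors were great and I loved the movies."))
# → ['actor', 'great', 'love', 'movie']
```

The order matters in this sketch: stop words are filtered before lemmatization, so the lookup table only needs entries for content words. A dictionary lookup is far cruder than a real lemmatizer, which typically also uses part-of-speech information.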