This lecture covers document classification: constructing a classifier from labeled training data so that it can assign labels to unlabeled documents. It explains how document vectors, words, phrases, and metadata serve as features, and presents classification approaches including k-Nearest-Neighbors, Naïve Bayes, and methods based on word embeddings. It also discusses the challenge of high-dimensional feature spaces, the implementation of classification models, and self-attention mechanisms and transformer models.
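
To make the k-Nearest-Neighbors idea concrete, here is a minimal sketch that is not taken from the lecture itself. It assumes that documents are represented as bag-of-words term-frequency vectors and compared by cosine similarity; the function names (`vectorize`, `cosine`, `knn_classify`) and the tiny `TRAINING` set are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def vectorize(text):
    """Turn a document into a sparse term-frequency vector (bag of words)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(doc, training, k=3):
    """Label doc by majority vote among its k most similar training documents."""
    query = vectorize(doc)
    ranked = sorted(
        training,
        key=lambda pair: cosine(query, vectorize(pair[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled training data (illustration only).
TRAINING = [
    ("the striker scored a late goal", "sports"),
    ("the team won the match", "sports"),
    ("the goalkeeper saved a penalty", "sports"),
    ("the parliament passed a new law", "politics"),
    ("the minister announced an election", "politics"),
    ("voters elected a new parliament", "politics"),
]

if __name__ == "__main__":
    print(knn_classify("the team scored a goal", TRAINING))
```

Each distinct word is its own dimension here, so the vector space grows with the vocabulary. This is the high-dimensionality issue the summary mentions, and it is one reason the sketch stores vectors sparsely as `Counter` objects rather than as dense arrays.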