This lecture delves into the optimization of word embedding models, focusing on maximizing the overall probability of the training data and learning parameters from positive and negative examples. It covers the skip-gram model with negative sampling, loss function minimization, and gradient descent for learning. The instructor explains the derivation of the probabilities, hierarchical softmax, and techniques such as fastText, byte-pair encoding (BPE), and subword embeddings that improve model efficiency. The lecture concludes with a detailed explanation of the BPE algorithm and its application to tokenizing text.
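As a minimal sketch of the BPE learning step described above, the following Python snippet starts from character-level symbols (with an end-of-word marker) and repeatedly merges the most frequent adjacent pair. The toy corpus and the number of merges are illustrative choices, not from the lecture.

```python
from collections import Counter

def byte_pair_encoding(corpus, num_merges):
    """Learn BPE merge rules from a tiny corpus of words.

    Each word is a tuple of symbols; at every step the most frequent
    adjacent symbol pair is merged into a single new symbol.
    """
    # Start from character-level symbols, with an end-of-word marker.
    vocab = Counter()
    for word in corpus:
        vocab[tuple(word) + ("</w>",)] += 1

    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Apply the chosen merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges, vocab

# Hypothetical toy corpus, in the spirit of the classic BPE example.
merges, vocab = byte_pair_encoding(
    ["low", "low", "lower", "newest", "newest", "widest"], 5
)
```

In practice the learned merge rules are then applied, in order, to segment unseen words into subword tokens, which is what lets fastText-style models represent out-of-vocabulary words.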