This lecture covers the basics of sequence-to-sequence models, focusing on their difficulty with long-range dependencies and the temporal bottleneck of compressing an entire input sequence into a single vector. It introduces attention mechanisms as a remedy, explaining how they address the limitations of traditional recurrent models. The instructor discusses bidirectional encoders and how they enrich the representation of the input sequence. The training of encoder-decoder models is then explained, emphasizing the need for paired data across languages in tasks such as machine translation. Finally, the lecture walks through the implementation of attentive encoder-decoder models, showing how attention mitigates the temporal bottleneck, comparing various attention scoring functions and their effect on model performance, and noting that attention weights also make the model's behaviour more interpretable.
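As a rough illustration of the attention idea summarised above, the sketch below computes a context vector with simple dot-product attention over a set of encoder hidden states. The function name, the use of NumPy, and the toy dimensions are illustrative assumptions, not the lecture's actual implementation.

```python
import numpy as np

def dot_product_attention(decoder_state, encoder_states):
    """Compute a context vector for one decoder step.

    decoder_state:  (d,)   current decoder hidden state
    encoder_states: (T, d) encoder hidden states for all T source positions
    Returns the context vector (d,) and the attention weights (T,).
    """
    # Score every encoder state against the decoder state (dot-product attention).
    scores = encoder_states @ decoder_state            # (T,)
    # Normalise the scores into a distribution over source positions (softmax).
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()                  # (T,)
    # The context vector is a weighted average of all encoder states, so the
    # decoder is not limited to the final encoder state (the temporal bottleneck).
    context = weights @ encoder_states                 # (d,)
    return context, weights

# Toy usage: 5 source positions, hidden size 4.
rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 4))
dec = rng.normal(size=(4,))
context, weights = dot_product_attention(dec, enc)
print(weights)  # the weights are what makes attention inspectable/interpretable
```

Other scoring functions discussed in this setting (e.g. additive or bilinear scores) would only change how `scores` is computed; the weighted-average step stays the same.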