This lecture covers training strategies for Transformers, with applications in NLP and vision. It reviews the vanilla Transformer architecture, pre-training strategies, and recent advances in the field, and the instructor emphasizes how quickly Transformer research is evolving and the challenges of scaling models up. Techniques such as BERT, BEiT, and GPT are explained along with their respective training objectives. The lecture also discusses the limitations of large-scale models and the computational cost of training them, providing an overview of the key aspects of Transformer training and current trends in the field.
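To make the contrast between the training objectives mentioned above concrete, here is a minimal sketch (not from the lecture itself) of how BERT-style masked-token prediction differs from GPT-style next-token prediction at the data-preparation level; the token lists, `mask_prob`, and helper names are illustrative assumptions:

```python
import random

MASK = "[MASK]"

def bert_mask(tokens, mask_prob=0.15, rng=None):
    """BERT-style masked-LM data: randomly replace tokens with [MASK];
    the model is trained to predict the originals at masked positions."""
    rng = rng or random.Random(0)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)    # loss only at masked positions
        else:
            inputs.append(tok)
            targets.append(None)   # no prediction target here
    return inputs, targets

def gpt_targets(tokens):
    """GPT-style causal-LM data: every position predicts the next token."""
    return tokens[:-1], tokens[1:]

sentence = "transformers are trained with self supervision".split()
masked_inputs, masked_targets = bert_mask(sentence, mask_prob=0.3)
context, next_tokens = gpt_targets(sentence)
```

BERT sees the whole (partially masked) sequence at once, so it learns bidirectional representations, while GPT only conditions on the left context, which is what makes it directly usable for generation.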