Lecture

Training Strategies for Transformers

Description

This lecture covers training strategies for Transformers, with applications in NLP and vision. It reviews the vanilla Transformer architecture, pre-training strategies, and recent advances in the field. The instructor emphasizes the rapid pace of Transformer research and the challenges of scaling up models. Techniques such as BERT, BEiT, and GPT are explained, along with their respective training methodologies. The lecture also touches on the limitations of large-scale models and the computational costs involved. Overall, it provides insight into the key aspects of training Transformers and current trends in the field.
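As a concrete illustration of one pre-training strategy named above, the sketch below shows the core of BERT-style masked-language-model corruption: a fraction of input tokens is replaced with a [MASK] symbol, and the model is then trained to predict the originals at those positions. This is a minimal, hypothetical sketch, not the lecture's code; the function name, the 15% masking rate, and the token handling are assumptions, and full BERT additionally replaces some selected tokens with random words or leaves them unchanged.

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """Simplified BERT-style masking: corrupt ~mask_prob of the tokens.

    Returns the corrupted sequence and a dict mapping masked positions
    to the original tokens the model would be trained to predict.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(mask_token)
            targets[i] = tok  # prediction target at this position
        else:
            corrupted.append(tok)
    return corrupted, targets

tokens = "the quick brown fox jumps over the lazy dog".split()
corrupted, targets = mask_tokens(tokens)
```

BEiT applies the same recipe to images, masking patch tokens instead of words, while GPT-style models instead predict each next token from the left-to-right context.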

About this result
This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.