Transformers in Vision: Applications and Architectures

Description

This lecture examines the impact of transformers across machine learning, with a focus on computer vision. It opens with an overview of transformers and their unifying role across domains such as natural language processing and speech recognition. The instructor reviews the foundational paper 'Attention Is All You Need' and explains the transformer architecture, including its encoder-decoder structure. The lecture then shows how effective transformer-based models are for image classification and semantic segmentation, citing recent results and benchmark leaderboards. It goes on to applications in visual perception, covering both embodied AI and static vision tasks. The instructor also explains why tokenization and positional encoding matter when processing different data types, such as text and images. The lecture concludes with an outlook on transformers in vision, including their scalability and potential for further innovation.
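To make the tokenization and positional encoding step concrete for images, here is a minimal sketch in plain Python. It is not taken from the lecture. It assumes a common patch-based scheme: the image is cut into non-overlapping square patches, each patch is flattened into one token, and a fixed sine/cosine encoding in the style of 'Attention Is All You Need' is added to each token. The function names (patchify, sinusoidal_encoding, add_positions) and the toy 4 x 4 image are hypothetical. Practical vision transformers usually also apply a learned linear projection to each patch and often learn their position embeddings; both are omitted here.

```python
import math


def patchify(image, patch):
    """Split an H x W grid (a list of rows) into flattened patch tokens, row-major."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one patch x patch block into a single token vector.
            tokens.append([image[top + r][left + c]
                           for r in range(patch) for c in range(patch)])
    return tokens


def sinusoidal_encoding(num_positions, dim):
    """Fixed encoding: sine on even feature indices, cosine on odd ones."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(dim):
            # Each sin/cos pair shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def add_positions(tokens):
    """Add the positional encoding to each token, element by element."""
    encoding = sinusoidal_encoding(len(tokens), len(tokens[0]))
    return [[value + offset for value, offset in zip(token, row)]
            for token, row in zip(tokens, encoding)]


# Toy single-channel 4 x 4 "image" with pixel values 0..15 (illustrative only).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
tokens = patchify(image, patch=2)   # 4 tokens, each of dimension 4
encoded = add_positions(tokens)     # same shape, now position-aware
```

The same add_positions step would apply unchanged to text tokens once they are mapped to vectors, which is one reason a single architecture can serve several data types: only the tokenizer differs.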

Official source

https://mediaspace.epfl.ch/media/0_ve9snkot

About this result

This page is automatically generated and may contain information that is not correct, complete, up-to-date, or relevant to your search query. The same applies to every other page on this website. Please make sure to verify the information with EPFL's official sources.
