This lecture discusses the transformative impact of transformers across machine learning, with a focus on computer vision. It opens with an overview of the architecture's unifying role across domains such as natural language processing and speech recognition, reviews the foundational paper 'Attention Is All You Need', and walks through the transformer architecture, including its encoder-decoder structure.

The lecture then turns to vision, demonstrating the effectiveness of transformer-based models in image classification and semantic segmentation and surveying recent advances and leaderboard results. The discussion extends to transformers in visual perception more broadly, spanning embodied AI and static vision tasks. The instructor also explains how tokenization and positional encoding let transformers process different data types, such as text and images.

The lecture concludes with insights into the future of transformers in vision, including their scalability and potential for further innovation in the field.
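As a concrete illustration of the tokenization and positional-encoding step the lecture describes, the following is a minimal NumPy sketch: an image is cut into non-overlapping patches, each patch is flattened and linearly projected into a token, and fixed sinusoidal positional encodings are added. The patch size, model width, and random projection matrix are illustrative assumptions (in a real vision transformer the projection is learned, and the positional encodings may be learned as well).

```python
import numpy as np

def image_to_tokens(image, patch_size=16, d_model=64, rng=None):
    """Split an (H, W, C) image into non-overlapping patches, flatten
    each patch, and project it to a d_model-dimensional token."""
    rng = rng or np.random.default_rng(0)
    h, w, c = image.shape
    p = patch_size
    # Rearrange into (num_patches, p*p*c) flattened patch vectors.
    patches = (
        image.reshape(h // p, p, w // p, p, c)
             .transpose(0, 2, 1, 3, 4)
             .reshape(-1, p * p * c)
    )
    # Random matrix standing in for the learned linear projection.
    proj = rng.standard_normal((p * p * c, d_model)) * 0.02
    return patches @ proj

def sinusoidal_positions(n_tokens, d_model):
    """Fixed sinusoidal positional encodings as in 'Attention Is All You Need'."""
    pos = np.arange(n_tokens)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((n_tokens, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

# A 224x224 RGB image with 16x16 patches yields a 14x14 grid of tokens.
image = np.random.default_rng(1).random((224, 224, 3))
tokens = image_to_tokens(image)                      # shape (196, 64)
tokens = tokens + sinusoidal_positions(*tokens.shape)
print(tokens.shape)  # (196, 64)
```

The resulting token sequence is what the transformer encoder consumes; the same recipe (tokenize, then add positions) is what lets one architecture handle text, speech, and images, as the lecture emphasizes.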