Lecture

State Space Models: Expressivity of Transformers

Description

This lecture discusses state space models and expressivity results for transformers. The instructor begins by showing that a state space model needs sufficient state storage to copy a sequence: because the model's output at each step depends only on its current state, the state must retain the earlier tokens in order to reproduce them, so the state size must grow with the length of the sequence being copied.

The lecture then turns to transformers and a theorem about their expressivity. After reviewing attention heads, a core component of the transformer architecture, the instructor explains how transformers can copy sequences whose length grows exponentially with the number of attention heads. The instructor then introduces an n-gram copying algorithm, which uses a hash table mapping each n-gram to the token that follows it; copying proceeds by hashing the most recent n tokens and looking up the next one. The lecture concludes by relating the size of this hash table to the length of the input sequence and by emphasizing how efficiently transformers can learn and implement this copying mechanism. Overall, the lecture connects the theoretical expressivity of transformers to the practical task of sequence copying.
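The summary above describes the n-gram copying mechanism only in prose. The short Python sketch below illustrates the idea under stated assumptions: it is not the instructor's construction, and the function names, the choice of n, and the example string are illustrative. It builds a hash table from each n-gram to the token that follows it and then regenerates the sequence by repeated lookups, which is roughly the lookup behavior the description attributes to the transformer.

# A minimal sketch (not the lecture's code) of hash-based n-gram copying:
# store the token that follows each n-gram, then regenerate the sequence
# by repeatedly looking up the most recent n tokens.

def build_ngram_table(tokens, n):
    """Map every n-gram in `tokens` to the token that follows it."""
    table = {}
    for i in range(len(tokens) - n):
        key = tuple(tokens[i:i + n])  # the n-gram serves as the hash key
        table[key] = tokens[i + n]    # ... and stores its successor token
    return table

def copy_sequence(tokens, n):
    """Reproduce `tokens` from its first n entries using only n-gram lookups."""
    table = build_ngram_table(tokens, n)
    out = list(tokens[:n])            # seed the output with the first n tokens
    while len(out) < len(tokens):
        key = tuple(out[-n:])         # hash the most recent n tokens
        out.append(table[key])        # emit the stored successor
    return out

# The copy is exact whenever no n-gram occurs twice in the input, so n only
# needs to exceed the length of the longest repeated substring.
if __name__ == "__main__":
    seq = list("the quick brown fox jumps over the lazy dog")
    assert copy_sequence(seq, n=5) == seq  # "the " repeats, so n = 5 suffices

The dictionary here merely stands in for whatever lookup structure the model realizes internally; the sketch only shows why a table mapping n-grams to successors suffices to copy a sequence, and why the required table size grows with the input length.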

