This lecture discusses the application of transformers to embodied perception and robotics. It begins with the challenge of obtaining real-world data for specific robotic tasks, such as fetching objects, and highlights the use of simulators to generate task-specific data at scale, free of the constraints of real-world environments. The lecture covers various transformer architectures, including encoder-decoder models, and their effectiveness in learning from large datasets. Key topics include large-scale imitation learning, seamless sim-to-real transfer, and the use of transformers for humanoid locomotion. The instructor also explores decision transformers, which frame reinforcement learning as a sequence modeling problem, and introduces the concept of universal controllers that adapt to different robot morphologies. The lecture concludes with insights on the importance of data in training transformers and the engineering challenges involved in deploying these models effectively in real-world scenarios.
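To make the decision-transformer idea concrete, here is a minimal sketch of how a trajectory is turned into a token sequence for sequence modeling: returns-to-go are computed from the rewards, then interleaved with states and actions so a causal transformer can be trained to predict the next action. The function names and token layout are illustrative assumptions, not code from the lecture or any specific library.

```python
# Sketch: reinforcement learning as sequence modeling (decision-transformer style).
# A trajectory becomes the sequence (R_1, s_1, a_1, R_2, s_2, a_2, ...), where
# R_t is the return-to-go, i.e. the sum of rewards from step t onward.

def returns_to_go(rewards):
    """Suffix sums of the reward sequence: R_t = sum of r_t' for t' >= t."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return list(reversed(rtg))

def build_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) tokens for one trajectory."""
    tokens = []
    for R, s, a in zip(returns_to_go(rewards), states, actions):
        tokens.extend([("rtg", R), ("state", s), ("action", a)])
    return tokens

# Toy three-step trajectory; at inference time the desired total return
# conditions the model, so the first token carries the target return.
seq = build_sequence(states=[0, 1, 2], actions=["L", "R", "L"],
                     rewards=[1.0, 0.0, 2.0])
print(seq[0])  # ('rtg', 3.0)
```

Conditioning the policy on the desired return is what distinguishes this framing from plain behavior cloning: at test time one can ask for a high return-to-go and let the model generate actions consistent with it.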