This lecture explores the training and applications of Vision-Language-Action (VLA) models, focusing on large language models and their integration with robotic control. The presentation covers topics such as language to rewards for robotic skill synthesis, vision-language models (VLMs) as robot policies, and the transfer of web knowledge to robotic control. Experimental results are discussed, covering emergent skills, quantitative evaluations, and the performance of different models on language-based tasks. The lecture concludes with insights on representing actions in VLMs and future directions for research in this field.
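As a minimal sketch of one common way to represent actions in a VLM, the snippet below assumes an RT-2-style scheme in which each continuous action dimension is discretized into a fixed number of bins so the model can emit actions as ordinary tokens; the bin count, action bounds, and function names are illustrative assumptions, not taken from the lecture materials.

```python
# Illustrative sketch (assumptions, not the lecture's code): discretizing a
# continuous robot action into integer tokens so a language model can output
# it with its ordinary vocabulary, and mapping the tokens back to an action.
import numpy as np

NUM_BINS = 256                       # assumed discretization resolution
ACTION_LOW, ACTION_HIGH = -1.0, 1.0  # assumed normalized action range


def action_to_tokens(action: np.ndarray) -> list[int]:
    """Map each continuous action dimension to one of NUM_BINS integer tokens."""
    clipped = np.clip(action, ACTION_LOW, ACTION_HIGH)
    scaled = (clipped - ACTION_LOW) / (ACTION_HIGH - ACTION_LOW)  # -> [0, 1]
    return (scaled * (NUM_BINS - 1)).round().astype(int).tolist()


def tokens_to_action(tokens: list[int]) -> np.ndarray:
    """Invert the mapping: recover an approximate continuous action."""
    scaled = np.asarray(tokens, dtype=float) / (NUM_BINS - 1)
    return scaled * (ACTION_HIGH - ACTION_LOW) + ACTION_LOW


# Example: a 7-DoF action (6-D end-effector delta + gripper) becomes 7 tokens.
action = np.array([0.1, -0.3, 0.0, 0.05, 0.2, -0.1, 1.0])
tokens = action_to_tokens(action)
print(tokens)                   # seven integers in [0, 255]
print(tokens_to_action(tokens)) # approximate reconstruction of the action
```

In this framing, predicting a robot action reduces to next-token prediction, which is what lets a pretrained VLM be fine-tuned directly into a robot policy.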