This lecture explores how computer vision and machine learning combine to explain visual intelligence and its applications in robotics. It examines the relationship between vision and action, challenges traditional approaches in computer vision, and discusses the importance of cross-task consistency. The instructor presents perception tasks such as surface normal estimation, segmentation, and object classification, emphasizing the significance of perception in robotics. The lecture also covers multi-task learning, transfer learning, and incremental learning, and showcases the development of multi-modal multi-task masked autoencoders. Projects and tools such as Omnidata, Taskonomy, and the Gibson Environment are highlighted as essential resources in the field.