This lecture covers deep visual recognition and interpretability. It surveys the deep learning revolution, semantic segmentation, and deformable 3D reconstruction, then examines Bag of Words and Bag of Visual Words models, standard CNN architectures, and the construction of visual dictionaries. Experiments on datasets illustrate visual codewords and adversarial attack detection. For complex scenes, the instructor proposes ways to encode local features and generate attention maps. The lecture concludes with scene recognition, attention-aware pooling, and the importance of interpretability in deep networks.
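To make the Bag of Visual Words idea concrete, here is a minimal sketch of the generic textbook pipeline, not the lecture's own implementation: a visual dictionary is built by clustering local descriptors with k-means, and an image is then encoded as a normalized histogram of its nearest codewords. The function names (`build_codebook`, `bovw_histogram`, `nearest_codeword`) and the use of plain Python lists as descriptors are assumptions made for illustration; in practice the descriptors would come from a local feature extractor or CNN activations.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the given descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def build_codebook(descriptors, k, iterations=20, seed=0):
    """Hypothetical helper: plain k-means over local descriptors.

    Returns k centroids, which play the role of the visual dictionary.
    """
    rng = random.Random(seed)
    # Initialise centroids from k distinct training descriptors.
    codebook = [list(d) for d in rng.sample(descriptors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[nearest_codeword(d, codebook)].append(d)
        for idx, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                dim = len(members[0])
                codebook[idx] = [sum(m[j] for m in members) / len(members)
                                 for j in range(dim)]
    return codebook


def bovw_histogram(descriptors, codebook):
    """Encode one image as a normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)
```

The histogram discards where each descriptor occurred in the image, which is one reason complex scenes call for the local-feature encodings and attention maps mentioned above.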