Incorporating Projective Geometry into Deep Learning

Michal Jan Tyszkiewicz
2024
EPFL thesis

Abstract

In this thesis we explore the applications of projective geometry, a mathematical theory of the relation between 3D scenes and their 2D images, in modern learning-based computer vision systems. This direction runs counter to the recent trend of forgoing such domain knowledge in favor of learning everything directly from data. We show how to use this robust mathematics where applicable while maximally leveraging data for the remaining aspects.

The thesis extends three peer-reviewed papers. In the first, we introduce an algorithm to extract local image features, a technique for matching related regions across images. Unlike in standard supervised learning, we do not define the features through examples but rather through their desired properties, and we leave it to the training procedure to find a conforming algorithm. This shows an application of projective geometry to the supervision of neural networks.

We then turn to two cases of using projective geometry in the network architecture. In one, we present a method to deduce indoor scene layouts from video walkthroughs. We constrain the Transformer, a computationally intensive, task-agnostic learning system, with the relevant geometry, which significantly reduces its processing time and improves its memory efficiency. In the last paper, we address the challenge of reversing the 3D-to-2D projection in a generative setting. By offering multiple potential 3D reconstructions based on a 2D view, we acknowledge the inherent uncertainties of this inversion.

Each chapter provides a thorough review of existing literature and outlines potential avenues for future research in the domain.

Official source

https://infoscience.epfl.ch/record/307081?ln=en


