Publication

Incorporating Projective Geometry into Deep Learning

Michal Jan Tyszkiewicz
2024
EPFL thesis
Abstract

In this thesis we explore the applications of projective geometry, a mathematical theory of the relation between 3D scenes and their 2D images, in modern learning-based computer vision systems. This is an interesting research question which contradicts the recent trend to forgo such domain knowledge in favor of learning everything directly from data. We show how to use these robust mathematics where applicable while maximally leveraging data for the remaining aspects.

The thesis extends three peer-reviewed papers. In the first, we introduce an algorithm to extract local image features, a technique of matching related regions across images. Unlike in standard supervised learning, we do not define the features through examples but rather their desired properties. We leave it to the training procedure to find a conforming algorithm. This shows an application of projective geometry for supervision of neural networks. We then turn to two cases of using projective geometry in the network architecture. In one, we present a method to deduce indoor scene layouts from video walkthroughs. We constrain the Transformer, a computationally intensive task-agnostic learning system, by using relevant geometry to significantly reduce its processing time and enhance memory efficiency. In the last paper, we address the challenge of reversing the 3D-to-2D projection in a generative setting. By offering multiple potential 3D reconstructions based on a 2D view, we acknowledge the inherent uncertainties of this inversion. Each chapter provides a thorough review of existing literature and outlines potential avenues for future research in the domain.

Related concepts (34)
Deep learning
Deep learning is part of a broader family of machine learning methods, which is based on artificial neural networks with representation learning. The adjective "deep" in deep learning refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.
Projective geometry
In mathematics, projective geometry is the study of geometric properties that are invariant with respect to projective transformations. This means that, compared to elementary Euclidean geometry, projective geometry has a different setting, projective space, and a selective set of basic geometric concepts. The basic intuitions are that projective space has more points than Euclidean space, for a given dimension, and that geometric transformations are permitted that transform the extra points (called "points at infinity") to Euclidean points, and vice-versa.
Machine learning
Machine learning (ML) is an umbrella term for solving problems for which development of algorithms by human programmers would be cost-prohibitive, and instead the problems are solved by helping machines 'discover' their 'own' algorithms, without needing to be explicitly told what to do by any human-developed algorithms. Recently, generative artificial neural networks have been able to surpass results of many previous approaches.
Related publications (48)

HYPERBOLA METHOD ON TORIC VARIETIES

Marta Pieropan

We develop a very general version of the hyperbola method which extends the known method by Blomer and Brüdern for products of projective spaces to complete smooth split toric varieties. We use it to count Campana points of bounded log-anticanonical height ...
Palaiseau, 2024

Breaking the Curse of Dimensionality in Deep Neural Networks by Learning Invariant Representations

Leonardo Petrini

Artificial intelligence, particularly the subfield of machine learning, has seen a paradigm shift towards data-driven models that learn from and adapt to data. This has resulted in unprecedented advancements in various domains such as natural language proc ...
EPFL, 2023

Deep Learning for 3D Surface Modelling and Reconstruction

Benoît Alain René Guillard

In recent years, there has been a significant revolution in the field of deep learning, which has demonstrated its effectiveness in automatically capturing intricate patterns from large datasets. However, the majority of these successes in Computer Vision ...
EPFL, 2023
Related MOOCs (32)
Neuronal Dynamics - Computational Neuroscience of Single Neurons
The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.
Neuronal Dynamics 2- Computational Neuroscience: Neuronal Dynamics of Cognition
This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.