Publication

SynDeMo: Synergistic Deep Feature Alignment for Joint Learning of Depth and Ego-Motion

Related concepts (32)

Supervised learning

Supervised learning (SL) is a paradigm in machine learning where input objects (for example, a vector of predictor variables) and a desired output value (also known as human-labeled supervisory signal) train a model. The training data is processed, building a function that maps new data on expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way (see inductive bias).

Convolutional neural network

A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself through filter (or kernel) optimization. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, each neuron in a fully connected layer would require 10,000 weights to process an image of 100 × 100 pixels.
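The weight count in the example above can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are hypothetical, and the 5 × 5 kernel size is an assumption chosen for comparison, not a value given in the text.

```python
def dense_weights_per_neuron(height, width, channels=1):
    """Weights one fully connected neuron needs to see every input pixel."""
    return height * width * channels


def conv_weights_per_filter(kernel_h, kernel_w, in_channels=1):
    """Weights in one convolutional filter, shared across all image positions."""
    return kernel_h * kernel_w * in_channels


# A 100 x 100 single-channel image, as in the text.
dense = dense_weights_per_neuron(100, 100)

# Assumed 5 x 5 kernel (illustrative only).
conv = conv_weights_per_filter(5, 5)

print(dense)  # 10000
print(conv)   # 25
```

Because the filter is reused at every image position, its weight count does not grow with the image size, which is where the "fewer connections" come from.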

Labeled data

Labeled data is a group of samples that have been tagged with one or more labels. Labeling typically takes a set of unlabeled data and augments each piece of it with informative tags. For example, a data label might indicate whether a photo contains a horse or a cow, which words were uttered in an audio recording, what type of action is being performed in a video, what the topic of a news article is, what the overall sentiment of a tweet is, or whether a dot in an X-ray is a tumor.

Connected-component labeling

Connected-component labeling (CCL), connected-component analysis (CCA), blob extraction, region labeling, blob discovery, or region extraction is an algorithmic application of graph theory, where subsets of connected components are uniquely labeled based on a given heuristic. Connected-component labeling is not to be confused with image segmentation. Connected-component labeling is used in computer vision to detect connected regions in binary digital images, although color images and data with higher dimensionality can also be processed.
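As an illustration of the idea, the following is a minimal sketch of connected-component labeling on a small binary grid. It assumes 4-connectivity (pixels touch only horizontally or vertically) as the heuristic and uses a breadth-first flood fill; the function name `label_components` and the sample grid are hypothetical, and other labeling strategies such as two-pass algorithms exist.

```python
from collections import deque


def label_components(grid):
    """Label 4-connected foreground regions of a binary grid.

    Returns (labels, count): labels has the same shape as grid, with 0 for
    background and 1..count for each connected region.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: start a new region.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


image = [
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
]
labels, count = label_components(image)
print(count)  # 3
```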

Image rectification

Image rectification is a transformation process used to project images onto a common image plane. This process has several degrees of freedom and there are many strategies for transforming images to the common plane. Image rectification is used in computer stereo vision to simplify the problem of finding matching points between images (i.e. the correspondence problem), and in geographic information systems to merge images taken from multiple perspectives into a common map coordinate system.

Multistable perception

Multistable perception (or bistable perception) is a perceptual phenomenon in which an observer experiences an unpredictable sequence of spontaneous subjective changes. While usually associated with visual perception (a form of optical illusion), multistable perception can also be experienced with auditory and olfactory percepts. Perceptual multistability can be evoked by visual patterns that are too ambiguous for the human visual system to definitively and uniquely interpret.

Vergence-accommodation conflict

Vergence-accommodation conflict (VAC), also known as accommodation-vergence conflict, is a visual phenomenon that occurs when the brain receives mismatching cues between vergence and accommodation of the eye. This commonly occurs in virtual reality devices, augmented reality devices, 3D movies, and other types of stereoscopic displays and autostereoscopic displays. The effect can be unpleasant and cause eye strain. Two main ocular responses can be distinguished: vergence of the eyes and accommodation.

Imaging radar

Imaging radar is an application of radar which is used to create two-dimensional images, typically of landscapes. Imaging radar provides its own illumination of an area on the ground and takes a picture at radio wavelengths. It uses an antenna and digital computer storage to record its images. In a radar image, one can see only the energy that was reflected back towards the radar antenna. The radar moves along a flight path and the area illuminated by the radar, or footprint, is moved along the surface in a swath, building the image as it does so.

Head-mounted display

A head-mounted display (HMD) is a display device, worn on the head or as part of a helmet (see Helmet-mounted display for aviation applications), that has a small display optic in front of one (monocular HMD) or each eye (binocular HMD). An HMD has many uses including gaming, aviation, engineering, and medicine. Virtual reality headsets are HMDs combined with IMUs. There is also an optical head-mounted display (OHMD), which is a wearable display that can reflect projected images and allows a user to see through it.

Depth-first search

Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node (selecting some arbitrary node as the root node in the case of a graph) and explores as far as possible along each branch before backtracking. Extra memory, usually a stack, is needed to keep track of the nodes discovered so far along a given branch, which makes backtracking possible. A version of depth-first search was investigated in the 19th century by French mathematician Charles Pierre Trémaux as a strategy for solving mazes.
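The traversal described above can be sketched with an explicit stack. This is a minimal illustration, assuming the graph is given as an adjacency-list dictionary; the function name `depth_first_order` and the sample graph are hypothetical.

```python
def depth_first_order(graph, root):
    """Return the nodes reachable from root in depth-first visiting order."""
    visited = set()
    order = []
    stack = [root]  # nodes discovered but not yet explored
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        # Push neighbors in reverse so the first-listed neighbor is explored first.
        for neighbor in reversed(graph.get(node, [])):
            if neighbor not in visited:
                stack.append(neighbor)
    return order


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
order = depth_first_order(graph, "A")
print(order)  # ['A', 'B', 'D', 'C']
```

The branch through B is followed all the way to D before the search backtracks to C, which is the "as far as possible along each branch" behavior.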