
Publication

# Progressive Correspondence Pruning by Consensus Learning

Abstract

Correspondence pruning aims to correctly remove false matches (outliers) from an initial set of putative correspondences. The pruning process is challenging because putative matches are typically extremely unbalanced and largely dominated by outliers, and the random distribution of these outliers further complicates the learning process for learning-based methods. To address this issue, we propose to progressively prune the correspondences via a local-to-global consensus learning procedure. We introduce a "pruning" block that identifies reliable candidates among the initial matches according to consensus scores estimated using local-to-global dynamic graphs. We then achieve progressive pruning by stacking multiple pruning blocks sequentially. Our method outperforms the state of the art on robust line fitting, camera pose estimation, and retrieval-based image localization benchmarks by significant margins, and shows promising generalization to different datasets and detector/descriptor combinations.
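The paper's network architecture is not reproduced here, but the overall idea of pruning by local consensus can be illustrated with a deliberately simplified sketch: score each putative match by how well its displacement vector agrees with its nearest neighbours, keep the best-scoring fraction, and stack several such rounds. The neighbourhood size, the scoring rule, and the keep ratio below are all illustrative choices, not the paper's learned components.

```python
import math

def consensus_scores(matches, k=3):
    """Score each putative match (p, q) by how well its displacement
    vector q - p agrees with those of its k nearest neighbours
    (a crude, hand-crafted stand-in for a learned consensus score)."""
    scores = []
    for i, (p, q) in enumerate(matches):
        d = (q[0] - p[0], q[1] - p[1])  # displacement of this match
        # neighbours ranked by distance between source keypoints
        neighbours = sorted(
            (math.dist(p, p2), (q2[0] - p2[0], q2[1] - p2[1]))
            for j, (p2, q2) in enumerate(matches) if j != i
        )[:k]
        # consensus = negative mean deviation from neighbour displacements
        dev = sum(math.dist(d, d2) for _, d2 in neighbours) / k
        scores.append(-dev)
    return scores

def progressive_prune(matches, rounds=2, keep=0.5):
    """Stack several pruning rounds, each keeping the top `keep` fraction."""
    for _ in range(rounds):
        scores = consensus_scores(matches)
        n_keep = max(1, int(len(matches) * keep))
        order = sorted(range(len(matches)), key=lambda i: scores[i], reverse=True)
        matches = [matches[i] for i in order[:n_keep]]
    return matches
```

Even this toy version shows why stacking helps: early rounds only need to be conservative, and later rounds score candidates against an increasingly inlier-dominated set.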



In recent years, machine-learning-based computer vision techniques have made impressive progress. These algorithms have proved particularly efficient for image classification and the detection of isolated objects. From a probabilistic perspective, such methods can predict marginals over single or multiple variables independently, with high accuracy.
However, in many tasks of practical interest, we need to jointly predict several correlated variables.
Practical applications include people detection in crowded scenes, image segmentation, surface reconstruction, 3D pose estimation, and others. A large part of the research effort in today's computer vision community aims at finding task-specific solutions to these problems while leveraging the power of deep-learning-based classifiers. In this thesis, we present our journey towards a generic and practical solution based on mean-field (MF) inference.
Mean-field is a statistical-physics-inspired method that has long been used in computer vision as a variational approximation to posterior distributions over complex Conditional Random Fields. Standard mean-field optimization is based on coordinate descent and in many situations can be impractical. We therefore propose a novel proximal gradient-based approach to optimizing the variational objective. It is naturally parallelizable and easy to implement. We prove its convergence, and then demonstrate that, in practice, it yields faster convergence and often finds better optima than more traditional mean-field optimization techniques.
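As a point of reference for the coordinate-descent baseline discussed above, here is a minimal mean-field sketch for a small chain MRF: each node's fully factorized marginal is updated in turn from its neighbours' current marginals. The chain model, the shared pairwise potential, and the fixed iteration count are illustrative choices, not the thesis's experimental setup.

```python
import math

def mean_field_chain(unary, pairwise, iters=50):
    """Coordinate-descent mean-field for a chain MRF.
    unary[i][s]:    log-potential of node i in state s.
    pairwise[s][t]: log-potential shared by all edges (i, i+1)."""
    n, k = len(unary), len(unary[0])
    q = [[1.0 / k] * k for _ in range(n)]  # fully factorized, uniform init
    for _ in range(iters):
        for i in range(n):  # coordinate descent: update one q_i at a time
            logits = []
            for s in range(k):
                e = unary[i][s]
                # expected pairwise contribution from each chain neighbour
                if i > 0:
                    e += sum(q[i-1][t] * pairwise[t][s] for t in range(k))
                if i < n - 1:
                    e += sum(q[i+1][t] * pairwise[s][t] for t in range(k))
                logits.append(e)
            m = max(logits)  # stabilized softmax
            z = [math.exp(l - m) for l in logits]
            tot = sum(z)
            q[i] = [v / tot for v in z]
    return q
```

Because each node update must wait for its neighbours' latest marginals, the sequential sweep above is exactly the kind of dependency that a parallelizable proximal-gradient scheme avoids.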
Then, we show that we can replace the fully factorized distribution of mean-field with a weighted mixture of such distributions that similarly minimizes the KL divergence to the true posterior. Our extension of the clamping method proposed in previous works allows us both to produce a more descriptive approximation of the true posterior and, inspired by the diverse-MAP paradigm, to fit a mixture of mean-field approximations. We demonstrate that this positively impacts real-world algorithms that initially relied on mean-field inference.
One of the important properties of mean-field inference algorithms is that the closed-form updates are fully differentiable operations. This naturally allows parameter learning by simply unrolling multiple iterations of the updates, the so-called back-mean-field algorithm. We derive a novel and efficient structured learning method for multi-modal posterior distributions based on the Multi-Modal Mean-Field approximation, which can be seamlessly combined with modern gradient-based learning methods such as CNNs.
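The unrolling idea can be sketched on a toy two-node, two-state model: because each mean-field update is a smooth softmax, a loss on the final marginals is differentiable with respect to the model parameters. Here differentiability is checked with a finite-difference gradient; a real implementation would backpropagate through the unrolled updates with automatic differentiation. The model, loss, and parameterization are hypothetical.

```python
import math

def unrolled_mf(theta, unary, iters=5):
    """Unroll a fixed number of mean-field updates on a 2-node, 2-state
    model whose pairwise attraction is the scalar parameter theta.
    Every step is a smooth function of theta."""
    q = [[0.5, 0.5], [0.5, 0.5]]
    pair = [[theta, 0.0], [0.0, theta]]  # attractive Potts-style coupling
    for _ in range(iters):
        for i in (0, 1):
            j = 1 - i
            logits = [unary[i][s] + sum(q[j][t] * pair[s][t] for t in (0, 1))
                      for s in (0, 1)]
            m = max(logits)
            z = [math.exp(l - m) for l in logits]
            tot = sum(z)
            q[i] = [v / tot for v in z]
    return q

def loss(theta, unary, target):
    """Squared error between unrolled marginals and target marginals."""
    q = unrolled_mf(theta, unary)
    return sum((q[i][s] - target[i][s]) ** 2 for i in (0, 1) for s in (0, 1))

def grad(theta, unary, target, eps=1e-5):
    """Finite-difference gradient: well-defined because the unrolled
    update chain is smooth in theta."""
    return (loss(theta + eps, unary, target)
            - loss(theta - eps, unary, target)) / (2 * eps)
```

With node 0 biased toward state 0 and a target that wants both nodes in state 0, the gradient at zero coupling is negative: increasing the attraction propagates node 0's preference to node 1 and lowers the loss, which is exactly what a gradient step would exploit.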
Finally, we explore in more detail the specific problem of structured learning and prediction for multiple-people detection in crowded scenes, and present a mean-field-based structured deep-learning detection algorithm that provides state-of-the-art results on this task.

Many robotics problems are formulated as optimization problems. However, most optimization solvers in robotics are only locally optimal, and their performance depends heavily on the initial guess. For challenging problems, the solver will often get stuck at poor local optima without a good initialization. In this thesis, we consider various techniques for providing a good initial guess to the solver based on previous experience; we use the term memory of motion to refer to these techniques collectively. The key idea is to use the existing system models, cost functions, and simulation tools to generate a database of solutions, and then construct a memory-of-motion model from it. During online execution, we can then query the memory of motion for an initial guess for a given task. We show that this improves solver performance in terms of solution quality, success rate, and computation time. We consider two different formulations: supervised learning and probability density estimation.

In the first part, we formulate a regression problem to find the mapping between task parameters and solutions. Such a formulation is convenient, as many function approximators are available, but using them as black-box tools may result in poor predictions. This is especially the case for multimodal problems, where there can be several different solutions for a given task and standard function approximators will simply average the different modes. We first propose an ensemble of function approximators that can handle multimodal problems to initialize an optimization-based motion planner. We then investigate the problem of initializing an optimal control solver for legged-robot locomotion, where we also need to provide an initial guess for the control sequence, and we evaluate the effect of the different initialization components on the solver's performance.
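The mode-averaging failure and the ensemble remedy can be illustrated with a toy one-dimensional sketch: training solutions come from two symmetric modes, so a single regressor would average them to a useless value in between, while a per-mode ensemble keeps both predictions and lets the task cost pick the better candidate. The two-center clustering and per-mode linear models below are illustrative simplifications, not the thesis's approximators.

```python
def fit_ensemble(tasks, solutions):
    """Cluster 1-D solutions into two modes (crude k-means) and fit one
    linear least-squares predictor s = a*t + b per mode."""
    centers = [min(solutions), max(solutions)]
    for _ in range(20):
        groups = [[] for _ in centers]
        for t, s in zip(tasks, solutions):
            j = min(range(len(centers)), key=lambda j: abs(s - centers[j]))
            groups[j].append((t, s))
        centers = [sum(s for _, s in g) / len(g) for g in groups if g]
    models = []
    for g in groups:
        if not g:
            continue
        ts = [t for t, _ in g]
        ss = [s for _, s in g]
        tm, sm = sum(ts) / len(ts), sum(ss) / len(ss)
        var = sum((t - tm) ** 2 for t in ts) or 1e-9
        a = sum((t - tm) * (s - sm) for t, s in zip(ts, ss)) / var
        models.append((a, sm - a * tm))
    return models

def predict(models, t, cost):
    """Query every per-mode model and return the candidate the task
    cost prefers -- instead of averaging across modes."""
    return min((a * t + b for a, b in models), key=cost)
```

With solutions drawn from the modes s = t and s = -t, a single averaged predictor would return roughly 0 for any task, while the ensemble recovers whichever mode the cost function favors.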
In the second part, we consider another formulation: we first transform the cost function into an unnormalized probability density function (PDF) and approximate it using various models. This formulation addresses several shortcomings of the supervised learning approaches by using the cost function itself to train or construct the predictive model. It allows us to generate initial guesses that have a high probability of having low cost, instead of simply imitating the dataset. We first show that we can obtain the trajectory distribution of an iLQR problem as a Gaussian distribution, and that tracking this distribution results in a cost-efficient and robust controller. We then propose a generative adversarial framework to learn the distribution of robot configurations under constraints. Finally, we use tensor methods to approximate the unnormalized PDF. Since this method does not rely on gradient information, it is quite robust in finding the (possibly multiple) global optima, or at least good local optima, of various challenging problems, including benchmark optimization functions, inverse kinematics, and motion planning.
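The cost-to-density transformation can be sketched as follows: treat exp(-cost(x)/temperature) as an unnormalized PDF and return the highest-density candidate among uniformly drawn samples as the initial guess. This simple sampling search stands in for the learned models (Gaussian, GAN, tensor approximation) used in the thesis; the function name, bounds, and temperature parameter are illustrative.

```python
import math
import random

def initial_guess_from_cost(cost, bounds, n_samples=2000, temp=1.0, seed=0):
    """Draw candidates uniformly over `bounds` and keep the one with the
    highest unnormalized density exp(-cost(x)/temp). Low temperature
    concentrates the density around the cost minima."""
    rng = random.Random(seed)
    lo, hi = bounds
    best_x, best_w = None, -1.0
    for _ in range(n_samples):
        x = rng.uniform(lo, hi)
        w = math.exp(-cost(x) / temp)  # unnormalized density of x
        if w > best_w:
            best_x, best_w = x, w
    return best_x
```

Because the density is derived from the cost itself rather than from a dataset of past solutions, the returned guess favors low-cost regions even where the training data would have suggested otherwise; gradient information is never needed.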