# Trust-region methods based on radial basis functions with application to biomedical imaging

Abstract

We have developed a new derivative-free algorithm based on radial basis functions (RBFs). Derivative-free optimization is an active field of research, and several algorithms have been proposed recently. Problems of this nature are frequent in industrial settings, because in many applications the optimization process involves simulation packages that are treated as black boxes. The development of our algorithm was originally motivated by an application in biomedical imaging: the medical image registration problem. The particular characteristics of this problem led us to develop a new optimization algorithm based on trust-region methods; however, it is designed to be generic and applicable to a wide range of problems. The main originality of our approach is the use of RBFs to build the models. In particular, we have adapted the existing theory based on quadratic models to our own models and developed new procedures specifically designed for RBF-based models. We have tested our algorithm, called BOOSTERS, against state-of-the-art methods (UOBYQA, NEWUOA, DFO). On the medical image registration problem, BOOSTERS appears to be the method of choice. Tests on problems from the CUTEr collection show that BOOSTERS is comparable to, but not better than, other methods on small problems (dimension 2-20). It performs very well on medium-size problems (dimension 20-80). Moreover, it is able to solve problems of dimension 200, which is considered very large in derivative-free optimization.

We have also developed a new class of algorithms combining the robustness of derivative-free algorithms with the faster rate of convergence characteristic of Newton-like methods. They define a new class of algorithms lying between derivative-free optimization and quasi-Newton methods. These algorithms are built on the skeleton of our derivative-free algorithm, but they can incorporate the gradient when it is available. They can be interpreted as a way of doping derivative-free algorithms with derivatives. If the derivatives are available at every iteration, our method can be seen as an alternative to quasi-Newton methods. Conversely, if the derivatives are never evaluated, the algorithm reduces to BOOSTERS. It is a very attractive alternative to existing methods for problems whose objective function is expensive to evaluate and whose derivatives are unavailable. In this situation, the gradient can be approximated by finite differences, at a cost of n additional function evaluations when the objective function is defined on R^n. We have compared our method with CFSQP and BTRA, two gradient-based algorithms, and the results show that our doped method performs best.

We also present a theoretical analysis of the medical image registration problem based on maximization of mutual information. Most current research in this field concentrates on registration based on nonlinear image transformations, but little attention has been paid to the theoretical properties of the underlying optimization problem. In our analysis, we focus on the continuity and differentiability of the objective function. We show in particular that performing a registration without extending the reference image may lead to discontinuities in the objective function, but we demonstrate that, under mild assumptions, the function is differentiable almost everywhere. This analysis is important from an optimization point of view and informs the choice of a solver. The usual practice is to use generic optimization packages without worrying about the differentiability of the objective function, yet applying gradient-based methods to a non-differentiable objective may result in poor performance or even failure to converge. With this analysis, we also aim to make practitioners aware of these issues and to offer them new algorithms of potential interest for their applications.
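The thesis text above does not include an implementation, but its two core ingredients — an interpolating RBF model for the trust-region step, and the finite-difference gradient used to "dope" the method — can be sketched as follows. This is a minimal illustration, not BOOSTERS itself: the Gaussian kernel, the absence of a polynomial tail, and the forward-difference step size are our own simplifying assumptions.

```python
import numpy as np

def rbf_model(points, values, phi=lambda r: np.exp(-r**2)):
    """Build an interpolating RBF model s(x) = sum_i w_i * phi(||x - x_i||).

    `points` is an (m, n) array of sample sites and `values` the objective
    values at those sites.  A Gaussian kernel keeps the interpolation
    matrix positive definite for distinct sites; the kernel and polynomial
    tail actually used in BOOSTERS are not reproduced here.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise distances between sample sites -> interpolation matrix.
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    weights = np.linalg.solve(phi(dists), np.asarray(values, dtype=float))
    return lambda x: phi(
        np.linalg.norm(points - np.asarray(x, dtype=float), axis=-1)
    ) @ weights

def fd_gradient(f, x, h=1e-6):
    """Forward-difference gradient: exactly n extra evaluations of f
    for x in R^n, as noted in the abstract."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    grad = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - fx) / h
    return grad
```

In a trust-region iteration, a model like `rbf_model` would be rebuilt from the current interpolation set and minimised inside the trust region, while `fd_gradient` corresponds to the doping step used when derivatives are worth their n-evaluation cost.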


Related concepts (22)

Problem solving

Problem solving is the process of identifying and then implementing a solution to a problem.

Algorithm

An algorithm is a finite, unambiguous sequence of instructions and operations for solving a class of …

Optimization (mathematics)

Optimization is a branch of mathematics that seeks to model, analyse and solve, analytically or numerically, problems that consist of minimising or maximising a function over …

Related publications (130)


In recent years, machine-learning-based computer vision techniques have made impressive progress. These algorithms have proved particularly efficient for image classification and the detection of isolated objects. From a probabilistic perspective, these methods can predict marginals, over single or multiple variables, independently and with high accuracy. However, in many tasks of practical interest, we need to predict several correlated variables jointly. Practical applications include people detection in crowded scenes, image segmentation, surface reconstruction, 3D pose estimation, and others. A large part of the research effort in today's computer-vision community aims at finding task-specific solutions to these problems while leveraging the power of deep-learning-based classifiers. In this thesis, we present our journey towards a generic and practical solution based on mean-field (MF) inference.

Mean-field is a statistical-physics-inspired method which has long been used in computer vision as a variational approximation to posterior distributions over complex conditional random fields. Standard mean-field optimization is based on coordinate descent and in many situations can be impractical. We therefore propose a novel proximal-gradient-based approach to optimizing the variational objective. It is naturally parallelizable and easy to implement. We prove its convergence, and then demonstrate that, in practice, it yields faster convergence and often finds better optima than more traditional mean-field optimization techniques.

Then, we show that we can replace the fully factorized distribution of mean-field by a weighted mixture of such distributions that similarly minimizes the KL-divergence to the true posterior. Our extension of the clamping method proposed in previous works allows us both to produce a more descriptive approximation of the true posterior and, inspired by the diverse-MAP paradigm, to fit a mixture of mean-field approximations. We demonstrate that this positively impacts real-world algorithms that initially relied on mean-fields.

One of the important properties of mean-field inference algorithms is that the closed-form updates are fully differentiable operations. This naturally allows parameter learning by simply unrolling multiple iterations of the updates, the so-called back-mean-field algorithm. We derive a novel and efficient structured learning method for multi-modal posterior distributions based on the multi-modal mean-field approximation, which can be seamlessly combined with modern gradient-based learning methods such as CNNs.

Finally, we explore in more detail the specific problem of structured learning and prediction for multi-person detection in crowded scenes. We present a mean-field-based structured deep-learning detection algorithm that provides state-of-the-art results on this task.
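As a concrete reference point, the classical coordinate-descent mean-field baseline that this thesis improves upon can be sketched for a binary pairwise model. This is an illustrative toy with our own model and notation, not the thesis code; the thesis replaces the sequential sweep below with a parallel proximal-gradient scheme.

```python
import numpy as np

def mean_field_binary(theta, W, iters=50):
    """Coordinate-descent mean-field for a binary pairwise MRF
    p(x) ∝ exp(theta·x + 0.5 x'Wx), x in {0,1}^n, with W symmetric
    and zero on the diagonal.  Returns the factorized marginals
    q_i = q(x_i = 1).  Each inner update is the closed-form coordinate
    optimum of the KL divergence to the true posterior."""
    q = np.full(len(theta), 0.5)
    for _ in range(iters):
        for i in range(len(theta)):   # sequential sweep: q_i uses the
            # freshly updated q_j of its neighbours
            q[i] = 1.0 / (1.0 + np.exp(-(theta[i] + W[i] @ q)))
    return q
```

The sequential dependence between coordinates is exactly what makes this baseline hard to parallelize, which motivates the proximal-gradient formulation described in the abstract.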

Nicolas Boumal, Christopher Arnold Criscitiello

We describe the first gradient methods on Riemannian manifolds to achieve accelerated rates in the non-convex case. Under Lipschitz assumptions on the Riemannian gradient and Hessian of the cost function, these methods find approximate first-order critical points faster than regular gradient descent. A randomized version also finds approximate second-order critical points. Both the algorithms and their analyses build extensively on existing work in the Euclidean case. The basic operation consists in running the Euclidean accelerated gradient descent method (appropriately safeguarded against non-convexity) in the current tangent space, then moving back to the manifold and repeating. This requires lifting the cost function from the manifold to the tangent space, which can be done for example through the Riemannian exponential map. For this approach to succeed, the lifted cost function (called the pullback) must retain certain Lipschitz properties. As a contribution of independent interest, we prove precise claims to that effect, with explicit constants. Those claims are affected by the Riemannian curvature of the manifold, which in turn affects the worst-case complexity bounds for our optimization algorithms.

This thesis presents the development of a new multi-objective optimisation tool and applies it to a number of industrial problems related to optimising energy systems. Multi-objective optimisation techniques provide the information needed for detailed analyses of design trade-offs between conflicting objectives. For example, if a product must be both inexpensive and high quality, the multi-objective optimiser will provide a range of optimal options from the cheapest (but lowest quality) alternative to the highest quality (but most expensive), and a range of designs in between – those that are the most interesting to the decision-maker. The optimisation tool developed is the queueing multi-objective optimiser (QMOO), an evolutionary algorithm (EA). EAs are particularly suited to multi-objective optimisation because they work with a population of potential solutions, each representing a different trade-off between objectives. EAs are well suited to energy system optimisation because problems in that domain are often non-linear, discontinuous, disjoint, and multi-modal. These features make energy system optimisation problems difficult to solve with other optimisation techniques. QMOO has several features that improve its performance on energy systems problems – features that are applicable to a wide range of optimisation problems. QMOO uses cluster analysis techniques to identify separate local optima simultaneously. This technique preserves diversity and helps convergence to difficult-to-find optima. Once normal dominance relations no longer discriminate sufficiently between population members, certain individuals are chosen and removed from the population. Careful choice of the individuals to be removed ensures that convergence continues throughout the optimisation. Preserving the "tail regions" of the population helps the algorithm to explore the full extent of the problem's optimal regions.
QMOO is applied to a number of problems: coke factory placement in Shanxi Province, China; choice of heat recovery system operating temperatures; design of heat-exchanger networks; hybrid vehicle configuration; district heating network design, and others. Several of the problems were optimised previously using single-objective EAs. QMOO proved capable of finding entire ranges of solutions faster than the earlier methods found a single solution. In most cases, QMOO successfully optimises the problems without requiring any specific tuning to each problem. QMOO is also tested on a number of test problems found in the literature. QMOO's techniques for improving convergence prove effective on these problems, and its non-tuned performance is excellent compared to other algorithms found in the literature.
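The dominance relation at the heart of any multi-objective EA, including QMOO, can be stated compactly. The clustering and tail-preservation heuristics described above are not reproduced here; this sketch shows only Pareto dominance and the extraction of the non-dominated front, assuming a minimisation convention.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimisation):
    a is no worse in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(pop):
    """Return the non-dominated front of a population of objective
    vectors.  A member survives if no other member dominates it."""
    return [p for p in pop if not any(dominates(q, p) for q in pop)]
```

The surviving front is exactly the "range of optimal options" mentioned above: each member is a trade-off no other design improves on in every objective at once.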