Programming intelligent robots requires robust controllers that can achieve desired tasks while adapting to changes in the task and the environment. In this thesis, we address the challenges of designing such adaptive and anticipatory feedback controllers for robot manipulation tasks from the perspective of two main approaches: optimization and learning. Optimization methods determine a feedback or feedforward controller that achieves the task by using a model of the task and of its dynamics; an optimization expert is often required for tuning, modeling, and solver selection. Learning from demonstration (LfD), on the other hand, is an intuitive way of programming robots by showing them demonstrations of how to achieve a task. It requires an expert who can demonstrate the task, without the need for coding or modeling.

Many existing solvers for optimization problems in robotics are not easily adaptable, are difficult to implement, and/or require considerable tuning effort. This hinders their wide adoption, benchmarking, and potential future contributions to the solvers, as well as their direct real-time application. Furthermore, they do not fully exploit the geometric structures that we often have in robotic tasks. In the first part of the thesis, we address these challenges by proposing a projection-based first-order optimization solver for robotics problems with geometric constraints. We show that Euclidean projections onto the manifolds defined by these geometric shapes can significantly improve performance, even when compared to second-order methods.

The adaptive behavior of the feedback control gain matrices found by optimal control is under-exploited in robotics, even though they are known to contain important local information about the task dynamics. We extend the system level synthesis (SLS) framework to build novel capabilities on top of these gains. In particular, we show that this anticipatory feedback controller with memory can remember and act on past states, which is crucial for tasks with time correlations. We further exploit their capabilities in the real-time adaptation of task parameters, using local information from the optimization, and in hierarchical optimal control, using the redundancies at the planning level.

In the second part of the thesis, we address the challenges of modeling for optimization by turning our attention to learning methods. Several open questions are discussed in this thesis: 1. What to model (or what to learn)? 2. How to execute these models on the real robot? 3. How to demonstrate? 4. How many times to demonstrate? We propose two different ways of exploiting demonstrations for designing feedback controllers. The first method uses demonstrations to warm-start and guide an optimal control problem for planar pushing, which can otherwise get stuck in local optima. The second method proposes an adaptive impedance controller that mimics the generalization and multimodality capabilities of a learned trajectory policy. We then investigate the epistemic uncertainties in such policies and provide an active learning method to refine them iteratively.
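To give a rough flavor of the projection-based first-order idea mentioned in the first part, the minimal sketch below runs projected gradient descent on a quadratic reaching cost with a spherical workspace constraint. The cost, the ball constraint, and all names are illustrative assumptions for this sketch; they are not the solver or the geometric constraints developed in the thesis.

```python
# Minimal sketch of a projection-based first-order step: take a gradient step
# on the cost, then project back onto a simple geometric set (here, a ball).
# Illustrative only; not the solver proposed in the thesis.
import numpy as np

def project_to_ball(x, center, radius):
    """Euclidean projection onto the ball {x : ||x - center|| <= radius}."""
    d = x - center
    norm = np.linalg.norm(d)
    return x if norm <= radius else center + radius * d / norm

def projected_gradient(grad, project, x0, step=0.1, iters=200):
    """Generic projected gradient descent: gradient step, then projection."""
    x = x0.copy()
    for _ in range(iters):
        x = project(x - step * grad(x))
    return x

# Hypothetical task: reach a goal point while staying inside a unit ball.
goal = np.array([1.0, 1.0, 0.5])
grad = lambda x: x - goal                       # gradient of 0.5 * ||x - goal||^2
project = lambda x: project_to_ball(x, np.zeros(3), 1.0)
x_star = projected_gradient(grad, project, x0=np.zeros(3))
print(x_star)  # ends on the ball boundary, as close to the goal as allowed
```

The projection here has a closed form, which is what makes a first-order scheme of this kind cheap per iteration; richer geometric shapes with closed-form or efficiently computable projections follow the same pattern.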
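For the feedback gain matrices discussed above, the sketch below computes the time-varying gains of a standard finite-horizon discrete-time LQR problem via a backward Riccati recursion, simply to show where such gains come from and how they change along the horizon. This is textbook LQR under an assumed double-integrator model and cost weights, not the SLS-based controller contributed in the thesis.

```python
# Minimal sketch: time-varying feedback gains K_t from finite-horizon LQR.
# The variation of K_t along the horizon is the kind of local, adaptive
# information referred to above. Illustrative only; not the SLS extension.
import numpy as np

def finite_horizon_lqr(A, B, Q, R, Qf, T):
    """Return gains K_0..K_{T-1} for x_{t+1} = A x_t + B u_t, u_t = -K_t x_t."""
    P = Qf
    gains = []
    for _ in range(T):                          # backward Riccati recursion
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]                          # reorder from t = 0 to T-1

# Hypothetical double integrator (position, velocity; force input), dt = 0.1 s.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q, R, Qf = np.eye(2), np.array([[0.1]]), 100.0 * np.eye(2)
K = finite_horizon_lqr(A, B, Q, R, Qf, T=50)
print(K[0], K[-1])  # gains vary over the horizon, stiffening toward the end
```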