Humans have a remarkable ability to learn, adapt to, and master new manipulation tasks. With the current advances in Machine Learning (ML), the promise of robots with such capabilities seems to be on the cusp of reality. Transferring human-level skills to robots, however, is complicated, as it involves a level of complexity that classical ML methods cannot tackle in an unsupervised way. Such complexities include: (i) automatically decomposing tasks into control-oriented encodings, (ii) extracting invariances and handling idiosyncrasies of data acquired from human demonstrations, and (iii) learning models that guarantee stability and convergence. In this thesis, we push the boundaries of the learning from demonstration (LfD) domain by addressing these challenges with minimal human intervention and parameter tuning. We introduce novel learning approaches based on Bayesian non-parametrics and kernel methods that encode novel metrics and priors, and we combine them with dynamical systems (DS) theory.
In the first part of this thesis, we focus on learning complex sequential manipulation tasks, i.e., tasks composed of a sequence of discrete actions, from heterogeneous and unstructured demonstrations. The particular challenge is learning such tasks without any prior knowledge of the number of actions (unstructured) or restrictions on how the human demonstrates the task (heterogeneous), e.g., changes in reference frames. We propose a Bayesian non-parametric learning framework that can jointly segment continuous demonstrations of a task and discover the unique discrete actions (and their sequence) they contain. Hence, we learn an entire task from a continuous, unrestricted, natural demonstration. The learned structure of the complex tasks and the segmented data were then used to parametrize a hybrid controller that executes two cooking tasks of increasing complexity: (i) single-arm pizza-dough rolling and (ii) dual-arm vegetable peeling.
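To make the joint segmentation-and-discovery idea concrete, a representative Bayesian non-parametric formulation is sketched below, using an HDP-HMM as a stand-in rather than the exact model and priors developed in the thesis; the symbols (concentration parameters $\gamma, \alpha$ and emission parameters $\mu_k, \Sigma_k$) are generic assumptions for illustration. The number of discrete actions $K$ is inferred from the data rather than fixed a priori:

\begin{align*}
\beta &\sim \mathrm{GEM}(\gamma) && \text{(global weights over an unbounded set of actions)}\\
\pi_k \mid \beta &\sim \mathrm{DP}(\alpha, \beta) && \text{(transition distribution out of action } k\text{)}\\
z_t \mid z_{t-1} &\sim \pi_{z_{t-1}} && \text{(latent action label at time } t\text{)}\\
\xi_t \mid z_t &\sim \mathcal{N}(\mu_{z_t}, \Sigma_{z_t}) && \text{(observed demonstration sample)}
\end{align*}

Posterior inference over the labels $z_{1:T}$ then simultaneously segments the continuous demonstration and identifies the set of unique actions and their ordering.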
Throughout this thesis, we assume that both human and robot motions are driven by autonomous, state-dependent DS. In the second part of this thesis, we therefore offer two novel DS formulations to represent and execute a complex task. We begin by proposing a DS-based motion-generator formulation and learning scheme capable of automatically encoding continuous and complex motions while ensuring global asymptotic stability. The types of tasks that can be learned with this approach go beyond those handled by previous work in DS-based LfD, and the approach is validated on production-line and household activities. Further, we propose a novel DS formulation and learning scheme that encodes both complex motions and varying impedance requirements along the task; i.e., the robot must be compliant in some regions of the task and stiff in others. This approach is validated on trajectory-tracking tasks in which a robot arm must precisely draw letters on a surface. Owing to the generalization power and straightforward learning schemes of the proposed DS-based motion generators, we also apply them to more complex applications, including adaptive (i) navigation strategies for mobile agents and (ii) locomotion and co-manipulation behaviors for biped robots. These applications are particularly novel in the LfD domain, where most work focuses solely on robotic-arm control.
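The global-stability guarantee can be illustrated with a minimal Python sketch, an illustrative stand-in rather than the learning scheme developed in the thesis: a mixture of linear systems sharing a single attractor, where each component matrix is constrained so that a common quadratic Lyapunov function decreases everywhere. The helper names (`make_stable`, `lpv_ds`) and the fixed mixing weights are hypothetical; in practice the weights and system matrices would be learned from demonstrations.

```python
import numpy as np

def make_stable(A, margin=1.0):
    """Shift the symmetric part of A so that A + A^T is negative
    definite -- a sufficient condition for global asymptotic
    stability of dx/dt = A (x - x*) under V(x) = ||x - x*||^2."""
    S = 0.5 * (A + A.T)                       # symmetric part of A
    eigvals, eigvecs = np.linalg.eigh(S)
    eigvals = np.minimum(eigvals, -margin)    # clamp eigenvalues below zero
    return (A - S) + eigvecs @ np.diag(eigvals) @ eigvecs.T

def lpv_ds(x, x_star, As, weights):
    """Mixture-of-linear-systems motion generator:
    dx/dt = sum_k w_k(x) A_k (x - x*), with convex weights w_k(x)
    and every A_k + A_k^T negative definite, so V(x) = ||x - x*||^2
    decreases along all trajectories."""
    w = weights(x)
    return sum(wk * Ak @ (x - x_star) for wk, Ak in zip(w, As))

# Toy rollout: two random stabilized components, placeholder weights.
rng = np.random.default_rng(0)
As = [make_stable(rng.standard_normal((2, 2))) for _ in range(2)]
x_star = np.zeros(2)                          # single attractor (the target)
weights = lambda x: np.array([0.5, 0.5])      # stand-in for learned weights
x = np.array([1.0, -1.0])
for _ in range(400):                          # forward-Euler integration
    x = x + 0.02 * lpv_ds(x, x_star, As, weights)
print(np.linalg.norm(x - x_star))             # residual distance near zero
```

Because every component shares the attractor and satisfies the same Lyapunov constraint, any convex combination of the components inherits the stability guarantee, which is what lets such motion generators remain stable regardless of where in the state space a perturbation sends the robot.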
In the last part of the thesis, we explore learning complex behaviors in joint space for single- and multi-arm systems.