Exploration-based model learning with self-attention for risk-sensitive robot control

Model-based reinforcement learning for robot control addresses two concerns with model-free methods: the burden of data collection and the iterative processes required for policy improvement. However, both approaches use exploration strategies that rely on heuristics involving inherent randomness, which may cause instability or malfunction of the target system and leave it susceptible to external perturbations. In this paper, we propose an online model update algorithm that can be operated directly on real-world robot systems. The algorithm leverages a self-attention mechanism embedded in the neural networks that model the kinematics and dynamics of the target system. The approximated model adds redundant self-attention paths to the time-independent kinematics and dynamics models, allowing us to detect abnormalities by calculating the trace values of the self-attention matrices. This approach reduces randomness during exploration and enables perturbations to be detected and rejected while the model is being updated. We validate the proposed method in simulation and on real-world robot systems in three application scenarios: path tracking of a soft robotic manipulator, kinesthetic teaching and behavior cloning of an industrial robotic arm, and gait generation of a legged robot. All of these demonstrations are achieved without the aid of simulation or prior knowledge of the models, which supports the proposed method’s universality across robotics applications.
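
The abstract does not specify how the trace values are computed or compared. As a rough illustration only, the sketch below forms a standard scaled dot-product self-attention matrix and flags a sample as abnormal when the trace of that matrix drifts from a reference value. The function names, `baseline_trace`, and `tolerance` are hypothetical and are not taken from the paper; the thresholding rule is an assumption, not the authors' detection criterion.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_matrix(queries, keys):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)), row by row."""
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    return [
        softmax([scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys])
        for q in queries
    ]


def trace(matrix):
    """Sum of the diagonal entries of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def is_abnormal(attention, baseline_trace, tolerance):
    """Assumed rule: flag the sample if the trace deviates from a reference value."""
    return abs(trace(attention) - baseline_trace) > tolerance


# Hypothetical usage: take the baseline from a nominal sample, then check a new one.
nominal = self_attention_matrix([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
baseline = trace(nominal)
perturbed = self_attention_matrix([[1.0, 0.0], [0.0, 1.0]], [[0.0, 5.0], [5.0, 0.0]])
flagged = is_abnormal(perturbed, baseline, tolerance=0.5)
```

Because each row of the attention matrix sums to one, the trace of an n-by-n attention matrix lies between 0 and n, which gives a bounded scalar that is cheap to monitor during online updates.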

Robot Learning using Tensor Networks

Hitting with Different Joints of a Robotic Manipulator

Online Multicontact Receding Horizon Planning via Value Function Approximation
