Publication

Robust Reinforcement Learning via Adversarial training with Langevin Dynamics

Volkan Cevher, Paul Thierry Yves Rolland, Ya-Ping Hsieh, Yu-Ting Huang, Kamalaruban Parameswaran, Cheng Shi
2020
Report or working paper

Abstract

We introduce a sampling perspective to tackle the challenging task of training robust Reinforcement Learning (RL) agents. Leveraging the powerful Stochastic Gradient Langevin Dynamics, we present a novel, scalable two-player RL algorithm, which is a sampling variant of the two-player policy gradient method. Our algorithm consistently outperforms existing baselines, in terms of generalization across different training and testing conditions, on several MuJoCo environments. Our experiments also show that, even for objective functions that entirely ignore potential environmental shifts, our sampling approach remains highly robust in comparison to standard RL algorithms.
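The abstract names Stochastic Gradient Langevin Dynamics and a two-player policy gradient method but does not give the update rule. As a rough illustration only, the sketch below applies a Langevin-style update (a gradient step plus Gaussian noise scaled by sqrt(2 * step_size / inv_temp)) to both players of a toy saddle-point problem. The quadratic objective, the function name `sgld_two_player`, and all parameter values are assumptions chosen for this example; they are not taken from the paper. Exact gradients stand in for the stochastic policy-gradient estimates an RL agent would use.

```python
import numpy as np


def sgld_two_player(steps=2000, step_size=0.01, inv_temp=1e4, seed=0):
    """Langevin-noised gradient descent-ascent on a toy saddle problem.

    Toy objective (an assumption for illustration, not from the paper):
        f(x, y) = 0.5 * x**2 + x * y - 0.5 * y**2
    The x-player minimizes f, the y-player maximizes f.
    The unique saddle point is (0, 0).
    """
    rng = np.random.default_rng(seed)
    x, y = 1.0, 1.0
    # Standard Langevin noise scale for step size eta and inverse temperature beta.
    noise_scale = np.sqrt(2.0 * step_size / inv_temp)

    for _ in range(steps):
        grad_x = x + y  # df/dx
        grad_y = x - y  # df/dy
        # Simultaneous update: descent for x, ascent for y, each with injected noise.
        new_x = x - step_size * grad_x + noise_scale * rng.standard_normal()
        new_y = y + step_size * grad_y + noise_scale * rng.standard_normal()
        x, y = new_x, new_y

    return x, y


if __name__ == "__main__":
    print(sgld_two_player())
```

With a large `inv_temp` the injected noise is small and the iterates settle close to the saddle point; lowering `inv_temp` makes the pair of players sample more widely around it, which is the behaviour a sampling-based method exploits.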

Official source

https://infoscience.epfl.ch/record/274660?ln=fr

Residual-based attention in physics-informed neural networks

Fusing Pre-existing Knowledge and Machine Learning for Enhanced Building Thermal Modeling and Control

Reinforcement learning approach to control an inverted pendulum: A general framework for educational purposes
