This lecture covers Proximal Policy Optimization (PPO) for continuous control in deep reinforcement learning. It explains the challenges of applying standard policy-gradient methods and introduces PPO as a way to address their stability and sample-efficiency issues. The lecture develops the idea of maximizing a surrogate objective function, comparing the TRPO and PPO-CLIP approaches, and discusses Advantage Actor-Critic (A2C) algorithms for improving training stability and efficiency. The instructor emphasizes that with a fixed learning rate, a single policy-gradient step can be too large and degrade the policy, which motivates surrogate objectives that constrain each update so it makes positive progress. The lecture concludes with a summary of the benefits of using surrogate objectives in policy-gradient methods.
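To make the surrogate objective concrete, below is a minimal sketch of the PPO-CLIP loss in PyTorch. The code is not from the lecture: the tensor names (`log_probs`, `old_log_probs`, `advantages`) and the helper `ppo_clip_loss` are illustrative assumptions, following the clipped objective from the PPO paper (Schulman et al., 2017).

```python
import torch

def ppo_clip_loss(log_probs, old_log_probs, advantages, eps=0.2):
    """Clipped surrogate objective from PPO (a sketch).

    log_probs:     log pi_theta(a|s) under the current policy
    old_log_probs: log pi_theta_old(a|s) under the policy that collected the data
    advantages:    advantage estimates A(s, a), e.g. from an A2C-style critic
    eps:           clipping range (0.2 is the value suggested in the PPO paper)
    """
    # Probability ratio r(theta) = pi_theta(a|s) / pi_theta_old(a|s)
    ratio = torch.exp(log_probs - old_log_probs)

    # Unclipped and clipped surrogate terms; clipping removes the incentive
    # to move the ratio far from 1, keeping the update close to the old policy.
    surr1 = ratio * advantages
    surr2 = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages

    # PPO maximizes the elementwise minimum of the two terms; return its
    # negation so a standard optimizer can minimize it.
    return -torch.min(surr1, surr2).mean()
```

In a full actor-critic setup, this policy loss is typically combined with a value-function loss and an entropy bonus into one objective, which is how the PPO paper ties the clipped surrogate to the A2C training scheme discussed in the lecture.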