Policy Gradient Methods in Reinforcement Learning

In course

This course describes theory and methods for Reinforcement Learning (RL), which revolves around decision making under uncertainty. The course covers classic algorithms in RL as well as recent algorith

Description

This lecture covers policy gradient methods in reinforcement learning. It begins with an overview of reinforcement learning approaches, contrasting value-based and policy-based methods. The instructor then presents the optimization formulation for policy-based methods and emphasizes the importance of parameterizing policies for both discrete and continuous actions. Parameterizations such as softmax and neural networks are introduced. The lecture then turns to the policy gradient method itself: how to compute gradients from stochastic estimates, and why unbiased gradient estimators matter. The instructor highlights the high variance of policy gradient estimates and introduces variance reduction techniques such as baseline functions. The lecture concludes with practical examples, including the application of policy gradient methods to the cartpole problem, where the learned policy balances the pole.
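
As an illustration of the unbiasedness and baseline ideas mentioned in the description, here is a minimal sketch that is not taken from the lecture. It assumes a single-state problem (a bandit) with a softmax policy and made-up reward values. All function names are hypothetical. Because there are only three actions, the mean and variance of the score-function estimator (r(a) - b) * grad log pi(a) can be computed exactly by enumeration, without sampling.

```python
import math

def softmax(theta):
    # Numerically stable softmax over action preferences theta.
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]

def score(pi, a):
    # For a softmax policy: d/d theta_j log pi(a) = 1[j == a] - pi_j.
    return [(1.0 if j == a else 0.0) - pi[j] for j in range(len(pi))]

def exact_gradient(pi, rewards):
    # J(theta) = sum_a pi(a) r(a), so dJ/d theta_j = pi_j * (r_j - J).
    J = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - J) for p, r in zip(pi, rewards)]

def estimator_moments(pi, rewards, baseline):
    # Exact mean and total variance of the single-sample estimator
    # (r(a) - baseline) * grad log pi(a), with a drawn from pi.
    n = len(pi)
    samples = [[(rewards[a] - baseline) * s for s in score(pi, a)]
               for a in range(n)]
    mean = [sum(pi[a] * samples[a][j] for a in range(n)) for j in range(n)]
    var = sum(pi[a] * sum((samples[a][j] - mean[j]) ** 2 for j in range(n))
              for a in range(n))
    return mean, var

# Assumed toy values, chosen only for illustration.
theta = [0.2, -0.1, 0.5]
rewards = [1.0, 2.0, 4.0]

pi = softmax(theta)
expected_reward = sum(p * r for p, r in zip(pi, rewards))

grad = exact_gradient(pi, rewards)
mean_no_b, var_no_b = estimator_moments(pi, rewards, baseline=0.0)
mean_b, var_b = estimator_moments(pi, rewards, baseline=expected_reward)

print("exact gradient:        ", grad)
print("mean without baseline: ", mean_no_b)
print("mean with baseline:    ", mean_b)
print("variance without / with baseline:", var_no_b, var_b)
```

In this toy setting both estimator means coincide with the exact gradient, since subtracting a constant baseline does not bias the estimate, while the variance is lower when the expected reward is used as the baseline. The expected reward is only one convenient choice of baseline, and the variance reduction shown here is specific to these assumed values.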

Instructor

Volkan Cevher

Volkan Cevher received the B.Sc. (valedictorian) in electrical engineering from Bilkent University in Ankara, Turkey, in 1999 and the Ph.D. in electrical and computer engineering from the Georgia Institute of Technology in Atlanta, GA, in 2005. He was a Research Scientist with the University of Maryland, College Park from 2006 to 2007 and with Rice University in Houston, TX, from 2008 to 2009. Currently, he is an Associate Professor at the Swiss Federal Institute of Technology Lausanne and a Faculty Fellow in the Electrical and Computer Engineering Department at Rice University. His research interests include machine learning, signal processing theory, optimization theory and methods, and information theory. Dr. Cevher is an ELLIS fellow and was the recipient of the Google Faculty Research award in 2018, the IEEE Signal Processing Society Best Paper Award in 2016, a Best Paper Award at CAMSAP in 2015, a Best Paper Award at SPARS in 2009, and an ERC CG in 2016 as well as an ERC StG in 2011.
