Lecture

Policy Gradient Algorithms and V Values

In course

This course provides an overview of reinforcement learning (RL) and introduces its modern methods. The course starts with the fundamentals of RL, such as Q-learning, and delves into commonly used approac

Description

This lecture covers the relationship between policy gradient algorithms and V values, explaining how V values can be used to speed up the convergence of these algorithms through actor-critic networks. It also discusses computing the V values in a separate network and the option of sharing neurons between that network and the actor network.
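As a rough illustration of the idea described above, the following sketch trains a policy gradient learner whose update is scaled by a TD error computed from separately learned V values (an actor-critic arrangement). It is not taken from the lecture: the corridor environment, the function names softmax and train_actor_critic, and all hyperparameters are assumptions chosen for brevity, and the V values are kept in a separate table rather than sharing parameters with the policy.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def train_actor_critic(n_states=5, episodes=500, gamma=0.9,
                       lr_actor=0.1, lr_critic=0.2, seed=0):
    """Tabular one-step actor-critic on a hypothetical corridor task.

    States 0..n_states-1 lie on a line; the agent starts in state 0 and
    receives reward 1 on reaching the last state, which ends the episode.
    Actions: 0 = step left (blocked at state 0), 1 = step right.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros((n_states, 2))  # actor: one pair of logits per state
    v = np.zeros(n_states)           # critic: one V value per state
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap so early random episodes terminate
            probs = softmax(theta[s])
            a = rng.choice(2, p=probs)
            s_next = max(s - 1, 0) if a == 0 else s + 1
            done = s_next == goal
            r = 1.0 if done else 0.0

            # TD error: the critic's V values act as a baseline for the
            # policy gradient update.
            target = r if done else r + gamma * v[s_next]
            delta = target - v[s]

            # Critic update: move V(s) toward the one-step target.
            v[s] += lr_critic * delta

            # Actor update: gradient of log pi(a|s) w.r.t. the logits of a
            # softmax policy is one_hot(a) - probs.
            grad_log = -probs
            grad_log[a] += 1.0
            theta[s] += lr_actor * delta * grad_log

            if done:
                break
            s = s_next
    return theta, v

if __name__ == "__main__":
    theta, v = train_actor_critic()
    print("P(right | start state):", softmax(theta[0])[1])
    print("V values:", v)
```

Scaling the update by the TD error rather than by the raw return is one common way to reduce the variance of the gradient estimate, which is the sense in which V values can speed up convergence. A neural-network version could let the policy and the V-value estimate share hidden layers, which this tabular sketch does not attempt.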

Instructors (2)

Caglar Gulcehre

Wulfram Gerstner

Wulfram Gerstner is Director of the Laboratory of Computational Neuroscience (LCN) at the EPFL. His research in computational neuroscience concentrates on models of spiking neurons and spike-timing dependent plasticity, on the problem of neuronal coding in single neurons and populations, and on the link between biologically plausible learning rules and behavioral manifestations of learning. He teaches courses for physicists, computer scientists, mathematicians, and life scientists at the EPFL. After studying physics in Tübingen and at the Ludwig-Maximilians-University Munich (Master 1989), Wulfram Gerstner spent a year as a visiting researcher in Berkeley. He received his PhD in theoretical physics from the Technical University Munich in 1993 with a thesis on associative memory and dynamics in networks of spiking neurons. After short postdoctoral stays at Brandeis University and the Technical University of Munich, he joined the EPFL in 1996 as an assistant professor. Promoted to Associate Professor with tenure in February 2001, he has been a full professor since August 2006, with a double appointment in the School of Computer and Communication Sciences and the School of Life Sciences. Wulfram Gerstner has been an invited speaker at numerous international conferences and workshops. He has served on the editorial boards of the Journal of Neuroscience, Network: Computation in Neural Systems, the Journal of Computational Neuroscience, and Science.

Related lectures (31)

Conclusions on Statistical Learning Theory

Explores conclusions from statistical learning theory, emphasizing function complexity, generalization, and the bias-variance trade-off.

The Hidden Convex Optimization Landscape of Deep Neural Networks

Explores the hidden convex optimization landscape of deep neural networks, showcasing the transition from non-convex to convex models.

Nonlinear Supervised Learning

Explores the inductive bias of different nonlinear supervised learning methods and the challenges of hyper-parameter tuning.

Neural Networks: Two Layers Neural Network

Covers the basics of neural networks, focusing on the development from two-layer neural networks to deep neural networks.

Statistical Physics in Machine Learning: Understanding Deep Learning

Explores the application of statistical physics in understanding deep learning with a focus on neural networks and machine learning challenges.