Temporal difference learning

Related lectures (30)

Covers Q-learning, a model-free reinforcement learning algorithm, and its application to Tic-Tac-Toe, with examples and quizzes.

Comparison n-step SARSA and eligibility traces

Presents a quiz comparing the n-step SARSA algorithm with SARSA using eligibility traces.

Optimal Control Theory: Basics

Covers the fundamentals of optimal control theory, focusing on defining OCPs, existence of solutions, performance criteria, physical constraints, and the principle of optimality.

Learning Agents: Exploration-Exploitation Tradeoff

Explores the exploration-exploitation tradeoff in learning unknown effects of actions using multi-armed bandits and Q-learning.

Collective Learning Dynamics: Similarity Exploitation

Delves into collective learning dynamics with similarity exploitation, covering structured learning, adaptive frameworks, modeling, simulation, and experimental results.

First steps toward deep reinforcement learning

Explores the shift to deep reinforcement learning through neural networks for direct policy learning, bypassing Q-values and V-values.

Acquiring Data for Learning: Modern Approaches and Challenges

Explores modern approaches and challenges in acquiring data for learning optimal controllers through demonstrations and data-driven methods.

Elements of Reinforcement Learning

Introduces the fundamental elements of reinforcement learning and demonstrates their application with the Acrobot system.

Safe Learning and Control

Explores safe learning, control, multi-agent coordination, and Nash equilibrium convergence in intelligent systems.

Reinforcement Learning: Basics and Applications

Covers the basics of reinforcement learning, including trial-and-error learning, Q-learning, deep RL, and applications in gaming and planning.