Covers model-free prediction methods in reinforcement learning, focusing on Monte Carlo and Temporal Differences for estimating value functions without transition dynamics knowledge.
Covers the basics of reinforcement learning, including Markov Decision Processes and policy gradient methods, and explores real-world applications and recent advances.