Covers model-free prediction methods in reinforcement learning, focusing on Monte Carlo and temporal-difference (TD) learning for estimating value functions without knowledge of the transition dynamics.
Explores Monte Carlo methods for reinforcement learning, comparing them with TD methods and emphasizing how efficiently TD methods propagate information.
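
The contrast between the two estimators can be illustrated with a minimal sketch. The environment here is an assumption chosen for illustration, not taken from the material: a five-state random walk with terminal states at both ends and a reward of 1 for exiting on the right, so the true values are 1/6 through 5/6. The names `run_episode`, `mc_prediction`, and `td0_prediction` are hypothetical helpers, and the step size `alpha=0.01` is an arbitrary choice.

```python
import random

N_STATES = 5  # assumed toy setup: non-terminal states 1..5, terminals 0 and 6
TRUE_V = [s / 6 for s in range(1, N_STATES + 1)]  # known values for this walk


def run_episode(rng):
    """Sample one episode as (state, reward, next_state) transitions.

    The agent starts in the centre state and moves left or right with
    equal probability; only the sampled transitions are observed.
    """
    state, episode = 3, []
    while 0 < state < N_STATES + 1:
        next_state = state + rng.choice((-1, 1))
        reward = 1.0 if next_state == N_STATES + 1 else 0.0
        episode.append((state, reward, next_state))
        state = next_state
    return episode


def mc_prediction(episodes, gamma=1.0):
    """First-visit Monte Carlo: average the full return observed after each state."""
    V = [0.0] * (N_STATES + 2)
    counts = [0] * (N_STATES + 2)
    for episode in episodes:
        # Returns can only be computed once the episode has finished.
        G, returns = 0.0, []
        for _, reward, _ in reversed(episode):
            G = reward + gamma * G
            returns.append(G)
        returns.reverse()
        seen = set()
        for (state, _, _), G_t in zip(episode, returns):
            if state in seen:
                continue
            seen.add(state)
            counts[state] += 1
            V[state] += (G_t - V[state]) / counts[state]  # incremental mean
    return V[1:-1]


def td0_prediction(episodes, alpha=0.01, gamma=1.0):
    """TD(0): update after every transition, bootstrapping from V[next_state]."""
    V = [0.5] * (N_STATES + 2)
    V[0] = V[-1] = 0.0  # terminal states have value zero
    for episode in episodes:
        for state, reward, next_state in episode:
            td_target = reward + gamma * V[next_state]
            V[state] += alpha * (td_target - V[state])
    return V[1:-1]


if __name__ == "__main__":
    rng = random.Random(0)
    episodes = [run_episode(rng) for _ in range(5000)]
    print("true:", [round(v, 3) for v in TRUE_V])
    print("MC  :", [round(v, 3) for v in mc_prediction(episodes)])
    print("TD  :", [round(v, 3) for v in td0_prediction(episodes)])
```

Neither function uses the transition probabilities, only sampled transitions. The Monte Carlo update waits for the end of an episode to obtain the return, whereas the TD(0) update moves each estimate toward the current estimate of the successor state, which is how information from later states reaches earlier ones after a single step.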