This lecture covers the theory of Reinforcement Learning, focusing on the Exploration/Exploitation dilemma, Temporal Difference Learning, and Eligibility Traces in continuous state/action spaces. It discusses the challenge of estimating reward probabilities from experience, and strategies for balancing exploration (trying actions in order to estimate their rewards) against exploitation (choosing the actions currently estimated to yield the most reward).
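
As a rough illustration of the exploration/exploitation trade-off, the sketch below runs an epsilon-greedy strategy on a Bernoulli multi-armed bandit. Epsilon-greedy is only one common strategy and is not necessarily the one used in the lecture; the function name `epsilon_greedy_bandit`, the arm probabilities, and the default parameters are all hypothetical choices made for this example.

```python
import random


def epsilon_greedy_bandit(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy agent on a Bernoulli bandit.

    true_probs: hidden reward probability of each arm (assumed values,
                unknown to the agent, which only sees sampled rewards).
    Returns the estimated reward probabilities, pull counts, and total reward.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms          # number of times each arm was pulled
    estimates = [0.0] * n_arms     # running mean reward of each arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a random arm to improve its reward estimate.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Sample a 0/1 reward from the chosen arm's hidden probability.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Incremental update of the sample mean for the chosen arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


estimates, counts, total_reward = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print("estimated reward probabilities:", [round(e, 3) for e in estimates])
print("pull counts:", counts)
print("total reward:", total_reward)
```

With `epsilon=0` the agent never explores and can lock onto whichever arm it happened to try first, while larger values of `epsilon` yield better estimates for every arm at the cost of more pulls spent on arms already believed to be worse.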