This lecture covers Q-Learning, a model-free reinforcement learning algorithm. It explains how Q-Learning estimates action values, how it stops once those estimates converge, and how it compares with Monte Carlo estimation. The application to Tic-Tac-Toe is discussed with examples and quizzes.
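For orientation, the action-value estimate mentioned above can be sketched as the standard tabular Q-Learning update. This is a minimal illustration, not the code used in the lecture: the function name `q_update`, the state labels, the reward of 1.0 for a win, and the values of `alpha` (learning rate) and `gamma` (discount factor) are all assumptions chosen for the example.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-Learning update and return the new Q(state, action).

    alpha and gamma are illustrative defaults, not values from the lecture.
    """
    # Value of the best action available in the next state
    # (0.0 when the next state is terminal and has no actions).
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    # Temporal-difference target: immediate reward plus discounted best next value.
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    return Q[(state, action)]


# Q-table mapping (state, action) pairs to estimated values, initialised to 0.
Q = defaultdict(float)

# Hypothetical Tic-Tac-Toe transition: playing square 4 in state "s0" wins the game.
win_value = q_update(Q, "s0", 4, reward=1.0,
                     next_state="terminal", next_actions=[])

# Hypothetical earlier move: playing square 0 in "s_prev" leads to "s0" with no reward.
prev_value = q_update(Q, "s_prev", 0, reward=0.0,
                      next_state="s0", next_actions=[4])
```

In this sketch the value of the winning move propagates backward to the earlier move through the `max` over next-state actions, which is what lets Q-Learning update after every step rather than waiting for the end of an episode as Monte Carlo estimation does.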
This video is available exclusively on MediaSpace for a restricted audience. If you have the necessary permissions, log in to MediaSpace to access it.