Value Iteration Acceleration: PID and Operator Splitting

This lecture covers how to accelerate the Value Iteration (VI) algorithm for sequential decision-making problems with long planning horizons, where the discount factor is close to 1 and VI converges slowly. The instructor presents two ideas: PID VI, which modifies the VI update using tools from control theory, and Operator Splitting Value Iteration, which uses an inaccurate but cheap model to converge faster. The lecture examines the dynamics of VI, explains why its convergence is slow, and analyzes the convergence behavior of PID VI and the benefits of matrix splitting techniques. It concludes with empirical results showing the effectiveness of both acceleration methods, and with directions for future research on combining accurate and inaccurate models.
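
To make the matrix splitting idea concrete, here is a minimal sketch for policy evaluation on a small random Markov chain. It is an illustration under stated assumptions, not necessarily the exact algorithm from the lecture: the chain, the rewards, the helper names (evaluate_vi, evaluate_split, random_stochastic), and the perturbed model P_hat are all hypothetical. Plain VI iterates V <- r + gamma P V, while the splitting variant rewrites the fixed-point equation as (I - gamma P_hat) V = r + gamma (P - P_hat) V and solves the cheap left-hand system exactly at every step.

```python
import numpy as np

def evaluate_vi(P, r, gamma, num_iters):
    """Plain value iteration for policy evaluation: V <- r + gamma * P V."""
    V = np.zeros(len(r))
    for _ in range(num_iters):
        V = r + gamma * P @ V
    return V

def evaluate_split(P, P_hat, r, gamma, num_iters):
    """Splitting iteration: solve (I - gamma P_hat) V = r + gamma (P - P_hat) V_k."""
    n = len(r)
    A_hat = np.eye(n) - gamma * P_hat  # system built from the approximate model
    V = np.zeros(n)
    for _ in range(num_iters):
        V = np.linalg.solve(A_hat, r + gamma * (P - P_hat) @ V)
    return V

def random_stochastic(rng, n):
    """Hypothetical helper: random row-stochastic transition matrix."""
    M = rng.random((n, n))
    return M / M.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n, gamma = 20, 0.95
P = random_stochastic(rng, n)                        # "true" dynamics (assumed)
P_hat = 0.99 * P + 0.01 * random_stochastic(rng, n)  # slightly wrong model (assumed)
r = rng.random(n)

# Exact solution of V = r + gamma P V, used only to measure the error.
V_true = np.linalg.solve(np.eye(n) - gamma * P, r)

err_vi = np.max(np.abs(evaluate_vi(P, r, gamma, 20) - V_true))
err_split = np.max(np.abs(evaluate_split(P, P_hat, r, gamma, 20) - V_true))
print(f"VI error after 20 iterations:        {err_vi:.3e}")
print(f"Splitting error after 20 iterations: {err_split:.3e}")
```

In this sketch both iterations share the same fixed point, since the P_hat terms cancel when the splitting equation is rearranged. The error of plain VI shrinks by a factor of about gamma per iteration, whereas the splitting iteration contracts at a rate governed by gamma times the model error P - P_hat, so a model that is only slightly wrong can still give a much faster rate. The sketch assumes that solving with P_hat is cheap, which is the premise of using an approximate model in the first place.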