Lecture

Value Iteration Acceleration: PID and Operator Splitting

Description

This lecture covers ways to accelerate the Value Iteration (VI) algorithm, which converges slowly on sequential decision-making problems with long planning horizons. The instructor presents two ideas: PID VI, which modifies VI using control-theoretic tools, and Operator Splitting Value Iteration, which uses an inaccurate but cheap model to converge faster. The lecture examines the dynamics of VI and the challenges posed by its slow convergence, then discusses the convergence behavior of PID VI and the benefits of matrix splitting techniques. It concludes with empirical results that demonstrate the effectiveness of both acceleration methods, and with directions for future research on combining accurate and inaccurate models.
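To make the matrix splitting idea concrete, here is a minimal sketch and not the lecture's own code. It assumes the simplest setting, policy evaluation for a fixed policy, where the Bellman operator is linear: V = r + gamma * P V. The names `P`, `P_hat`, `Q`, `eps`, and both function names are hypothetical, and the splitting update V <- (I - gamma * P_hat)^(-1) (r + gamma * (P - P_hat) V) is an assumed standard form of the technique. The model error `eps` is deliberately kept small so that a crude sup-norm bound guarantees convergence; larger errors are not checked here. PID VI is not covered by this sketch.

```python
import numpy as np


def random_stochastic_matrix(n, rng):
    """Return an n x n row-stochastic matrix (each row sums to 1)."""
    m = rng.random((n, n))
    return m / m.sum(axis=1, keepdims=True)


def value_iteration(P, r, gamma, tol=1e-8, max_iter=100_000):
    """Plain VI for policy evaluation: V <- r + gamma * P V."""
    V = np.zeros_like(r)
    for k in range(1, max_iter + 1):
        V_new = r + gamma * (P @ V)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, k
        V = V_new
    return V, max_iter


def operator_splitting_vi(P, P_hat, r, gamma, tol=1e-8, max_iter=100_000):
    """Splitting update (assumed form):
    V <- (I - gamma * P_hat)^{-1} (r + gamma * (P - P_hat) V).
    Its fixed point satisfies (I - gamma * P) V = r, the same as plain VI.
    """
    A = np.eye(len(r)) - gamma * P_hat
    V = np.zeros_like(r)
    for k in range(1, max_iter + 1):
        V_new = np.linalg.solve(A, r + gamma * ((P - P_hat) @ V))
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, k
        V = V_new
    return V, max_iter


rng = np.random.default_rng(0)
n, gamma, eps = 20, 0.99, 0.002       # long horizon: gamma close to 1
P = random_stochastic_matrix(n, rng)  # "true" dynamics (hypothetical)
Q = random_stochastic_matrix(n, rng)  # unrelated dynamics used as model error
P_hat = (1 - eps) * P + eps * Q       # approximate model, still row-stochastic
r = rng.uniform(0.5, 1.0, size=n)     # rewards (hypothetical)

# Crude bound on the contraction factor of the splitting update:
# gamma * ||P - P_hat||_inf / (1 - gamma) <= gamma * 2 * eps / (1 - gamma) < 1.
V_exact = np.linalg.solve(np.eye(n) - gamma * P, r)
V_vi, vi_iters = value_iteration(P, r, gamma)
V_os, os_iters = operator_splitting_vi(P, P_hat, r, gamma)
print("VI iterations:", vi_iters, "| splitting iterations:", os_iters)
```

In this setup plain VI contracts at rate gamma per step, while the splitting update contracts at a rate governed by the model error, at the price of one linear solve with the approximate model per iteration.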
