This lecture presents a two-line proof of convergence in expectation for the learning rule used in reinforcement learning with a 1-step horizon, showing that the empirical estimate of the Q-value converges in expectation to the true Q-value.
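
The summary does not state the learning rule itself, so the following is only a minimal sketch under assumptions: it takes the common 1-step-horizon update Q <- Q + eta * (r - Q) for a single state-action pair, with a hypothetical Bernoulli reward. Under that assumption the expected update is zero exactly when Q equals the expected reward, which is the fixed point the simulation should approach. The names `simulate_q`, `mean_q`, `eta` and `p_reward` are illustrative and not taken from the lecture.

```python
import random

def simulate_q(eta=0.05, steps=2000, p_reward=0.3, seed=0):
    """Run the assumed update Q <- Q + eta * (r - Q) for one state-action pair."""
    rng = random.Random(seed)  # seeded generator so each run is reproducible
    q = 0.0
    for _ in range(steps):
        # Hypothetical reward model: r = 1 with probability p_reward, else 0.
        r = 1.0 if rng.random() < p_reward else 0.0
        q += eta * (r - q)
    return q

def mean_q(runs=200, **kwargs):
    """Average the final estimate over independent runs (approximates E[Q])."""
    return sum(simulate_q(seed=s, **kwargs) for s in range(runs)) / runs

true_q = 0.3  # expected reward of the assumed Bernoulli model, i.e. the true Q-value
avg_q = mean_q(p_reward=true_q)
print(f"true Q = {true_q}, mean estimate over runs = {avg_q:.3f}")
```

A single run keeps fluctuating around the true value because eta is held constant; only the average over runs settles, which is what "convergence in expectation" refers to.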