This lecture introduces variations of the SARSA algorithm, focusing on Expected SARSA and Q-learning. Both variations change the target used to update the action-value estimate Q(s, a). Expected SARSA averages the Q-values of the possible next actions, weighting each by the probability the current policy assigns to it, while Q-learning uses the maximum Q-value over the possible next actions. The instructor explains the differences between these variations and how they affect the learning process.
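To make the difference concrete, here is a minimal sketch of the three update targets for a tabular action-value function. It is an illustration and not taken from the lecture itself: the function names, the array layout `q[state, action]`, the epsilon-greedy policy, and the parameters `gamma` (discount factor) and `alpha` (learning rate) are all assumptions chosen for the example.

```python
import numpy as np

def sarsa_target(q, reward, next_state, next_action, gamma):
    # SARSA: bootstrap from the Q-value of the next action actually taken.
    return reward + gamma * q[next_state, next_action]

def expected_sarsa_target(q, reward, next_state, policy_probs, gamma):
    # Expected SARSA: average the next Q-values, weighted by the
    # probability the policy assigns to each next action.
    return reward + gamma * np.dot(policy_probs, q[next_state])

def q_learning_target(q, reward, next_state, gamma):
    # Q-learning: bootstrap from the largest next Q-value.
    return reward + gamma * np.max(q[next_state])

def epsilon_greedy_probs(q_row, epsilon):
    # Hypothetical policy for the example: probability epsilon is spread
    # uniformly over all actions, the rest goes to the greedy action.
    probs = np.full(len(q_row), epsilon / len(q_row))
    probs[np.argmax(q_row)] += 1.0 - epsilon
    return probs

def td_update(q, state, action, target, alpha):
    # Shared update rule: move Q(s, a) a fraction alpha toward the target.
    q[state, action] += alpha * (target - q[state, action])

# Toy table with 2 states and 2 actions (values invented for illustration).
q = np.array([[0.0, 0.0],
              [1.0, 3.0]])
reward, next_state, gamma = 1.0, 1, 0.5

t_sarsa = sarsa_target(q, reward, next_state, next_action=0, gamma=gamma)
probs = epsilon_greedy_probs(q[next_state], epsilon=0.2)
t_expected = expected_sarsa_target(q, reward, next_state, probs, gamma)
t_qlearn = q_learning_target(q, reward, next_state, gamma)

td_update(q, state=0, action=0, target=t_qlearn, alpha=0.5)
```

In all three cases the update rule is the same; only the target changes. When the policy is fully greedy (epsilon equal to 0 in this sketch), the Expected SARSA target coincides with the Q-learning target.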