This lecture covers the implementation of reactive agents that learn from observations, focusing on the exploration-exploitation tradeoff that arises when the effects of actions are unknown. It discusses scenarios where an adversary can influence the world, and techniques for developing strategies that remain robust in those scenarios. Topics include multi-armed bandits, Q-learning, contextual bandits, and strategies such as epsilon-greedy, Thompson sampling, and regret matching. The lecture also explores the challenges of learning with state transitions, and the use of deep Q-learning and experience replay to address them.
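As a rough illustration of the exploration-exploitation tradeoff in the multi-armed bandit setting, the sketch below shows one common form of the epsilon-greedy strategy. It is not taken from the lecture: the function name `epsilon_greedy_bandit`, the Bernoulli reward model, and the values of `true_means`, `steps`, and `epsilon` are all assumptions chosen for the example.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical setup).

    true_means: assumed success probability of each arm, hidden from the agent.
    Returns the agent's estimated mean reward and pull count per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of times each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Sample a 0/1 reward from the chosen arm's (unknown) distribution.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running average for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print("estimated means:", [round(e, 2) for e in estimates])
print("pull counts:", counts)
```

With a small `epsilon` the agent spends most pulls on the arm it currently believes is best, while the occasional random pull keeps its estimates of the other arms from going stale. A larger `epsilon` learns all arms more accurately but collects less reward along the way.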