This lecture covers Q-Learning, in which the optimal policy is found by iteratively updating a Q-table based on observed rewards. It explains how to represent the Q-table, define the cost function, and learn the optimal Q-values using gradient descent. The lecture then turns to Deep Q-Learning, where a neural network approximates the Q-values, and examines the challenges posed by large state spaces in games such as Atari. It also discusses the REINFORCE algorithm for policy gradient methods and Monte Carlo Tree Search for decision-making, and concludes with a glimpse of AlphaGo Zero, a milestone in reinforcement learning. Along the way, concepts such as the Bellman equation, value networks, and policy networks are explained.
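To make the iterative Q-table update concrete, here is a minimal tabular Q-learning sketch (illustrative only, not taken from the lecture); the environment interface (`env.reset`, `env.step`) and all hyperparameter values are assumptions.

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500,
               alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning: repeatedly apply the Bellman-style update
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    Q = np.zeros((n_states, n_actions))  # Q-table: one value per (state, action) pair
    for _ in range(episodes):
        state = env.reset()              # assumed to return an integer state index
        done = False
        while not done:
            # epsilon-greedy action selection: mostly exploit, sometimes explore
            if np.random.rand() < epsilon:
                action = np.random.randint(n_actions)
            else:
                action = int(np.argmax(Q[state]))
            next_state, reward, done = env.step(action)  # assumed step signature
            # temporal-difference update toward the Bellman target
            target = reward + (0.0 if done else gamma * np.max(Q[next_state]))
            Q[state, action] += alpha * (target - Q[state, action])
            state = next_state
    return Q
```

Deep Q-Learning replaces the table `Q` with a neural network and fits the same Bellman target by gradient descent, which is what makes large state spaces such as Atari frames tractable.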