This lecture introduces Monte-Carlo methods for reinforcement learning, which estimate values directly by averaging empirically measured returns, and contrasts them with TD methods, which exploit the Bellman equation. The lecture covers Monte-Carlo estimation, first-visit MC prediction, Monte-Carlo estimation of Q-values, and batch-expected SARSA. It then compares SARSA, Monte-Carlo, and batch-expected-SARSA learning, emphasizing the role of the empirical Bellman equation. The lecture concludes by comparing Monte-Carlo with batch-TD methods, highlighting how the 'bootstrap' step lets TD methods propagate information back through the graph more efficiently.
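To make the contrast concrete, here is a minimal sketch, not taken from the lecture itself: it assumes a generic `sample_episode()` interface returning a list of `(state, reward)` pairs, and all function names and hyperparameters are illustrative. The first function implements first-visit MC prediction, averaging the return observed from the first visit to each state; the second shows a TD(0)-style update in which the current estimate of the next state's value stands in for the rest of the return (the 'bootstrap' step).

```python
from collections import defaultdict

def first_visit_mc_prediction(sample_episode, num_episodes, gamma=0.9):
    """First-visit Monte-Carlo prediction: estimate V(s) by averaging
    the returns observed at the first visit to s in each episode."""
    returns_sum = defaultdict(float)
    returns_count = defaultdict(int)
    V = defaultdict(float)

    for _ in range(num_episodes):
        # Assumed interface: [(s_0, r_1), (s_1, r_2), ...], where r_{t+1}
        # is the reward received after leaving state s_t.
        episode = sample_episode()

        # Record the time step of the first visit to each state.
        first_visit = {}
        for t, (s, _) in enumerate(episode):
            if s not in first_visit:
                first_visit[s] = t

        # Compute returns backwards: G_t = r_{t+1} + gamma * G_{t+1}.
        returns = [0.0] * len(episode)
        G = 0.0
        for t in reversed(range(len(episode))):
            _, r = episode[t]
            G = r + gamma * G
            returns[t] = G

        # Average only the return from the first visit to each state.
        for s, t in first_visit.items():
            returns_sum[s] += returns[t]
            returns_count[s] += 1
            V[s] = returns_sum[s] / returns_count[s]
    return V

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) update. Bootstrap: V[s_next] replaces the rest of the
    return, so information propagates one step back after every single
    transition, instead of only once an episode has terminated."""
    V[s] += alpha * (r + gamma * V[s_next] - V[s])
```

The batch variants discussed in the lecture apply the same ideas to a fixed set of stored episodes rather than to a stream of fresh ones; this sketch only illustrates the core estimation-versus-bootstrap distinction.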