Lecture

Multi-Armed Bandits: Regret and Exploration

In course

We discuss a set of topics that are important for the understanding of modern data science but that are typically not taught in an introductory ML course. In particular we discuss fundamental ideas an

Description

This lecture examines the concept of regret in multi-armed bandit problems and the trade-off between exploration and exploitation. The instructor explains how to compute the expected regret over time, emphasizing the role of the gap between the optimal arm and the suboptimal arms. The lecture covers how the time horizon affects decision-making and introduces concentration bounds for tail probabilities. The discussion extends to Gaussian random variables, moment generating functions, and the associated tail bound. The instructor highlights the challenges of balancing exploration and exploitation and their implications for real-world applications such as Internet advertising. The lecture closes with pointers to future topics, including information-theoretic concepts and practical extensions of bandit algorithms.

Instructors (2)
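To make the notion of expected regret concrete, here is a minimal Python sketch. It is not taken from the lecture: the explore-then-commit strategy, the unit-variance Gaussian rewards, and the names `explore_then_commit`, `means`, `horizon`, and `explore_rounds` are all assumptions chosen for illustration. The regret accumulated here is the pseudo-regret, that is, the sum over rounds of the gap between the best arm's mean and the mean of the arm actually pulled.

```python
import random


def explore_then_commit(means, horizon, explore_rounds, seed=0):
    """Pseudo-regret of a hypothetical explore-then-commit strategy.

    Assumptions (not from the lecture): rewards are Gaussian with the
    given means and unit variance; each arm is pulled `explore_rounds`
    times in round-robin order, then the arm with the best empirical
    mean is pulled for the rest of the horizon.
    """
    if explore_rounds < 1:
        raise ValueError("explore_rounds must be at least 1")
    rng = random.Random(seed)
    k = len(means)
    best_mean = max(means)
    totals = [0.0] * k   # sum of observed rewards per arm
    pulls = [0] * k      # number of pulls per arm
    committed = None     # arm chosen after the exploration phase
    regret = 0.0
    for t in range(horizon):
        if t < explore_rounds * k:
            arm = t % k  # exploration: cycle through the arms
        else:
            if committed is None:
                # exploitation: commit to the best empirical mean
                committed = max(range(k), key=lambda a: totals[a] / pulls[a])
            arm = committed
        reward = rng.gauss(means[arm], 1.0)
        totals[arm] += reward
        pulls[arm] += 1
        # the gap of the pulled arm is the regret paid in this round
        regret += best_mean - means[arm]
    return regret


if __name__ == "__main__":
    for m in (1, 10, 100):
        print(m, explore_then_commit([0.0, 0.5], horizon=10_000, explore_rounds=m))
```

Under these assumptions, a longer exploration phase pays a larger fixed cost up front but makes it less likely that the strategy commits to a suboptimal arm, which is why a sensible amount of exploration depends on both the gap and the time horizon.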

Michael Christoph Gastpar

Michael Gastpar is a (full) Professor at EPFL. From 2003 to 2011, he was a professor at the University of California at Berkeley, earning his tenure in 2008. He received his Dipl. El.-Ing. degree from ETH Zürich, Switzerland, in 1997 and his MS degree from the University of Illinois at Urbana-Champaign, IL, USA, in 1999. He defended his doctoral thesis at EPFL on Santa Claus day, 2002. He was also a (full) Professor at Delft University of Technology, The Netherlands. His research interests are in network information theory and related coding and signal processing techniques, with applications to sensor networks and neuroscience. He is a Fellow of the IEEE. He is the co-recipient of the 2013 Communications Society & Information Theory Society Joint Paper Award. He was an Information Theory Society Distinguished Lecturer (2009-2011). He won an ERC Starting Grant in 2010, an Okawa Foundation Research Grant in 2008, an NSF CAREER award in 2004, and the 2002 EPFL Best Thesis Award. He has served as an Associate Editor for Shannon Theory for the IEEE Transactions on Information Theory (2008-11), and as Technical Program Committee Co-Chair for the 2010 International Symposium on Information Theory, Austin, TX.

Rüdiger Urbanke

Rüdiger L. Urbanke obtained his Dipl. Ing. degree from the Vienna University of Technology, Austria, in 1990 and the M.Sc. and PhD degrees in Electrical Engineering from Washington University in St. Louis, MO, in 1992 and 1995, respectively. He held a position at the Mathematics of Communications Department at Bell Labs from 1995 to 1999 before becoming a faculty member at the School of Computer & Communication Sciences (I&C) of EPFL. He is a member of the Information Processing Group. He is principally interested in the analysis and design of iterative coding schemes, which allow reliable transmission close to theoretical limits at low complexity. Such schemes are part of most modern communications standards, including wireless transmission, optical communication and hard disk storage. More broadly, his research focuses on the analysis of graphical models and the application of methods from statistical physics to problems in communications. From 2000 to 2004 he was an Associate Editor of the IEEE Transactions on Information Theory, and he is currently on the board of the series "Foundations and Trends in Communications and Information Theory." In 2017 he was President of the Information Theory Society. From 2009 to 2012 he was the head of the I&C doctoral school, in 2013 he served as Dean a. i. of I&C, and since 2016 he has been the Associate Dean for teaching of I&C. He is a co-author of the book "Modern Coding Theory" published by Cambridge University Press.

Awards: 2021 IEEE Information Theory Society Paper Award; 2016 STOC Best Paper Award; 2014 La Polysphere Teaching Award; 2014 IEEE Hamming Medal; 2013 IEEE Information Theory Society Paper Award; 2011 MASCO Best Paper Award; 2011 IEEE Koji Kobayashi Award; 2009 La Polysphere Teaching Award; 2002 IEEE Information Theory Society Paper Award; Fulbright Scholarship.

His students have won the following awards: M. Mondelli, 2021 IEEE Information Theory Paper Award; M. Mondelli, EPFL Doctorate Award, 2018; M. Mondelli, Patrick Denantes Award, 2017; M. Mondelli, IEEE IT Society Student Paper Award at ISIT, 2015; M. Mondelli, Dan David Prize Scholarship, 2015; H. Hassani, Inaugural Thomas Cover Dissertation Award, 2014; S. Kudekar, 2013 & 2021 IEEE Information Theory Paper Award; A. Karbasi, Patrick Denantes Award, 2013; V. Venkatesan, Best Paper Award at MASCOTS, 2011; A. Karbasi, Best Student Paper Award at ICASSP, 2011 (with R. Parhizkar); A. Karbasi, Best Student Paper Award at ACM SIGMETRICS, 2010 (with S. Oh); S. Korada, ABB Dissertation Award, 2010; S. Korada, IEEE IT Society Student Paper Award at ISIT, 2009 (with E. Sasoglu); S. Korada, IEEE IT Society Student Paper Award at ISIT, 2008.

Ontological proximity

Statistics

Statistical inference: Mathematical statistics

Related lectures (32)

Probability Review

Introduces sub-Gaussian and sub-exponential random variables, conditional expectations, and Orlicz norms.

Linear Combinations: Moment Generating Functions

Explores moment generating functions, linear combinations, and the normality of random variables.

Gaussian Vectors: Properties and Distributions

Explains the properties of the multivariate Gaussian distribution and moment generating functions for random vectors.

Moment Generating Function and Multivariate Normal Distribution

Explores moment generating functions and multivariate normal distributions in probability and statistics.

Elements of Statistics: Exercises

Covers exercises on Bayes' theorem, moment generating functions, photon counts, disease probabilities, and properties of distributions.