Concept

Reinforcement learning

In artificial intelligence, and more precisely in machine learning, reinforcement learning consists, for an autonomous agent (a robot, a conversational agent, a character in a video game), in learning from experience which actions to take so as to optimize a quantitative reward over time. The agent is immersed in an environment and makes its decisions according to its current state. In return, the environment gives the agent a reward, which may be positive or negative. Through repeated experience, the agent seeks an optimal decision-making behavior (called a strategy or policy, a function that maps the current state to the action to execute), optimal in the sense that it maximizes the sum of rewards over time. Reinforcement learning is one of the three main techniques of machine learning, alongside supervised learning and unsupervised learning.

[Figure: Atari video games. Hessel et al. showed that reinforcement learning yields programs that outperform humans.]

[Figure: The game of Go. AlphaGo Zero learned to play through reinforcement learning.]

Reinforcement learning is used in several applications: robotics, resource management, helicopter flight, and chemistry. The method has been applied successfully to a variety of problems, such as robotic control, the inverted pendulum, task planning, telecommunications, backgammon, and chess. In 2015, Mnih et al. showed that reinforcement learning could produce a program that plays Atari games. Their system learns to play by receiving the screen pixels and the score as input. Notably, it has no access to the game's internal memory state (other than the score). In 2018, Hessel et al. combined several techniques to improve the program's performance.
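The agent-environment loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Corridor` environment, its reward values, and the two fixed policies are hypothetical and do not come from the source. The sketch shows a policy as a function from state to action, and the sum of rewards as the quantity used to compare policies.

```python
class Corridor:
    """Hypothetical toy environment: positions 0..4, with the goal at position 4."""

    def __init__(self):
        self.state = 0

    def step(self, action):
        # action is -1 (move left) or +1 (move right); the position is clipped to [0, 4].
        self.state = max(0, min(4, self.state + action))
        done = self.state == 4
        # The reward may be positive (goal reached) or negative (cost of each other step).
        reward = 1.0 if done else -0.1
        return self.state, reward, done


def run_episode(policy, max_steps=20):
    """Run one episode and return the sum of rewards collected by the policy."""
    env = Corridor()
    state, total = env.state, 0.0
    for _ in range(max_steps):
        action = policy(state)                   # the policy maps the current state to an action
        state, reward, done = env.step(action)   # the environment answers with a reward
        total += reward
        if done:
            break
    return total


def always_right(state):
    return 1


def always_left(state):
    return -1


if __name__ == "__main__":
    print(run_episode(always_right), run_episode(always_left))
```

In this toy setting, `always_right` collects a larger sum of rewards than `always_left`, which is the sense in which one policy is better than another. A learning agent would have to discover such a policy from experience rather than be handed it.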

Official source

https://fr.wikipedia.org/wiki/Apprentissage_par_renforcement

About this result

This page is generated automatically and may contain information that is not correct, complete, up to date, or relevant to your search. The same applies to every other page on this site. Be sure to verify the information against EPFL's official sources.

Related courses (32)

CS-456: Deep reinforcement learning

This course provides an overview and introduces modern methods for reinforcement learning (RL). The course starts with the fundamentals of RL, such as Q-learning, and delves into commonly used approac

CS-430: Intelligent agents

Software agents are widely used to control physical, economic and financial processes. The course presents practical methods for implementing software agents and multi-agent systems, supported by prog

EE-568: Reinforcement learning

This course describes theory and methods for Reinforcement Learning (RL), which revolves around decision making under uncertainty. The course covers classic algorithms in RL as well as recent algorith

Related publications (31)

Fusing Pre-existing Knowledge and Machine Learning for Enhanced Building Thermal Modeling and Control

Loris Di Natale

Buildings play a pivotal role in the ongoing worldwide energy transition, accounting for 30% of the global energy consumption. With traditional engineering solutions reaching their limits to tackle such large-scale problems, data-driven methods and Machine ...

EPFL, 2024

Towards practical reinforcement learning for tokamak magnetic control

Federico Alberto Alfredo Felici, Cristian Galperti, Jonas Buchli, Brendan Tracey

Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control. However, there are still significant drawbacks compared to traditional feedback control approaches for magnetic confinem ...

Elsevier Science SA, 2024

Social Opinion Formation and Decision Making Under Communication Trends

Ali H. Sayed, Virginia Bordignon, Mert Kayaalp

This work studies the learning process over social networks under partial and random information sharing. In traditional social learning models, agents exchange full belief information with each other while trying to infer the true state of nature. We stud ...

IEEE-Inst Electrical Electronics Engineers Inc, 2024

Related units (17)

Laboratoire de systèmes adaptatifs

IGM - Gestion

Laboratoire de systèmes d'information et d'inférence

Related lectures (29)

Deep learning agents: reinforcement learning

Explores deep learning agents in reinforcement learning, with a focus on neural network approximations and the challenges of training multi-agent systems.

Reinforcement learning: Q-learning

Covers Q-learning in reinforcement learning, exploring action values, policies, and the societal impact of algorithms.

Learning agents: exploration-exploitation tradeoff

Explores the exploration-exploitation tradeoff in learning the unknown effects of actions, using multi-armed bandits and Q-learning.

Related people (15)

Ali H. Sayed

Ali H. Sayed is Dean of the School of Engineering (STI) at EPFL, Switzerland, where he also directs the Adaptive Systems Laboratory. He was previously a distinguished professor and chair of the electrical engineering department at UCLA. He is recognized as a highly cited researcher and is a member of the US National Academy of Engineering. He is also a member of The World Academy of Sciences and served as president of the IEEE Signal Processing Society in 2018 and 2019. Professor Sayed has authored or co-authored more than 570 publications and six monographs. His research covers several areas, including adaptation and learning theories, data and network sciences, statistical inference, and multi-agent systems. His work has been recognized with several major awards, including the IEEE Fourier Award (2022), the Norbert Wiener Society Award (2020) and the Education Award (2015) from the IEEE Signal Processing Society, the Papoulis Award (2014) from the European Association for Signal Processing, the Meritorious Service Award (2013) and the Technical Achievement Award (2012) from the IEEE Signal Processing Society, the Terman Award (2005) from the American Society for Engineering Education, the Distinguished Lecturer Award (2005) from the IEEE Signal Processing Society, the Kuwait Prize (2003), and the IEEE Donald G. Fink Prize (1996). His publications have received several best paper awards from the IEEE (2002, 2005, 2012, 2014) and from EURASIP (2015). Ali H. Sayed is also a member of the IEEE, EURASIP, and the American Association for the Advancement of Science (AAAS), the publisher of the journal Science.

Volkan Cevher

Volkan Cevher received the B.Sc. (valedictorian) in electrical engineering from Bilkent University in Ankara, Turkey, in 1999 and the Ph.D. in electrical and computer engineering from the Georgia Institute of Technology in Atlanta, GA, in 2005. He was a Research Scientist with the University of Maryland, College Park, from 2006 to 2007 and with Rice University in Houston, TX, from 2008 to 2009. Currently, he is an Associate Professor at the Swiss Federal Institute of Technology Lausanne and a Faculty Fellow in the Electrical and Computer Engineering Department at Rice University. His research interests include machine learning, signal processing theory, optimization theory and methods, and information theory. Dr. Cevher is an ELLIS fellow and was the recipient of the Google Faculty Research Award in 2018, the IEEE Signal Processing Society Best Paper Award in 2016, a Best Paper Award at CAMSAP in 2015, a Best Paper Award at SPARS in 2009, and an ERC CG in 2016 as well as an ERC StG in 2011.

Wulfram Gerstner

Wulfram Gerstner is Director of the Laboratory of Computational Neuroscience (LCN) at EPFL. His research in computational neuroscience concentrates on models of spiking neurons and spike-timing-dependent plasticity, on the problem of neuronal coding in single neurons and populations, and on the link between biologically plausible learning rules and behavioral manifestations of learning. He teaches courses for physicists, computer scientists, mathematicians, and life scientists at EPFL. After studying physics in Tübingen and at the Ludwig-Maximilians-University Munich (Master 1989), Wulfram Gerstner spent a year as a visiting researcher in Berkeley. He received his PhD in theoretical physics from the Technical University Munich in 1993 with a thesis on associative memory and dynamics in networks of spiking neurons. After short postdoctoral stays at Brandeis University and the Technical University of Munich, he joined EPFL in 1996 as an assistant professor. Promoted to Associate Professor with tenure in February 2001, he has been a full professor since August 2006, with a double appointment in the School of Computer and Communication Sciences and the School of Life Sciences. Wulfram Gerstner has been an invited speaker at numerous international conferences and workshops. He has served on the editorial boards of the Journal of Neuroscience, Network: Computation in Neural Systems, the Journal of Computational Neuroscience, and Science.

Michael Herzog

Aude Billard

Colin Neil Jones

Colin Jones is an Associate Professor in the Automatic Control Laboratory at the Ecole Polytechnique Federale de Lausanne (EPFL) in Switzerland. He was a Senior Researcher at the Automatic Control Lab at ETH Zurich until 2011 and obtained a PhD in 2005 from the University of Cambridge for his work on polyhedral computational methods for constrained control. Prior to that, he was at the University of British Columbia in Canada, where he took a BASc and MASc in Electrical Engineering and Mathematics. Colin has worked in a variety of industrial roles, ranging from commercial building control to the development of custom optimization tools focusing on retail human resource scheduling. His current research interests are in the theory and computation of predictive control and optimization, and their application to green energy generation, distribution and management.

Devis Tuia

I come from Ticino and studied in Lausanne, between UNIL and EPFL. After my PhD at UNIL in remote sensing, I was a postdoc in Valencia (Spain), Boulder (CO), and at EPFL, working on model adaptation and prior knowledge integration in machine learning. In 2014 I became Research Assistant Professor at the University of Zurich, where I started the 'multimodal remote sensing' group. In 2017, I joined Wageningen University (NL), where I was professor of the GeoInformation Science and Remote Sensing Laboratory. In 2020, I joined EPFL Valais to start the ECEO lab, working at the interface between Earth observation, machine learning, and environmental sciences.

Maryam Kamgarpour

Maryam Kamgarpour holds a Doctor of Philosophy in Engineering from the University of California, Berkeley and a Bachelor of Applied Science from the University of Waterloo, Canada. Her research is on safe decision-making and control under uncertainty, game theory and mechanism design, and mixed integer and stochastic optimization and control. Her theoretical research is motivated by control challenges arising in intelligent transportation networks, robotics, power grid systems, and healthcare. She is the recipient of the NASA High Potential Individual Award, the NASA Excellence in Publication Award, and a European Union (ERC) Starting Grant.

Pascal Fua

Pascal Fua received an engineering degree from Ecole Polytechnique, Paris, in 1984 and the Ph.D. degree in Computer Science from the University of Orsay in 1989. He then worked at SRI International and INRIA Sophia-Antipolis as a Computer Scientist. He joined EPFL in 1996, where he is now a Professor in the School of Computer and Communication Science and heads the Computer Vision Laboratory. His research interests include shape modeling and motion recovery from images, analysis of microscopy images, Augmented Reality, and machine learning. He has (co)authored over 300 publications in refereed journals and conferences. He is an IEEE Fellow and has been an Associate Editor of the IEEE Transactions on Pattern Analysis and Machine Intelligence. He often serves as program committee member, area chair, and program chair of major vision conferences and has cofounded three spinoff companies (Pix4D, PlayfulVision, and NeuralConcept).

Alexandre Massoud Alahi

Alexandre Alahi is currently an Assistant Professor at EPFL. He spent five years at Stanford University as a Post-doc and Research Scientist after obtaining his Ph.D. from EPFL. His research enables machines to perceive the world and make decisions in the context of transportation problems and smart environments. He has worked on the theoretical challenges and practical applications of socially-aware Artificial Intelligence, i.e., systems equipped with perception and social intelligence. He was awarded the Swiss NSF early and advanced researcher grants for his work on predicting human social behavior. He won the CVPR Open Source Award (2012) for his work on Retina-inspired image descriptors, and the ICDSC Challenge Prize (2009) for his sparsity-driven algorithm that has tracked more than 100 million pedestrians to date. His research has been covered internationally by the BBC, ABC, PBS, Euronews, The Wall Street Journal, and other national news outlets around the world. Alexandre has also co-founded multiple startups, such as Visiosafe, and won several startup competitions. He was elected as one of the Top 20 Swiss Venture Leaders in 2010.

Related concepts (7)

Machine learning

Machine learning (also called artificial learning or statistical learning) is a field of study in artificial intelligence that relies on mathematical and statistical approaches to give computers the ability to "learn" from data, that is, to improve their performance on tasks without being explicitly programmed for each one. More broadly, it concerns the design, analysis, optimization, development, and implementation of such methods.

Q-learning

[Figure: In Q-learning, the agent executes an action a based on the state s and a function Q. It then perceives the new state s' and a reward r from the environment, and updates the function Q. The new state s' then becomes the state s, and learning continues.]

In artificial intelligence, and more precisely in machine learning, Q-learning is a reinforcement learning algorithm. It requires no initial model of the environment.
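The cycle described for Q-learning (choose a, observe s' and r, update Q, continue from s') can be sketched as a small tabular example. The page does not give the update formula, so this sketch assumes the standard rule Q(s, a) ← Q(s, a) + α (r + γ max over a' of Q(s', a') − Q(s, a)). The five-state corridor, the reward of 1 at the goal, the epsilon-greedy exploration, and the values of `alpha`, `gamma`, and `epsilon` are all illustrative assumptions.

```python
import random


def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical corridor; the goal is the last state."""
    rng = random.Random(seed)
    actions = (-1, 1)  # move left or right
    goal = n_states - 1
    Q = {(s, a): 0.0 for s in range(n_states) for a in actions}

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy choice: explore at random, otherwise act greedily (ties broken at random).
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                best = max(Q[(s, b)] for b in actions)
                a = rng.choice([b for b in actions if Q[(s, b)] == best])

            # The environment returns the new state s' and the reward r; no model is used by the agent.
            s_next = max(0, min(goal, s + a))
            r = 1.0 if s_next == goal else 0.0

            # Update Q towards r + gamma * max_a' Q(s', a'); the goal state has no future value.
            best_next = 0.0 if s_next == goal else max(Q[(s_next, b)] for b in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

            s = s_next  # s' becomes s, and learning continues
    return Q


if __name__ == "__main__":
    Q = q_learning()
    for s in range(4):
        print(s, round(Q[(s, -1)], 3), round(Q[(s, 1)], 3))
```

After training, the greedy action in every non-goal state of this toy problem is to move right. The agent never reads the transition rule directly; it only sees the states and rewards returned after each action, which is what the absence of an environment model means here.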

Intelligent agent

In artificial intelligence, an intelligent agent (IA) is an autonomous entity able to perceive its environment through sensors and to act on it through effectors in order to achieve goals. An intelligent agent may also learn or use knowledge to achieve its goals. Such agents may be simple or complex. For example, a simple reactive system such as a thermostat is considered an intelligent agent.
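The thermostat example can be written as a one-function reactive agent: a sensor reading goes in, an effector command comes out. The function name, the setpoint, the tolerance band, and the command strings below are hypothetical choices made for illustration only.

```python
def thermostat_agent(temperature_c, setpoint_c=20.0, tolerance_c=0.5):
    """Simple reactive agent: map a sensor reading (percept) to an effector command (action)."""
    if temperature_c < setpoint_c - tolerance_c:
        return "heat_on"    # too cold: switch the heating on
    if temperature_c > setpoint_c + tolerance_c:
        return "heat_off"   # too warm: switch the heating off
    return "hold"           # within the tolerance band: leave the heating as it is


if __name__ == "__main__":
    for reading in (18.0, 20.2, 22.0):
        print(reading, thermostat_agent(reading))
```

This agent neither learns nor stores knowledge: the same reading always produces the same command, which is what makes it a purely reactive system.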

Related MOOCs (10)

Neuronal Dynamics - Computational Neuroscience of Single Neurons

The activity of neurons in the brain and the code used by these neurons is described by mathematical neuron models at different levels of detail.

Neuronal Dynamics 2- Computational Neuroscience: Neuronal Dynamics of Cognition

This course explains the mathematical and computational models that are used in the field of theoretical neuroscience to analyze the collective dynamics of thousands of interacting neurons.
