Actor-Critic Architecture and Advantage-Actor-Critic
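As a companion to the lecture topic, here is a minimal sketch of one advantage-actor-critic (A2C) update on a tiny discrete MDP, using the TD error as the advantage estimate. All names, sizes, and learning rates are illustrative assumptions, not taken from the lecture material itself.

```python
import numpy as np

# Illustrative tabular setup: a 4-state, 2-action problem (assumed sizes).
n_states, n_actions = 4, 2
theta = np.zeros((n_states, n_actions))  # actor: policy logits
v = np.zeros(n_states)                   # critic: state values
gamma, lr = 0.99, 0.1

def policy(s):
    """Softmax policy over the logits for state s."""
    z = np.exp(theta[s] - theta[s].max())
    return z / z.sum()

def a2c_update(s, a, r, s_next, done):
    """One actor-critic step on a single transition (s, a, r, s')."""
    target = r + (0.0 if done else gamma * v[s_next])
    advantage = target - v[s]            # A(s, a) ≈ r + γ V(s') − V(s)
    v[s] += lr * advantage               # critic: TD(0) update
    grad_log_pi = -policy(s)             # ∇_θ log π(a|s): −π for all actions...
    grad_log_pi[a] += 1.0                # ...plus 1 for the action taken
    theta[s] += lr * advantage * grad_log_pi  # actor: policy-gradient step
    return advantage

# Usage: a positively rewarded transition should raise the critic's value
# for the state and make the taken action more probable.
p_before = policy(0)[1]
a2c_update(s=0, a=1, r=1.0, s_next=2, done=False)
assert v[0] > 0 and policy(0)[1] > p_before
```

The key point the sketch illustrates: the critic's TD error serves as the advantage, scaling the actor's log-probability gradient, so actions that outperform the current value estimate become more likely.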
Related lectures (29)
Reinforcement Learning: Policy Gradient and Actor-Critic Methods
Provides an overview of reinforcement learning, focusing on policy gradient and actor-critic methods for deep artificial neural networks.
Introduction to Reinforcement Learning: Concepts and Applications
Introduces reinforcement learning, covering its concepts, applications, and key algorithms.
Reinforcement Learning: Q-Learning
Covers Q-Learning in reinforcement learning, exploring action values, policies, and the societal impact of algorithms.
Continuous Reinforcement Learning: Advanced Machine Learning
Explores continuous-state reinforcement learning challenges, value function estimation, policy gradients, and Policy learning by Weighted Exploration.
Principled Reinforcement Learning with Human Feedback
Delves into Reinforcement Learning with Human Feedback, discussing convergence of estimators and introducing a pessimistic approach for improved performance.
Policy Gradient Methods in Reinforcement Learning
Covers policy gradient methods in reinforcement learning, focusing on optimization techniques and practical applications like the cartpole problem.
TD Learning: Temporal Difference Learning
Covers Temporal Difference Learning, V-values, state-values, and TD methods in reinforcement learning.
Reinforcement Learning: TD Learning and SARSA Variants
Discusses reinforcement learning, focusing on temporal difference learning and SARSA algorithm variations.
Deep Learning Agents: Reinforcement Learning
Explores Deep Learning Agents in Reinforcement Learning, emphasizing neural network approximations and challenges in training multiagent systems.
Linear Programming Techniques in Reinforcement Learning
Covers the linear programming approach to reinforcement learning, focusing on its applications and advantages in solving Markov decision processes.