Summary
AlphaZero is a computer program developed by the artificial intelligence research company DeepMind to master the games of chess, shogi, and Go. The algorithm uses an approach similar to that of AlphaGo Zero. On December 5, 2017, the DeepMind team released a preprint introducing AlphaZero, which within 24 hours of training achieved a superhuman level of play in all three games, defeating the world-champion programs Stockfish and Elmo and the three-day version of AlphaGo Zero. In each case it made use of custom tensor processing units (TPUs) that the Google programs were optimized to use.
AlphaZero was trained solely via self-play, with no access to opening books or endgame tables, using 5,000 first-generation TPUs to generate the games and 64 second-generation TPUs to train the neural networks, all in parallel. After four hours of training, DeepMind estimated that AlphaZero was playing chess at a higher Elo rating than Stockfish 8; after nine hours of training, the algorithm defeated Stockfish 8 in a time-controlled 100-game tournament (28 wins, 0 losses, and 72 draws). The trained algorithm played on a single machine with four TPUs.
DeepMind's paper on AlphaZero was published in the journal Science on December 7, 2018; however, the AlphaZero program itself has not been made available to the public. In 2019, DeepMind published a paper detailing MuZero, a new algorithm that generalizes AlphaZero's approach, playing both Atari and board games without knowledge of the rules or representations of the game.
AlphaZero (AZ) is a more generalized variant of the AlphaGo Zero (AGZ) algorithm and is able to play shogi and chess as well as Go. Differences between AZ and AGZ include the following. AZ has hard-coded rules for setting search hyperparameters. The neural network is updated continually. AZ does not use symmetries, unlike AGZ. Chess can end in a draw, unlike Go, so AlphaZero takes the possibility of a drawn game into account.
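The 100-game result against Stockfish 8 quoted above (28 wins, 0 losses, 72 draws) can be turned into a score fraction and a rough rating gap. The sketch below assumes the conventional scoring of half a point per draw and the standard logistic Elo model; the function names are illustrative, and the resulting figure is a back-of-the-envelope estimate from a single match, not a number reported by DeepMind.

```python
import math


def score_fraction(wins: int, draws: int, losses: int) -> float:
    """Fraction of available points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games


def elo_gap(score: float) -> float:
    """Rating difference implied by a score fraction under the logistic Elo model.

    Assumes 0 < score < 1; a perfect or zero score has no finite gap.
    """
    return 400.0 * math.log10(score / (1.0 - score))


# Match result quoted in the summary: 28 wins, 72 draws, 0 losses.
score = score_fraction(wins=28, draws=72, losses=0)
gap = elo_gap(score)
print(f"score = {score:.2f}, implied Elo gap = {gap:.0f}")
```

Under these assumptions the match corresponds to a score of 0.64, or a gap of roughly 100 Elo points at the time controls used for that tournament.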
AlphaZero's Monte Carlo tree search examines just 80,000 positions per second in chess and 40,000 in shogi, compared to 70 million for Stockfish and 35 million for Elmo.
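To put these search rates in proportion, the short sketch below divides each conventional engine's rate by AlphaZero's. The figures are the ones quoted above; the dictionary layout and the name `search_ratio` are illustrative choices only.

```python
# Positions searched per second, as quoted in the text above.
RATES = {
    "chess": {"alphazero": 80_000, "opponent": 70_000_000},  # opponent: Stockfish
    "shogi": {"alphazero": 40_000, "opponent": 35_000_000},  # opponent: Elmo
}


def search_ratio(game: str) -> float:
    """How many times more positions per second the opponent engine searches."""
    entry = RATES[game]
    return entry["opponent"] / entry["alphazero"]


for game in RATES:
    print(f"{game}: opponent searches {search_ratio(game):.0f}x more positions per second")
```

In both games the conventional engine examines 875 times as many positions per second, which is the sense in which AlphaZero's search is far more selective.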
Related concepts (12)
Stockfish (chess)
Stockfish is a free and open-source chess engine, available for various desktop and mobile platforms. It can be used in chess software through the Universal Chess Interface. Stockfish has consistently ranked first or near the top of most chess-engine rating lists and, as of April 2023, is the strongest CPU chess engine in the world. Its estimated Elo rating is around 3550 (CCRL 40/15). It has won the Top Chess Engine Championship 14 times and the Chess.com Computer Chess Championship 19 times.
Chess.com
Chess.com is an internet chess server and social networking website. The site has a freemium model in which some features are available for free, and others are available for accounts with subscriptions. Live online chess can be played against other users in daily, rapid, blitz or bullet time controls, with a number of chess variants also available. Chess versus a chess engine, computer analysis, chess puzzles and teaching resources are also offered.
Deep reinforcement learning
Deep reinforcement learning (deep RL) is a subfield of machine learning that combines reinforcement learning (RL) and deep learning. RL considers the problem of a computational agent learning to make decisions by trial and error. Deep RL incorporates deep learning into the solution, allowing agents to make decisions from unstructured input data without manual engineering of the state space. Deep RL algorithms are able to take in very large inputs (e.g.
Related courses (3)
CS-456: Artificial neural networks/reinforcement learning
Since 2010 approaches in deep learning have revolutionized fields as diverse as computer vision, machine learning, or artificial intelligence. This course gives a systematic introduction into influent
ME-390: Foundations of artificial intelligence
This course provides the students with basic theory to understand the machine learning approach, and the tools to use the approach for problems arising in engineering applications.
CS-430: Intelligent agents
Software agents are widely used to control physical, economic and financial processes. The course presents practical methods for implementing software agents and multi-agent systems, supported by prog
Related lectures (15)
Model-Based Deep RL: Planning and VAST
Covers model-based reinforcement learning, planning, variational state tabulation, and efficient Q- and V-values updating.
Hand Pose Estimation
Covers hand pose estimation, regression techniques, and the evolution of image classification models from LeNet to VGG19.
Reinforcement Learning: Q-Learning
Introduces Q-Learning, Deep Q-Learning, REINFORCE algorithm, and Monte-Carlo Tree Search in reinforcement learning, culminating in AlphaGo Zero.