JoeRRPhillips/alpha_zero

alpha_zero

This is my own implementation of DeepMind's AlphaZero, using TensorFlow 2.0.

Usage:
• Clone this repository
• cd alpha_zero
• Run: python training/train_connect_four.py

Neural Network Architecture:
• ResNet backbone to encode game state.
• ActorHead to approximate policy function.
• CriticHead to approximate value function.
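
The three components above can be sketched as a single two-output Keras model. This is a minimal illustration, not the repository's actual code: the function names (residual_block, build_network), the input encoding (a 6x7 board with 2 planes), and every layer size are assumptions chosen for the example.

```python
import tensorflow as tf
from tensorflow.keras import layers


def residual_block(x, filters):
    """Two 3x3 convolutions with a skip connection (hypothetical sizes)."""
    skip = x
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([skip, x])
    return layers.ReLU()(x)


def build_network(rows=6, cols=7, planes=2, filters=64, blocks=4):
    """ResNet backbone feeding an actor (policy) head and a critic (value) head."""
    inputs = tf.keras.Input(shape=(rows, cols, planes))

    # Backbone: encode the game state.
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    for _ in range(blocks):
        x = residual_block(x, filters)

    # Actor head: one logit per column (one per possible move).
    p = layers.Conv2D(2, 1, use_bias=False)(x)
    p = layers.BatchNormalization()(p)
    p = layers.ReLU()(p)
    p = layers.Flatten()(p)
    policy_logits = layers.Dense(cols, name="policy_logits")(p)

    # Critic head: a scalar value in [-1, 1].
    v = layers.Conv2D(1, 1, use_bias=False)(x)
    v = layers.BatchNormalization()(v)
    v = layers.ReLU()(v)
    v = layers.Flatten()(v)
    v = layers.Dense(64, activation="relu")(v)
    value = layers.Dense(1, activation="tanh", name="value")(v)

    return tf.keras.Model(inputs, [policy_logits, value])
```

Sharing one backbone between both heads means a single forward pass per MCTS leaf yields both the move prior and the position value.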

Training Method:
• Run Monte Carlo Tree Search simulations to select each action (move) in the game.
• Update the policy function approximator towards the MCTS node visit counts.
• Update the value function approximator towards the real Monte Carlo return (reward) gained at the end of an episode (game).
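
The two update targets above can be sketched in plain Python. The helper names (policy_target, value_targets, alphazero_loss) and the temperature parameter are hypothetical, and the loss follows the form given in the AlphaZero paper without its weight regularization term; whether this repository uses exactly this loss is an assumption.

```python
import math


def policy_target(visit_counts, temperature=1.0):
    """Turn MCTS visit counts into a probability distribution over moves."""
    if temperature == 0:
        # Greedy limit: all mass on the most-visited move.
        best = max(range(len(visit_counts)), key=visit_counts.__getitem__)
        return [1.0 if i == best else 0.0 for i in range(len(visit_counts))]
    scaled = [n ** (1.0 / temperature) for n in visit_counts]
    total = sum(scaled)
    return [s / total for s in scaled]


def value_targets(players_to_move, winner):
    """Label every position of a finished game with its final outcome.

    players_to_move: +1 or -1 for the player to move at each position.
    winner: +1 or -1 for the winning player, 0 for a draw.
    Each target is seen from the perspective of the player to move.
    """
    if winner == 0:
        return [0.0 for _ in players_to_move]
    return [1.0 if p == winner else -1.0 for p in players_to_move]


def alphazero_loss(policy_probs, value_pred, pi, z):
    """Squared value error plus cross-entropy between pi and the policy."""
    cross_entropy = -sum(t * math.log(p + 1e-12) for t, p in zip(pi, policy_probs))
    return (z - value_pred) ** 2 + cross_entropy
```

For example, policy_target([10, 30, 60]) spreads the policy target in proportion to how often each move was visited, and value_targets([1, -1, 1], winner=1) labels the winner's positions +1 and the loser's position -1.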

Comments:
• The current implementation runs on a custom Connect Four environment.
• The environment design can be made more efficient if needed.
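
For readers unfamiliar with the game, a Connect Four environment can be sketched as follows. This is an illustrative stand-in, not the repository's environment: the class name, the method names, and the reward convention (1.0 to the player who completes four in a row, 0.0 otherwise) are all assumptions.

```python
ROWS, COLS = 6, 7


class ConnectFour:
    """Minimal two-player Connect Four; players are +1 and -1."""

    def __init__(self):
        # board[0] is the bottom row; 0 marks an empty cell.
        self.board = [[0] * COLS for _ in range(ROWS)]
        self.player = 1

    def legal_actions(self):
        """Columns whose top cell is still empty."""
        return [c for c in range(COLS) if self.board[ROWS - 1][c] == 0]

    def step(self, col):
        """Drop a piece into col (must be legal); return (reward, done)."""
        row = next(r for r in range(ROWS) if self.board[r][col] == 0)
        self.board[row][col] = self.player
        won = self._wins(row, col)
        done = won or not self.legal_actions()
        self.player = -self.player
        return (1.0 if won else 0.0), done

    def _wins(self, row, col):
        """Check the four line directions through the piece just placed."""
        p = self.board[row][col]
        for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
            count = 1
            for sign in (1, -1):
                r, c = row + sign * dr, col + sign * dc
                while 0 <= r < ROWS and 0 <= c < COLS and self.board[r][c] == p:
                    count += 1
                    r, c = r + sign * dr, c + sign * dc
            if count >= 4:
                return True
        return False
```

Checking only the lines through the most recent piece keeps the win test cheap, which matters because MCTS calls step many times per move.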
