alpha_zero

This is my own implementation of DeepMind's AlphaZero, using TensorFlow 2.0.

Usage:
• Clone this repository
• cd alpha_zero
• Run: python training/train_connect_four.py

Neural Network Architecture:
• ResNet backbone to encode game state.
• ActorHead to approximate policy function.
• CriticHead to approximate value function.
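
The three components above could be wired together roughly as follows. This is a minimal sketch using tf.keras, not the code in this repository: the function names (residual_block, build_network), the 6x7x2 input encoding, the filter count, and the number of residual blocks are all assumptions chosen for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers


def residual_block(x, filters):
    """Two 3x3 convolutions with a skip connection (hypothetical helper)."""
    skip = x
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Add()([skip, x])
    return layers.ReLU()(x)


def build_network(rows=6, cols=7, planes=2, filters=64, num_blocks=4):
    """Shared ResNet backbone with an actor head and a critic head.

    All sizes are assumed defaults for Connect4, not values from the repo.
    """
    board = layers.Input(shape=(rows, cols, planes))

    # Backbone: encode the board into a feature map.
    x = layers.Conv2D(filters, 3, padding="same", use_bias=False)(board)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    for _ in range(num_blocks):
        x = residual_block(x, filters)

    # Actor head: one logit per column (policy function).
    p = layers.Conv2D(2, 1, activation="relu")(x)
    p = layers.Flatten()(p)
    policy_logits = layers.Dense(cols, name="policy_logits")(p)

    # Critic head: scalar in [-1, 1] (value function).
    v = layers.Conv2D(1, 1, activation="relu")(x)
    v = layers.Flatten()(v)
    v = layers.Dense(64, activation="relu")(v)
    value = layers.Dense(1, activation="tanh", name="value")(v)

    return tf.keras.Model(inputs=board, outputs=[policy_logits, value])
```

Calling build_network() and passing a batch of shape (batch, 6, 7, 2) returns policy logits of shape (batch, 7) and values of shape (batch, 1). Sharing the backbone means both heads learn from the same board features.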

Training Method:
• Run Monte Carlo Tree Search simulations to select each action (move) in the game.
• Update the policy function approximator towards the MCTS node visit counts.
• Update the value function approximator towards the real Monte Carlo return (reward) gained at the end of an episode (game).
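
The two update targets above can be written as a single loss per position. The sketch below uses NumPy and assumes the usual AlphaZero formulation (cross-entropy against normalised visit counts plus squared error against the final return). The loss actually used in this repository may differ, and the names policy_target and alphazero_loss are hypothetical.

```python
import numpy as np


def policy_target(visit_counts, temperature=1.0):
    """Turn MCTS root visit counts into a probability distribution.

    Assumes at least one visit, which holds after any MCTS simulation.
    """
    counts = np.asarray(visit_counts, dtype=np.float64)
    scaled = counts ** (1.0 / temperature)
    return scaled / scaled.sum()


def alphazero_loss(policy_logits, value_pred, visit_counts, final_return):
    """Cross-entropy towards visit counts plus squared error towards the return."""
    target_pi = policy_target(visit_counts)

    # Numerically stable log-softmax of the network's policy logits.
    logits = np.asarray(policy_logits, dtype=np.float64)
    shifted = logits - logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())

    policy_loss = -np.sum(target_pi * log_probs)
    value_loss = (final_return - value_pred) ** 2
    return policy_loss + value_loss
```

For example, policy_target([10, 30, 60]) gives [0.1, 0.3, 0.6]. The loss is smallest when the network's policy matches the visit distribution and its value prediction equals the game's final return.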

Comments:
• Current implementation is on a custom Connect4 environment.
• Environment design can be made more efficient if needed.
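
One common way to make a Connect4 environment cheaper is a bitboard, where each player's pieces are stored in a single integer and a win is detected with a few shifts. The sketch below is only an illustration of that idea. It is not how this repository's environment is written, and the class name BitboardConnect4 and its methods are hypothetical.

```python
ROWS, COLS = 6, 7
STRIDE = ROWS + 1  # one always-empty sentinel bit per column prevents wrap-around


def has_four(bits):
    """Return True if the bitmask contains four aligned pieces."""
    # Shifts: 1 = vertical, STRIDE = horizontal, STRIDE -/+ 1 = the two diagonals.
    for shift in (1, STRIDE, STRIDE - 1, STRIDE + 1):
        pairs = bits & (bits >> shift)
        if pairs & (pairs >> (2 * shift)):
            return True
    return False


class BitboardConnect4:
    """Minimal Connect4 state: one bitmask per player plus column heights."""

    def __init__(self):
        self.boards = [0, 0]
        self.heights = [0] * COLS  # next free row in each column
        self.player = 0

    def legal_moves(self):
        return [c for c in range(COLS) if self.heights[c] < ROWS]

    def play(self, col):
        """Drop a piece in col for the current player; return True if it wins."""
        if self.heights[col] >= ROWS:
            raise ValueError(f"column {col} is full")
        self.boards[self.player] |= 1 << (col * STRIDE + self.heights[col])
        self.heights[col] += 1
        won = has_four(self.boards[self.player])
        self.player ^= 1  # switch to the other player
        return won
```

Because the state is two integers and a short list, copying it during tree search is cheap, which matters when MCTS runs many simulations per move.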

JoeRRPhillips/alpha_zero