A Connect Four AI that learns from scratch via self-play deep reinforcement learning.
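To make "self-play" concrete, here is a minimal sketch of how a single self-play game could produce training examples. It is not the code in this repository: every name in it (`new_board`, `legal_moves`, `drop`, `wins`, `self_play_game`, `random_policy`) is hypothetical, and a uniformly random policy stands in for the neural network that a real training loop would query and then update.

```python
import random

ROWS, COLS = 6, 7  # standard Connect Four board size

def new_board():
    # 0 = empty, 1 = first player, -1 = second player
    return [[0] * COLS for _ in range(ROWS)]

def legal_moves(board):
    # A column is playable while its top cell is still empty.
    return [c for c in range(COLS) if board[0][c] == 0]

def drop(board, col, player):
    # Place the piece in the lowest empty row of the column; return that row.
    for r in range(ROWS - 1, -1, -1):
        if board[r][col] == 0:
            board[r][col] = player
            return r
    raise ValueError("column is full")

def wins(board, row, col):
    # Count connected pieces through the last move in all four line directions.
    player = board[row][col]
    for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
        count = 1
        for sign in (1, -1):
            r, c = row + sign * dr, col + sign * dc
            while 0 <= r < ROWS and 0 <= c < COLS and board[r][c] == player:
                count += 1
                r, c = r + sign * dr, c + sign * dc
        if count >= 4:
            return True
    return False

def random_policy(board, moves, rng):
    # Assumption: placeholder for a learned policy (e.g. a network plus search).
    return rng.choice(moves)

def self_play_game(policy, rng):
    # Both sides are played by the same policy.
    board, player, winner, history = new_board(), 1, 0, []
    while True:
        moves = legal_moves(board)
        if not moves:
            break  # board is full: draw
        history.append(([row[:] for row in board], player))
        col = policy(board, moves, rng)
        row = drop(board, col, player)
        if wins(board, row, col):
            winner = player
            break
        player = -player
    # Label each position from the mover's point of view:
    # +1 if the mover eventually won, -1 if they lost, 0 for a draw.
    return [(state, mover, winner * mover) for state, mover in history]

examples = self_play_game(random_policy, random.Random(0))
print(len(examples), "positions recorded")
```

In a full training loop, the recorded (position, mover, outcome) triples would be used to fit the model, and the improved model would then replace the policy for the next generation of games.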
Training:
python3 train.py
Running:
python3 main.py
Results (~9 hours of training):
Generation 0 vs. Optimal Player: (figure not included)
Generation 5 vs. Optimal Player: (figure not included)
Generation 10 vs. Optimal Player: (figure not included)