David J. Wu, Accelerating Self-Play Learning in Go

David Silver et al., Mastering the game of Go without human knowledge

Ivo Danihelka et al., Policy Improvement by Planning with Gumbel

Brian Lee et al., Minigo: A Case Study in Reproducing Reinforcement Learning Research

Alexander Trudeau, Michael Bowling, Targeted Search Control in AlphaZero for Effective Policy Improvement

This list is not exhaustive.