About using different players for training game generation #56
Comments
Hi @eddh
Though I also enabled sharing tree search information by If I had rich computational resources, it might be better to separate the players completely, because that brings a little randomness.
Thank you for the answer. I have been curious about this, but maybe it has less of an effect than I expected. Did you run tests on reusing tree information and the effect it has on the effectiveness of Dirichlet noise? In other related projects, the consensus seems to be that it does make the Dirichlet noise less effective, but that as long as it doesn't completely prevent the discovery of new moves, the speed boost is worth the cost.
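To make the concern concrete, here is a minimal sketch of AlphaZero-style root noise, assuming the usual mixing rule `(1 - epsilon) * prior + epsilon * Dirichlet(alpha)`. The function names and the values `epsilon=0.25` and `alpha=0.5` are illustrative assumptions, not taken from reversi-alpha-zero or leela-zero. The second helper is only a rough proxy for the dilution: when a reused root arrives with visits from the previous search, only the new simulations are guided by the noisy priors.

```python
import random

def sample_dirichlet(alpha, size, rng):
    """Draw one symmetric Dirichlet(alpha) sample by normalizing gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def add_dirichlet_noise(priors, rng, epsilon=0.25, alpha=0.5):
    """Mix noise into the root priors: (1 - epsilon) * p + epsilon * noise."""
    noise = sample_dirichlet(alpha, len(priors), rng)
    return [(1.0 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]

def noisy_visit_fraction(inherited_visits, new_simulations):
    """Share of the root's visits that were made after the noise was applied."""
    return new_simulations / (inherited_visits + new_simulations)

rng = random.Random(0)              # fixed seed, for reproducibility only
priors = [0.90, 0.05, 0.05]         # hypothetical network policy at the root
noisy = add_dirichlet_noise(priors, rng)

# Fresh root: every simulation is guided by the noisy priors.
fresh_root = noisy_visit_fraction(inherited_visits=0, new_simulations=100)
# Reused root: 300 visits were inherited from the previous search.
reused_root = noisy_visit_fraction(inherited_visits=300, new_simulations=100)
print(noisy, fresh_root, reused_root)
```

With these made-up numbers, a reused root has only a quarter of its visits influenced by the noise, which is one way to read the "less effective, but still worth the speed boost" trade-off.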
I tested reusing tree information and checked the moves. This is a slightly different topic, but draws are possible in Reversi, and I was annoyed by the "many draw games (80~90%)" problem. Reusing tree information tends to cause that problem.
Maybe that is just the nature of the game of Reversi; see https://en.wikipedia.org/wiki/Computer_Othello, the "Othello 8 x 8" section. That said, even if enough randomness is guaranteed during training, play will eventually lead to a draw. As said in that link:
Is your model playing the diagonal opening or the perpendicular opening?
Several openings, including diagonal and perpendicular, were played.
I see. Looking forward to a solution being found~
So I have a question that is related to another similar project for Go: https://github.com/gcp/leela-zero
In that project, self-play games are generated by a single player playing against itself. So Black and White have the same random seed, and share a search tree through tree reuse.
If I'm reading the code right, in reversi-alpha-zero two independent players are used to generate self-play games, each with its own separate search tree and a different random seed.
I am very curious about the effects of these two ways of doing it. What have your results been?
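The two setups described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not code from either project: `Player`, `record_search`, `shared_player_setup` and `separate_player_setup` are hypothetical names, and the "tree" is just a dictionary of visit counts.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Player:
    """Hypothetical self-play agent: its own RNG plus its own search statistics."""
    rng: random.Random
    tree: dict = field(default_factory=dict)  # position key -> visit count

    def record_search(self, position):
        # Stand-in for a real tree search: just count a visit to the position.
        self.tree[position] = self.tree.get(position, 0) + 1

def shared_player_setup(seed):
    """One player on both sides: same seed, same tree (tree reuse)."""
    player = Player(random.Random(seed))
    return player, player

def separate_player_setup(seed_black, seed_white):
    """Two independent players: different seeds, separate trees."""
    return Player(random.Random(seed_black)), Player(random.Random(seed_white))

black, white = shared_player_setup(seed=1)
black.record_search("start")
shared_sees_it = "start" in white.tree       # White inherits Black's search

black2, white2 = separate_player_setup(seed_black=1, seed_white=2)
black2.record_search("start")
separate_sees_it = "start" in white2.tree    # White starts from an empty tree
```

In the shared setup, whatever one side searched is already in the tree when the other side moves. In the separate setup, each side searches from scratch, which costs more computation but keeps the two sides' randomness independent.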