
sample N games at one time in replay_buffer #117

Conversation

mokemokechicken
Contributor

When there are many games (over about 2000) in the replay_buffer and a large batch_size (for example, over 256), the CPU usage of ReplayBuffer is nearly 100% in my environment.
The slow get_batch() makes training slow.

Sampling the games at one time reduces the CPU usage to around 50% and keeps training fast.

@logar16

logar16 commented Jan 15, 2021 via email

@mokemokechicken
Contributor Author

Hi @logar16,
thank you for your reply!

> Can you give metrics comparing before and after values?

Ok, I will show the metrics.

> How does this change affect the overall speed of training?

In the current implementation, sample_game() is called batch_size times in one get_batch().
Inside sample_game(), this code may be slow when len(self.buffer) is large:

            game_probs = numpy.array(
                [game_history.game_priority for game_history in self.buffer.values()],
                dtype="float32",
            )

In the patched implementation, this code runs only once per get_batch().
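The idea can be sketched roughly like this (a minimal sketch, not the exact patch; `sample_n_games`, the `buffer` layout, and the return shape are assumptions based on the snippet above):

```python
import numpy


def sample_n_games(buffer, n_games, rng=numpy.random):
    """Draw n_games games at once, weighted by game_priority.

    `buffer` maps game_id -> game_history; `game_priority` is assumed
    to be a positive float on each game_history, as in the snippet above.
    """
    game_ids = list(buffer.keys())
    # Built once per get_batch() instead of batch_size times.
    game_probs = numpy.array(
        [game_history.game_priority for game_history in buffer.values()],
        dtype="float32",
    )
    # Normalize in float64 so numpy.random.choice's sum-to-1 check passes.
    game_probs = game_probs.astype("float64")
    game_probs /= game_probs.sum()
    # One vectorized draw replaces batch_size separate sample_game() calls.
    selected = rng.choice(len(game_ids), n_games, p=game_probs)
    return [(game_ids[i], buffer[game_ids[i]], game_probs[i]) for i in selected]
```

Building `game_probs` is O(len(buffer)), so doing it once per batch instead of once per sample removes a factor of batch_size from that cost.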


This is not conclusive evidence, but the graph below shows how training slows down over time.
In the patched implementation, this slowdown no longer occurs.

[image: graph of training speed slowing down over time]

Later, I will report a comparison of the two implementations under the same conditions.

@mokemokechicken
Copy link
Contributor Author

I ran connect4 with the following environment and config.

  • Environment
    • GeForce GTX 1080
    • 8 CPUs, 64 GB memory
  • Config (changed values only)
    • num_simulations = 2 # for fast game generation
    • batch_size = 256
    • num_unroll_steps = 5
    • replay_buffer_size = 10000 # (original)

Red is before, blue is after.
After about 1k-1.5k LogSteps, the red run's training became slower (by about 30-50% in this case).
The ReplayBuffer size at around 1k-1.5k LogSteps is about 2k-3k.

[image: graph comparing training and self-play speed, before (red) vs. after (blue)]

※ I don't know why both self-plays became slower after 1.5k~2k LogSteps...

@ahainaut
Copy link
Collaborator

ahainaut commented Feb 9, 2021

@mokemokechicken
Thank you for this great feature!
It looks good to me.

@ahainaut ahainaut merged commit 97e4931 into werner-duvaud:master Feb 9, 2021
egafni pushed a commit to egafni/muzero-general that referenced this pull request Apr 15, 2021
…ple_n_games_at_one_time_in_get_batch

sample N games at one time in replay_buffer
EpicLiem pushed a commit to EpicLiem/muzero-general-chess-archive that referenced this pull request Feb 4, 2023
…ple_n_games_at_one_time_in_get_batch

sample N games at one time in replay_buffer