Improve the Arena allocator to reduce memory fragmentation #916
Merged
rongou added labels on Nov 12, 2021: 2 - In Progress (currently a work in progress), non-breaking (non-breaking change), 5 - DO NOT MERGE (hold off on merging; see PR for details), improvement (improvement / enhancement to an existing function), cpp (pertains to C++ code)
jrhemstad
reviewed
Nov 12, 2021
Please use draft PRs instead of "WIP" titles.
rerun tests
rongou requested review from jrhemstad and harrism, and removed request for cwharris and codereport (December 14, 2021 18:42)
rerun tests
@harrism @jrhemstad I think this is ready to be merged. Please take another look. Thanks!
jrhemstad
approved these changes
Jan 10, 2022
harrism
requested changes
Jan 11, 2022
Looks good. A few really minor cleanup requests.
harrism
approved these changes
Jan 12, 2022
@gpucibot merge
Labels: 3 - Ready for review (ready for review by team), cpp (pertains to C++ code), improvement (improvement / enhancement to an existing function), non-breaking (non-breaking change)
Currently the arena allocator divides GPU memory into a global arena and per-thread arenas. For smaller allocations, a per-thread arena allocates large chunks of memory (superblocks) from the global arena and divides them up for individual allocations. However, when memory is deallocated from a different arena than the one that allocated it (the producer/consumer pattern), or when we run out of memory and return everything to the global arena, the superblock boundaries are broken. Over time, this can leave the memory more and more fragmented.
This PR makes superblocks concrete objects, not just virtual boundaries, and makes them the only unit of exchange between the global arena and per-thread arenas. This should make the allocator more resistant to memory fragmentation, especially for long-running processes under constant memory pressure.
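To make the idea concrete, here is a minimal host-side sketch of superblocks as the only unit of exchange. It is not the code in this PR: `Superblock`, `GlobalArena`, `ThreadArena`, and `kSuperblockSize` are hypothetical names, addresses are plain integers rather than GPU pointers, locking is omitted, and the sketch assumes a per-thread arena hands a superblock back only once it is completely free.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy model; none of these names are taken from the PR.
constexpr std::size_t kSuperblockSize = 1024;  // assumed size, tiny for illustration

// A superblock is a concrete object: a fixed address range with its own free list.
class Superblock {
 public:
  explicit Superblock(std::size_t base) : base_(base) { free_[base] = kSuperblockSize; }

  bool contains(std::size_t addr) const { return addr >= base_ && addr < base_ + kSuperblockSize; }
  bool empty() const { return free_.size() == 1 && free_.begin()->second == kSuperblockSize; }

  // First-fit allocation that never leaves this superblock.
  std::optional<std::size_t> allocate(std::size_t size) {
    if (size == 0 || size > kSuperblockSize) return std::nullopt;
    for (auto it = free_.begin(); it != free_.end(); ++it) {
      if (it->second < size) continue;
      std::size_t addr = it->first;
      std::size_t rest = it->second - size;
      free_.erase(it);
      if (rest > 0) free_[addr + size] = rest;
      return addr;
    }
    return std::nullopt;
  }

  // Free a block and coalesce with its neighbors, never across the superblock boundary.
  void deallocate(std::size_t addr, std::size_t size) {
    assert(size > 0 && contains(addr) && addr + size <= base_ + kSuperblockSize);
    auto it = free_.emplace(addr, size).first;
    auto next = std::next(it);
    if (next != free_.end() && it->first + it->second == next->first) {
      it->second += next->second;
      free_.erase(next);
    }
    if (it != free_.begin()) {
      auto prev = std::prev(it);
      if (prev->first + prev->second == it->first) {
        prev->second += it->second;
        free_.erase(it);
      }
    }
  }

 private:
  std::size_t base_;
  std::map<std::size_t, std::size_t> free_;  // free block address -> size
};

// The global arena only ever hands out and takes back whole superblocks.
class GlobalArena {
 public:
  explicit GlobalArena(std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) free_.emplace_back(i * kSuperblockSize);
  }
  std::optional<Superblock> acquire() {
    if (free_.empty()) return std::nullopt;
    std::optional<Superblock> sb(std::move(free_.back()));
    free_.pop_back();
    return sb;
  }
  void release(Superblock sb) {
    assert(sb.empty());  // simplification: only fully free superblocks come back
    free_.push_back(std::move(sb));
  }
  std::size_t available() const { return free_.size(); }

 private:
  std::vector<Superblock> free_;
};

// A per-thread arena subdivides the superblocks it currently owns.
class ThreadArena {
 public:
  explicit ThreadArena(GlobalArena& global) : global_(global) {}

  std::optional<std::size_t> allocate(std::size_t size) {
    if (size == 0 || size > kSuperblockSize) return std::nullopt;
    for (auto& sb : superblocks_) {
      if (auto addr = sb.allocate(size)) return addr;
    }
    auto sb = global_.acquire();  // exchange happens one whole superblock at a time
    if (!sb) return std::nullopt;
    auto addr = sb->allocate(size);
    superblocks_.push_back(std::move(*sb));
    return addr;
  }

  // Returns false if the address belongs to a superblock this arena does not own.
  bool deallocate(std::size_t addr, std::size_t size) {
    for (auto it = superblocks_.begin(); it != superblocks_.end(); ++it) {
      if (!it->contains(addr)) continue;
      it->deallocate(addr, size);
      if (it->empty()) {
        global_.release(std::move(*it));
        superblocks_.erase(it);
      }
      return true;
    }
    return false;
  }
  std::size_t owned() const { return superblocks_.size(); }

 private:
  GlobalArena& global_;
  std::vector<Superblock> superblocks_;
};
```

Because free blocks are coalesced only inside their own superblock and the global arena receives whole superblocks back, a freed range can never straddle a superblock boundary in this model, which is the property the description above relies on.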
Other notable changes:
std::shared_mutex
fixes #919
fixes #906