-
Notifications
You must be signed in to change notification settings - Fork 7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce Keeper memory usage #59002
Reduce Keeper memory usage #59002
Conversation
This is an automated comment for commit 274c128 with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page Successful checks
|
94ddbd1
to
da3153c
Compare
}; | ||
|
||
template <class V> | ||
class SnapshotableHashTable | ||
{ | ||
private: | ||
struct GlobalArena | ||
{ | ||
char * alloc(const size_t size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not returning std::unique_ptr<char[]>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because I don't want to store that pointer anywhere but inside the StringRef key
, it will only take extra space
also, this way it's compatible with other Arenas and it's able to use e.g. copyStringInArena
so I can change them easily if needed
da3153c
to
0132455
Compare
92cbe9a
to
82dce23
Compare
82dce23
to
274c128
Compare
/// Arena used for keys | ||
/// we don't use std::string because it uses 24 bytes (because of SSO) | ||
/// we want to always allocate the key on heap and use StringRef to it | ||
GlobalArena arena; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious that why we used ArenaWithFreeLists
before.
Now we use GlobalArena, but why not Common/Allocator.h
. It looks having same features but traced by MemoryTracker
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All our new/delete calls are tracked with MemoryTracker
.
As this is a simple new/delete call I didn't see a reason to use something more complex like Allocator.h
I am curious that why we used ArenaWithFreeLists before.
I'm not sure what were the benefits but it had huge downsides IMO.
ArenaWithFreeLists
doesn't work for global usage like this because it doesn't deallocate memory on free, just adds it to free list. E.g. you create 50m nodes and delete all of them, all memory allocated for their paths will be kept in memory for no reason.
ArenaWithFreeList
has bins with 2^n size. The overhead is much worse for larger bins and we end up with a lot of unused memory. In case of global allocator we will use jemallocs bins which are much closer to the real size we want to allocate (https://jemalloc.net/jemalloc.3.html#size_classes)
@@ -288,6 +363,7 @@ class SnapshotableHashTable | |||
|
|||
void clear() | |||
{ | |||
clearOutdatedNodes(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function seems to clear all the map. Why clear outdated nodes first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because the key is shared so we don't want to deallocate the same key twice.
We first clear all duplicates created for snapshot with clearOutdatedNodes
and then all the nodes.
approximate_data_size += node.key.size; | ||
approximate_data_size += node.value.sizeInBytes(); | ||
} | ||
} | ||
|
||
uint64_t keyArenaSize() const { return arena.allocatedBytes(); } | ||
uint64_t keyArenaSize() const { return 0; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why return 0? Do we still need this metric?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't but I don't want to break backwards compatibility for now by removing this row from mntr
command.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Keeper improvement: reduce Keeper's memory usage for stored nodes.
Documentation entry for user-facing changes