-
Notifications
You must be signed in to change notification settings - Fork 663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: limit trie cache by memory consumption #7749
Conversation
Instead of checking the number of value and their sizes, the caches are now limited by the actual (approximated) cache consumption. This changes what `total_size` in `TrieCacheInner` means, which is also observable through prometheus metrics. Existing configuration continues to work as it did, limits on number of entries are converted to an implicit size limit. Since the explicit default size limit currently is 3GB and the default max entries is set 50k, the implicit limit = 50k * 100B = 50kB is stronger. Therefore this change is not visible in practice for shard with default config. For shard 3, however, where the number of entries is set to 45M in code, the memory limit of 3GB is active. Since we change how this limit is, calculated we will see fewer entries cached with this change. Shard 3 should still be okay since we have a prefetcher in place now that works even when the cache is empty.
If we merge this change first, I can finalise #7578 without adding temporary code only to deal with capacity configuration that we want to ditch anyway. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broadly ok, but left a couple of Qs
core/store/src/trie/trie_storage.rs
Outdated
assert!(total_size_limit > 0); | ||
// upper bound on capacity for allocation purposes only | ||
let cache_capacity = | ||
(total_size_limit + Self::PER_ENTRY_OVERHEAD - 1) / Self::PER_ENTRY_OVERHEAD; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting! So in practice I think this means we always end up over-allocating the cache?
There's perhaps something smarter here (like dividing by 2 * per entry overhead) I don't immediately see an obviously better appraoch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems to make much more sense to just use unbounded lru cache now
deletions_queue_capacity: usize, | ||
total_size_limit: u64, | ||
shard_id: ShardId, | ||
is_view: bool, | ||
) -> Self { | ||
assert!(cache_capacity > 0 && total_size_limit > 0); | ||
assert!(total_size_limit > 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you sanity-check that the behavior is correct when total_size_limit
is less than even a single entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test that shows that we can still insert one value. I hope I understood your concern right and that addresses it, otherwise let me know.
unbounded cache means caches might use more memory in non-worst cases
Thanks for the review @matklad! I think I addressed all your comments. cc also @Longarithm, to be very clear here, this change is going to have an effect on existing shard caches.
I believe this should all be okay. But I am not going to merge this without your approval on these two specific changes @Longarithm :) |
Note to self: Add a comment in CHANGELOG after rebasing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd delegate final review to @Longarithm (but here's ✔️ to avoid blocking)
I'll leave a review on Monday/Tuesday |
core/store/src/trie/trie_storage.rs
Outdated
pub fn len(&self) -> usize { | ||
self.cache.len() | ||
} | ||
|
||
/// Account consumed memory for a new entry in the cache. | ||
pub(crate) fn add_value_of_size(&mut self, len: usize) { | ||
let memory_consumed = len as u64 + Self::PER_ENTRY_OVERHEAD; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: it may make sense to make it a function entry_size(len)
and to avoid duplication. We can even replace (TrieCacheInner::PER_ENTRY_OVERHEAD + TrieConfig::max_cached_value_size() as u64)
with it while we support old config field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Is it right that after this cache for shard 3 will hold less entries, but we don't care much because we have prefetching as well?
- link todos to issues - Self::entry_size(len)
yes, that's exactly right |
Instead of checking the number of values and their sizes, the caches are now limited by the actual (approximated) memory consumption. This changes what `total_size` in `TrieCacheInner` means, which is also observable through Prometheus metrics. Existing configuration works with slightly altered effects. Number of entries convert to an implicit size limit. Since the explicit default size limit currently is 3GB and the default max entries is set 50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still limits the number of largest entries to 50k but allows the cache to be filled with more entries when the values are smaller. For shard 3, however, where the number of entries is set to 45M in code, the memory limit of 3GB is active. Since we change how this limit is calculated we will see fewer entries cached with this change. Shard 3 should still be okay since we have a prefetcher in place now that works even when the cache is empty.
Instead of checking the number of values and their sizes, the caches are now limited by the actual (approximated) memory consumption. This changes what `total_size` in `TrieCacheInner` means, which is also observable through Prometheus metrics. Existing configuration works with slightly altered effects. Number of entries convert to an implicit size limit. Since the explicit default size limit currently is 3GB and the default max entries is set 50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still limits the number of largest entries to 50k but allows the cache to be filled with more entries when the values are smaller. For shard 3, however, where the number of entries is set to 45M in code, the memory limit of 3GB is active. Since we change how this limit is calculated we will see fewer entries cached with this change. Shard 3 should still be okay since we have a prefetcher in place now that works even when the cache is empty.
Instead of checking the number of values and their sizes, the caches are now limited by the actual (approximated) memory consumption. This changes what `total_size` in `TrieCacheInner` means, which is also observable through Prometheus metrics. Existing configuration works with slightly altered effects. Number of entries convert to an implicit size limit. Since the explicit default size limit currently is 3GB and the default max entries is set 50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still limits the number of largest entries to 50k but allows the cache to be filled with more entries when the values are smaller. For shard 3, however, where the number of entries is set to 45M in code, the memory limit of 3GB is active. Since we change how this limit is calculated we will see fewer entries cached with this change. Shard 3 should still be okay since we have a prefetcher in place now that works even when the cache is empty.
Instead of checking the number of values and their sizes, the caches are
now limited by the actual (approximated) memory consumption.
This changes what
total_size
inTrieCacheInner
means, which is alsoobservable through Prometheus metrics.
Existing configuration works with slightly altered effects.
Number of entries convert to an implicit size limit. Since the explicit
default size limit currently is 3GB and the default max entries is set
50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still
limits the number of largest entries to 50k but allows the cache to
be filled with more entries when the values are smaller.
For shard 3, however, where the number of entries is set to 45M in code,
the memory limit of 3GB is active. Since we change how this limit is
calculated we will see fewer entries cached with this change.
Shard 3 should still be okay since we have a prefetcher in place now
that works even when the cache is empty.