Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: limit trie cache by memory consumption #7749

Merged
merged 6 commits into from
Oct 21, 2022

Conversation

jakmeier
Copy link
Contributor

@jakmeier jakmeier commented Oct 3, 2022

Instead of checking the number of values and their sizes, the caches are
now limited by the actual (approximated) memory consumption.

This changes what total_size in TrieCacheInner means, which is also
observable through Prometheus metrics.

Existing configuration works with slightly altered effects.
Number of entries convert to an implicit size limit. Since the explicit
default size limit currently is 3GB and the default max entries is set
50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still
limits the number of largest entries to 50k but allows the cache to
be filled with more entries when the values are smaller.

For shard 3, however, where the number of entries is set to 45M in code,
the memory limit of 3GB is active. Since we change how this limit is
calculated we will see fewer entries cached with this change.
Shard 3 should still be okay since we have a prefetcher in place now
that works even when the cache is empty.

Instead of checking the number of value and their sizes, the caches are
now limited by the actual (approximated) cache consumption.

This changes what `total_size` in `TrieCacheInner` means, which is also
observable through prometheus metrics.

Existing configuration continues to work as it did, limits on number of
entries are converted to an implicit size limit. Since the explicit
default size limit currently is 3GB and the default max entries is set
50k, the implicit limit = 50k * 100B = 50kB is stronger. Therefore this
change is not visible in practice for shard with default config.

For shard 3, however, where the number of entries is set to 45M in code,
the memory limit of 3GB is active. Since we change how this limit is,
calculated we will see fewer entries cached with this change.
Shard 3 should still be okay since we have a prefetcher in place now
that works even when the cache is empty.
@jakmeier jakmeier requested review from matklad and Longarithm October 3, 2022 10:05
@jakmeier jakmeier requested a review from a team as a code owner October 3, 2022 10:05
@jakmeier
Copy link
Contributor Author

jakmeier commented Oct 3, 2022

If we merge this change first, I can finalise #7578 without adding temporary code only to deal with capacity configuration that we want to ditch anyway.

Copy link
Contributor

@matklad matklad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly ok, but left a couple of Qs

core/store/src/trie/config.rs Outdated Show resolved Hide resolved
core/store/src/trie/config.rs Show resolved Hide resolved
assert!(total_size_limit > 0);
// upper bound on capacity for allocation purposes only
let cache_capacity =
(total_size_limit + Self::PER_ENTRY_OVERHEAD - 1) / Self::PER_ENTRY_OVERHEAD;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! So in practice I think this means we always end up over-allocating the cache?

There's perhaps something smarter here (like dividing by 2 * per entry overhead) I don't immediately see an obviously better appraoch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems to make much more sense to just use unbounded lru cache now

deletions_queue_capacity: usize,
total_size_limit: u64,
shard_id: ShardId,
is_view: bool,
) -> Self {
assert!(cache_capacity > 0 && total_size_limit > 0);
assert!(total_size_limit > 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you sanity-check that the behavior is correct when total_size_limit is less than even a single entry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test that shows that we can still insert one value. I hope I understood your concern right and that addresses it, otherwise let me know.

core/store/src/trie/trie_storage.rs Outdated Show resolved Hide resolved
unbounded cache means caches might use more memory in non-worst cases
@jakmeier
Copy link
Contributor Author

jakmeier commented Oct 4, 2022

Thanks for the review @matklad! I think I addressed all your comments.

cc also @Longarithm, to be very clear here, this change is going to have an effect on existing shard caches.

  • The old config of 50k per slot is now converted to 50k * 1kB = 50MB. If it means we fit in more than 50k (most likely we will) then there will be more entries in the shard cache now.
  • The old config for shard 3, 3GB / 45M slots, is still limited by 3GB, but now factoring in the 100B overhead. So it will effectively contain less in the cache.

I believe this should all be okay. But I am not going to merge this without your approval on these two specific changes @Longarithm :)

@jakmeier jakmeier requested a review from matklad October 4, 2022 15:14
@jakmeier
Copy link
Contributor Author

jakmeier commented Oct 4, 2022

Note to self: Add a comment in CHANGELOG after rebasing

Copy link
Contributor

@matklad matklad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd delegate final review to @Longarithm (but here's ✔️ to avoid blocking)

@Longarithm
Copy link
Member

Longarithm commented Oct 14, 2022

I'll leave a review on Monday/Tuesday
UPD: let it be Wednesday because today I took another deep look on #7445, and there is no rush

pub fn len(&self) -> usize {
self.cache.len()
}

/// Account consumed memory for a new entry in the cache.
pub(crate) fn add_value_of_size(&mut self, len: usize) {
let memory_consumed = len as u64 + Self::PER_ENTRY_OVERHEAD;
Copy link
Member

@Longarithm Longarithm Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it may make sense to make it a function entry_size(len) and to avoid duplication. We can even replace (TrieCacheInner::PER_ENTRY_OVERHEAD + TrieConfig::max_cached_value_size() as u64) with it while we support old config field.

Copy link
Member

@Longarithm Longarithm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Is it right that after this cache for shard 3 will hold less entries, but we don't care much because we have prefetching as well?

core/store/src/trie/config.rs Show resolved Hide resolved
- link todos to issues
- Self::entry_size(len)
@jakmeier
Copy link
Contributor Author

LGTM. Is it right that after this cache for shard 3 will hold less entries, but we don't care much because we have prefetching as well?

yes, that's exactly right

@near-bulldozer near-bulldozer bot merged commit faa3fed into near:master Oct 21, 2022
nikurt pushed a commit that referenced this pull request Oct 25, 2022
Instead of checking the number of values and their sizes, the caches are
now limited by the actual (approximated) memory consumption.

This changes what `total_size` in `TrieCacheInner` means, which is also
observable through Prometheus metrics.

Existing configuration works with slightly altered effects.
Number of entries convert to an implicit size limit. Since the explicit
default size limit currently is 3GB and the default max entries is set
50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still
limits the number of largest entries to 50k but allows the cache to
be filled with more entries when the values are smaller.

For shard 3, however, where the number of entries is set to 45M in code,
the memory limit of 3GB is active. Since we change how this limit is
calculated we will see fewer entries cached with this change.
Shard 3 should still be okay since we have a prefetcher in place now
that works even when the cache is empty.
@jakmeier jakmeier deleted the memory-limit-trie-cache branch October 28, 2022 19:45
nikurt pushed a commit that referenced this pull request Nov 7, 2022
Instead of checking the number of values and their sizes, the caches are
now limited by the actual (approximated) memory consumption.

This changes what `total_size` in `TrieCacheInner` means, which is also
observable through Prometheus metrics.

Existing configuration works with slightly altered effects.
Number of entries convert to an implicit size limit. Since the explicit
default size limit currently is 3GB and the default max entries is set
50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still
limits the number of largest entries to 50k but allows the cache to
be filled with more entries when the values are smaller.

For shard 3, however, where the number of entries is set to 45M in code,
the memory limit of 3GB is active. Since we change how this limit is
calculated we will see fewer entries cached with this change.
Shard 3 should still be okay since we have a prefetcher in place now
that works even when the cache is empty.
nikurt pushed a commit that referenced this pull request Nov 9, 2022
Instead of checking the number of values and their sizes, the caches are
now limited by the actual (approximated) memory consumption.

This changes what `total_size` in `TrieCacheInner` means, which is also
observable through Prometheus metrics.

Existing configuration works with slightly altered effects.
Number of entries convert to an implicit size limit. Since the explicit
default size limit currently is 3GB and the default max entries is set
50k, the implicit limit = 50k * 1000B = 50MB is stronger. This still
limits the number of largest entries to 50k but allows the cache to
be filled with more entries when the values are smaller.

For shard 3, however, where the number of entries is set to 45M in code,
the memory limit of 3GB is active. Since we change how this limit is
calculated we will see fewer entries cached with this change.
Shard 3 should still be okay since we have a prefetcher in place now
that works even when the cache is empty.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants