Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: improved shard cache #7429

Merged
merged 47 commits into from
Aug 26, 2022
Merged

feat: improved shard cache #7429

merged 47 commits into from
Aug 26, 2022

Conversation

Longarithm
Copy link
Member

@Longarithm Longarithm commented Aug 17, 2022

Improve shard cache to use RAM more effectively.

Three changes are introduced:

  1. If we put new value to LRU cache and total size of existing values exceeds total_sizes_capacity, we evict values from it until that is no longer the case. So the actual total size should never exceed total_size_limit + TRIE_LIMIT_CACHED_VALUE_SIZE.
    We add this because value sizes generally vary from 1 B to 500 B and we want to count cache size precisely. The current value size limit is 1000 B, so for average size of 100 B we use shard cache 10x more effectively.

  2. When we save trie changes, we previously just applied insertions to the shard cache - which means that we added newly created nodes to it. Deletions were applied only during GC of the old block. Now we apply deletions and call pop for shard cache during saving trie changes of a new block as well.
    This helps to use shard cache space more effectively. Previously nodes from the old state could occupy a lot of space which led to eviction of nodes from the fresh state.

  3. If shard cache pop is called, item is not deleted but put to the deletions queue with deletions_queue_capacity first. If popped item doesn't fit in the queue, the last item is removed from the queue and LRU cache, and newly popped item is inserted to the queue.
    It is needed to delay removals when we have forks. In simple case, two blocks may share a parent P. When we process the first block, we call pop for some nodes from P, but when we process the second block, we may need to read some nodes from P as well. Now we delay removal by 100_000, which helps to keep all nodes from 3 completely full last blocks.

Next steps:

  • general cleanup after getting it merged;
  • make new constants configurable, similarly to trie cache capacity

We want to get the whole update merged by next Wednesday, and cherry-pick it to 1.28 and 1.29 releases. This is not a protocol change, so it doesn't require a separate release or protocol version.

Testing

@Longarithm Longarithm marked this pull request as ready for review August 19, 2022 14:48
@Longarithm Longarithm requested a review from a team as a code owner August 19, 2022 14:48
@Longarithm Longarithm requested a review from mina86 August 19, 2022 14:48
@Longarithm Longarithm self-assigned this Aug 19, 2022
@Longarithm Longarithm changed the title [DO NOT MERGE] feat: improved shard cache feat: improved shard cache Aug 19, 2022
core/store/src/trie/trie_storage.rs Show resolved Hide resolved
core/store/src/trie/trie_storage.rs Show resolved Hide resolved
@Longarithm Longarithm requested review from matklad and mm-near August 23, 2022 17:54
Copy link
Contributor

@matklad matklad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything obviously wrong now! This is fiddly, so a second pair of eyes would help (and @mm-near has an outstanding NO review).

I see there are nayduck failures in the run. Are those fixed at the latest version of the code?

// Convert trie changes to database ops for trie nodes.
// Create separate store update for deletions, because we want to update cache and don't want to remove nodes
// from the store.
let mut deletions_store_update = self.store().store_update();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another quite questionable bit of code, we should think about how to make this clearer in a follow-up

@Longarithm
Copy link
Member Author

Longarithm commented Aug 24, 2022

I see there are nayduck failures in the run. Are those fixed at the latest version of the code?

They have been failing for a while, so it is not caused by this PR. I raised the question here https://near.zulipchat.com/#narrow/stream/295558-pagoda.2Fcore/topic/chunk_management.20test.20failure/near/294949074

core/store/src/config.rs Outdated Show resolved Hide resolved
core/store/src/trie/trie_storage.rs Show resolved Hide resolved
core/store/src/trie/trie_storage.rs Show resolved Hide resolved
@Longarithm Longarithm merged commit a8da6a0 into master Aug 26, 2022
@Longarithm Longarithm deleted the smart-cache-1 branch August 26, 2022 15:54
Longarithm added a commit that referenced this pull request Aug 26, 2022
Co-authored-by: firatNEAR <firat@near.org>
Co-authored-by: firatNEAR <102993450+firatNEAR@users.noreply.github.com>
Longarithm added a commit that referenced this pull request Aug 26, 2022
Co-authored-by: firatNEAR <firat@near.org>
Co-authored-by: firatNEAR <102993450+firatNEAR@users.noreply.github.com>
Longarithm added a commit that referenced this pull request Aug 26, 2022
Co-authored-by: firatNEAR <firat@near.org>
Co-authored-by: firatNEAR <102993450+firatNEAR@users.noreply.github.com>
@firatNEAR firatNEAR linked an issue Aug 29, 2022 that may be closed by this pull request
jakmeier added a commit to jakmeier/nearcore that referenced this pull request Aug 29, 2022
Finding the counter of a metric is a relatively expensive operation,
when compared to just updating its value. A performance regression
of about 20% on read perfromance has been observed due to additional
metrics added in near#7429.

This commit stores the counters in `TrieCacheInne` and
`TrieCachingStorag` to avoid the lookup in hot paths like when
we hit the shard or chunk cache.
jakmeier added a commit that referenced this pull request Aug 29, 2022
* perf: avoid expensive lookup of storage metrics

Finding the counter of a metric is a relatively expensive operation,
when compared to just updating its value. A performance regression
of about 20% on read perfromance has been observed due to additional
metrics added in #7429.

This commit stores the counters in `TrieCacheInne` and
`TrieCachingStorag` to avoid the lookup in hot paths like when
we hit the shard or chunk cache.

* apply reviewer suggestions

- group metrics together in two new structs
- update comments
Longarithm pushed a commit that referenced this pull request Aug 29, 2022
* perf: avoid expensive lookup of storage metrics

Finding the counter of a metric is a relatively expensive operation,
when compared to just updating its value. A performance regression
of about 20% on read perfromance has been observed due to additional
metrics added in #7429.

This commit stores the counters in `TrieCacheInne` and
`TrieCachingStorag` to avoid the lookup in hot paths like when
we hit the shard or chunk cache.

* apply reviewer suggestions

- group metrics together in two new structs
- update comments
Longarithm pushed a commit that referenced this pull request Aug 29, 2022
* perf: avoid expensive lookup of storage metrics

Finding the counter of a metric is a relatively expensive operation,
when compared to just updating its value. A performance regression
of about 20% on read perfromance has been observed due to additional
metrics added in #7429.

This commit stores the counters in `TrieCacheInne` and
`TrieCachingStorag` to avoid the lookup in hot paths like when
we hit the shard or chunk cache.

* apply reviewer suggestions

- group metrics together in two new structs
- update comments
nikurt pushed a commit that referenced this pull request Nov 9, 2022
* perf: avoid expensive lookup of storage metrics

Finding the counter of a metric is a relatively expensive operation,
when compared to just updating its value. A performance regression
of about 20% on read perfromance has been observed due to additional
metrics added in #7429.

This commit stores the counters in `TrieCacheInne` and
`TrieCachingStorag` to avoid the lookup in hot paths like when
we hit the shard or chunk cache.

* apply reviewer suggestions

- group metrics together in two new structs
- update comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create metrics for shard/chunk cache part1
5 participants