accounts-db: Improve sharding of the read-only cache #4255
Draft: vadorovsky wants to merge 2 commits into anza-xyz:master from vadorovsky:ro-cache-shards
+171 −136
Conversation
Force-pushed from 0b1b01d to 2325c7e
Can you please split these two changes into two PRs?
Force-pushed from 2325c7e to d0a8348
Force-pushed from 14c01bd to f881c83
Instead of rigid LRU eviction, perform sampled LRU eviction. Sampled LRU eviction takes K random keys and picks the one with the lowest update timestamp. Even though the evicted element is not always the oldest in the whole cache, this removes the need to maintain an eviction queue and reduces the contention caused by locking that queue. The K parameter is configurable, but the best performing value so far has been 8, so it is used as the default. The new eviction mechanism results in a performance improvement in the cache eviction benchmarks:

```
read_only_accounts_cache_eviction_lo_hi/load/8
  time:   [1.4096 µs 1.4108 µs 1.4117 µs]
  change: [-67.900% -64.278% -60.095%] (p = 0.00 < 0.05) Performance has improved.
  Found 19 outliers among 100 measurements (19.00%): 6 (6.00%) low severe, 13 (13.00%) low mild

read_only_accounts_cache_eviction_lo_hi/store/8
  time:   [2.4480 µs 2.4638 µs 2.4759 µs]
  change: [-77.943% -77.588% -77.268%] (p = 0.00 < 0.05) Performance has improved.

read_only_accounts_cache_eviction_lo_hi/load/16
  time:   [1.5526 µs 1.5536 µs 1.5544 µs]
  change: [-79.195% -76.569% -73.647%] (p = 0.00 < 0.05) Performance has improved.
  Found 21 outliers among 100 measurements (21.00%): 18 (18.00%) low severe, 3 (3.00%) low mild

read_only_accounts_cache_eviction_lo_hi/store/16
  time:   [2.8992 µs 2.9145 µs 2.9265 µs]
  change: [-87.526% -87.320% -87.116%] (p = 0.00 < 0.05) Performance has improved.

read_only_accounts_cache_eviction_lo_hi/load/32
  time:   [1.7435 µs 1.7538 µs 1.7654 µs]
  change: [-86.960% -85.829% -84.566%] (p = 0.00 < 0.05) Performance has improved.
  Found 19 outliers among 100 measurements (19.00%): 14 (14.00%) low mild, 3 (3.00%) high mild, 2 (2.00%) high severe

read_only_accounts_cache_eviction_lo_hi/store/32
  time:   [3.4493 µs 3.5638 µs 3.7102 µs]
  change: [-90.941% -90.560% -90.136%] (p = 0.00 < 0.05) Performance has improved.
  Found 10 outliers among 100 measurements (10.00%): 10 (10.00%) high severe

read_only_accounts_cache_eviction_lo_hi/load/64
  time:   [3.7502 µs 4.0713 µs 4.4061 µs]
  change: [-88.290% -85.987% -82.879%] (p = 0.00 < 0.05) Performance has improved.
  Found 2 outliers among 100 measurements (2.00%): 1 (1.00%) high mild, 1 (1.00%) high severe

read_only_accounts_cache_eviction_lo_hi/store/64
  time:   [7.4959 µs 7.9429 µs 8.4066 µs]
  change: [-90.591% -90.141% -89.641%] (p = 0.00 < 0.05) Performance has improved.
  Found 1 outliers among 100 measurements (1.00%): 1 (1.00%) high mild

read_only_accounts_cache_eviction_hi/load/8
  time:   [1.6254 µs 1.6261 µs 1.6267 µs]
  change: [-58.221% -54.458% -50.268%] (p = 0.00 < 0.05) Performance has improved.
  Found 20 outliers among 100 measurements (20.00%): 13 (13.00%) low severe, 7 (7.00%) low mild

read_only_accounts_cache_eviction_hi/store/8
  time:   [2.5913 µs 2.6101 µs 2.6257 µs]
  change: [-79.288% -79.077% -78.838%] (p = 0.00 < 0.05) Performance has improved.
  Found 1 outliers among 100 measurements (1.00%): 1 (1.00%) low mild

read_only_accounts_cache_eviction_hi/load/16
  time:   [1.5098 µs 1.5108 µs 1.5116 µs]
  change: [-78.379% -76.092% -73.612%] (p = 0.00 < 0.05) Performance has improved.
  Found 21 outliers among 100 measurements (21.00%): 17 (17.00%) low severe, 4 (4.00%) low mild

read_only_accounts_cache_eviction_hi/store/16
  time:   [2.8342 µs 2.8470 µs 2.8568 µs]
  change: [-86.859% -86.720% -86.575%] (p = 0.00 < 0.05) Performance has improved.

read_only_accounts_cache_eviction_hi/load/32
  time:   [2.1136 µs 2.2410 µs 2.3841 µs]
  change: [-83.144% -81.384% -79.375%] (p = 0.00 < 0.05) Performance has improved.
  Found 36 outliers among 100 measurements (36.00%): 14 (14.00%) low severe, 1 (1.00%) low mild, 3 (3.00%) high mild, 18 (18.00%) high severe

read_only_accounts_cache_eviction_hi/store/32
  time:   [3.2238 µs 3.2357 µs 3.2463 µs]
  change: [-92.514% -92.427% -92.338%] (p = 0.00 < 0.05) Performance has improved.
  Found 4 outliers among 100 measurements (4.00%): 3 (3.00%) low mild, 1 (1.00%) high severe

read_only_accounts_cache_eviction_hi/load/64
  time:   [5.5431 µs 5.7115 µs 5.9002 µs]
  change: [-82.871% -81.517% -80.064%] (p = 0.00 < 0.05) Performance has improved.
  Found 2 outliers among 100 measurements (2.00%): 2 (2.00%) high mild

read_only_accounts_cache_eviction_hi/store/64
  time:   [7.2374 µs 7.7026 µs 8.1520 µs]
  change: [-90.548% -90.069% -89.579%] (p = 0.00 < 0.05) Performance has improved.
  Found 26 outliers among 100 measurements (26.00%): 9 (9.00%) low severe, 3 (3.00%) low mild, 9 (9.00%) high mild, 5 (5.00%) high severe
```
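For illustration, here is a minimal sketch of the sampled-eviction idea described in the commit message. The type, field, and function names are hypothetical, the key is simplified to `u64`, and sampling assumes the `rand` crate; this is not the actual agave `ReadOnlyAccountsCache` implementation.

```rust
// Hypothetical sketch of sampled LRU eviction: no eviction queue is kept;
// each entry carries a "last used" stamp, and eviction samples K random
// entries and removes the least recently used one among them.
use rand::seq::IteratorRandom;
use std::collections::HashMap;

/// The K parameter; per the commit message, 8 performed best and is the default.
const EVICTION_SAMPLE_SIZE: usize = 8;

struct Entry {
    data: Vec<u8>,
    last_used: u64, // logical timestamp, bumped on every access
}

struct SampledLruCache {
    entries: HashMap<u64, Entry>, // key simplified to u64 for the sketch
    clock: u64,
    capacity: usize,
}

impl SampledLruCache {
    fn new(capacity: usize) -> Self {
        Self { entries: HashMap::new(), clock: 0, capacity }
    }

    fn load(&mut self, key: u64) -> Option<&[u8]> {
        self.clock += 1;
        let now = self.clock;
        self.entries.get_mut(&key).map(|entry| {
            entry.last_used = now;
            entry.data.as_slice()
        })
    }

    fn store(&mut self, key: u64, data: Vec<u8>) {
        if self.entries.len() >= self.capacity {
            self.evict_one();
        }
        self.clock += 1;
        let last_used = self.clock;
        self.entries.insert(key, Entry { data, last_used });
    }

    /// Sample K random entries and evict the one with the oldest stamp.
    /// The victim is not necessarily the globally oldest entry, but no
    /// shared queue has to be maintained or locked to find it.
    fn evict_one(&mut self) {
        let mut rng = rand::thread_rng();
        let victim = self
            .entries
            .iter()
            .choose_multiple(&mut rng, EVICTION_SAMPLE_SIZE)
            .into_iter()
            .min_by_key(|(_, entry)| entry.last_used)
            .map(|(key, _)| *key);
        if let Some(key) = victim {
            self.entries.remove(&key);
        }
    }
}
```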
The default number of shards in `DashMap` is `num_cpus * 4`. That means 256 on validators with 64 threads. The number of accounts held in caches on mainnet beta validators with the default configuration is around 44-56K, which results in around 200 accounts per shard. In other words, locking a shard blocks access to roughly 200 accounts. Fix that by increasing the number of shards to 65536, which in the best case keeps just one account per shard, so acquiring a lock shouldn't block access to any other account. A single `RwLock<RawTable<(K, V)>>`, without the key and value included, takes 48 bytes. Therefore, 65536 shards take 3145728 B, i.e. 3 MiB, which is an acceptable increase in memory usage.
Force-pushed from f881c83 to f16473d
Depends on #3943
Problem
The default number of shards in `DashMap` is `num_cpus * 4`. That means 256 on validators with 64 threads. The number of accounts held in caches on mainnet beta validators with the default configuration is around 44-56K, which results in around 200 accounts per shard. In other words, locking a shard blocks access to roughly 200 accounts.
Summary of Changes
Fix that by increasing the number of shards to 65536, which in the best case keeps just one account per shard. Acquiring a lock then shouldn't block access to any other account.
A single `RwLock<RawTable<(K, V)>>`, without the key and value included, takes 48 bytes. Therefore, 65536 shards take 3145728 B, i.e. 3 MiB, which is an acceptable increase in memory usage.
Ref #4272
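For illustration, a minimal sketch of constructing a `DashMap` with an explicit shard count, roughly the shape of the change described above. The constant and function names are hypothetical, the key and value types are simplified, and it assumes dashmap 5.x's `with_capacity_and_hasher_and_shard_amount` constructor (the shard amount must be a power of two); this is not the agave code itself.

```rust
use dashmap::DashMap;
use std::collections::hash_map::RandomState;

/// Hypothetical constant: one shard per cached account in the best case.
/// Must be a power of two.
const READ_ONLY_CACHE_SHARDS: usize = 65_536;

fn new_read_only_cache() -> DashMap<u64, Vec<u8>> {
    // dashmap's default shard amount is roughly num_cpus * 4 (256 shards on a
    // 64-thread validator); with ~44-56K cached accounts that is ~200 accounts
    // behind each shard lock. Passing the shard amount explicitly spreads them out.
    DashMap::with_capacity_and_hasher_and_shard_amount(
        0,
        RandomState::default(),
        READ_ONLY_CACHE_SHARDS,
    )
}

fn main() {
    let cache = new_read_only_cache();
    cache.insert(42, vec![0u8; 32]);
    assert!(cache.contains_key(&42));
    // Fixed overhead estimate from the PR description: each shard is an RwLock
    // around a hashbrown table, ~48 bytes when empty, so
    // 65_536 * 48 B = 3_145_728 B = 3 MiB.
}
```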