accounts-db: Improve sharding of the read-only cache #4255
Draft: vadorovsky wants to merge 2 commits into anza-xyz:master from vadorovsky:ro-cache-shards
+171 −136
Conversation
Force-pushed from 0b1b01d to 2325c7e
Can you please split these two changes into two PRs?
Force-pushed from 2325c7e to d0a8348
Force-pushed from 14c01bd to f881c83
Instead of rigid LRU eviction, perform sampled LRU eviction. Sampled LRU eviction takes K random keys and picks the one with the lowest update timestamp. Even though the evicted element is not always the oldest in the whole cache, this removes the need to maintain an eviction queue and reduces the contention caused by locking that queue. The K parameter is configurable, but the best performing value so far has been 8, so it is used as the default. The new eviction mechanism results in a performance improvement in the cache eviction benchmarks:

```
read_only_accounts_cache_eviction_lo_hi/load/8
  time:   [1.4096 µs 1.4108 µs 1.4117 µs]
  change: [-67.900% -64.278% -60.095%] (p = 0.00 < 0.05) Performance has improved.
  Found 19 outliers among 100 measurements (19.00%): 6 (6.00%) low severe, 13 (13.00%) low mild

read_only_accounts_cache_eviction_lo_hi/store/8
  time:   [2.4480 µs 2.4638 µs 2.4759 µs]
  change: [-77.943% -77.588% -77.268%] (p = 0.00 < 0.05) Performance has improved.

read_only_accounts_cache_eviction_lo_hi/load/16
  time:   [1.5526 µs 1.5536 µs 1.5544 µs]
  change: [-79.195% -76.569% -73.647%] (p = 0.00 < 0.05) Performance has improved.
  Found 21 outliers among 100 measurements (21.00%): 18 (18.00%) low severe, 3 (3.00%) low mild

read_only_accounts_cache_eviction_lo_hi/store/16
  time:   [2.8992 µs 2.9145 µs 2.9265 µs]
  change: [-87.526% -87.320% -87.116%] (p = 0.00 < 0.05) Performance has improved.

read_only_accounts_cache_eviction_lo_hi/load/32
  time:   [1.7435 µs 1.7538 µs 1.7654 µs]
  change: [-86.960% -85.829% -84.566%] (p = 0.00 < 0.05) Performance has improved.
  Found 19 outliers among 100 measurements (19.00%): 14 (14.00%) low mild, 3 (3.00%) high mild, 2 (2.00%) high severe

read_only_accounts_cache_eviction_lo_hi/store/32
  time:   [3.4493 µs 3.5638 µs 3.7102 µs]
  change: [-90.941% -90.560% -90.136%] (p = 0.00 < 0.05) Performance has improved.
  Found 10 outliers among 100 measurements (10.00%): 10 (10.00%) high severe

read_only_accounts_cache_eviction_lo_hi/load/64
  time:   [3.7502 µs 4.0713 µs 4.4061 µs]
  change: [-88.290% -85.987% -82.879%] (p = 0.00 < 0.05) Performance has improved.
  Found 2 outliers among 100 measurements (2.00%): 1 (1.00%) high mild, 1 (1.00%) high severe

read_only_accounts_cache_eviction_lo_hi/store/64
  time:   [7.4959 µs 7.9429 µs 8.4066 µs]
  change: [-90.591% -90.141% -89.641%] (p = 0.00 < 0.05) Performance has improved.
  Found 1 outliers among 100 measurements (1.00%): 1 (1.00%) high mild

read_only_accounts_cache_eviction_hi/load/8
  time:   [1.6254 µs 1.6261 µs 1.6267 µs]
  change: [-58.221% -54.458% -50.268%] (p = 0.00 < 0.05) Performance has improved.
  Found 20 outliers among 100 measurements (20.00%): 13 (13.00%) low severe, 7 (7.00%) low mild

read_only_accounts_cache_eviction_hi/store/8
  time:   [2.5913 µs 2.6101 µs 2.6257 µs]
  change: [-79.288% -79.077% -78.838%] (p = 0.00 < 0.05) Performance has improved.
  Found 1 outliers among 100 measurements (1.00%): 1 (1.00%) low mild

read_only_accounts_cache_eviction_hi/load/16
  time:   [1.5098 µs 1.5108 µs 1.5116 µs]
  change: [-78.379% -76.092% -73.612%] (p = 0.00 < 0.05) Performance has improved.
  Found 21 outliers among 100 measurements (21.00%): 17 (17.00%) low severe, 4 (4.00%) low mild

read_only_accounts_cache_eviction_hi/store/16
  time:   [2.8342 µs 2.8470 µs 2.8568 µs]
  change: [-86.859% -86.720% -86.575%] (p = 0.00 < 0.05) Performance has improved.

read_only_accounts_cache_eviction_hi/load/32
  time:   [2.1136 µs 2.2410 µs 2.3841 µs]
  change: [-83.144% -81.384% -79.375%] (p = 0.00 < 0.05) Performance has improved.
  Found 36 outliers among 100 measurements (36.00%): 14 (14.00%) low severe, 1 (1.00%) low mild, 3 (3.00%) high mild, 18 (18.00%) high severe

read_only_accounts_cache_eviction_hi/store/32
  time:   [3.2238 µs 3.2357 µs 3.2463 µs]
  change: [-92.514% -92.427% -92.338%] (p = 0.00 < 0.05) Performance has improved.
  Found 4 outliers among 100 measurements (4.00%): 3 (3.00%) low mild, 1 (1.00%) high severe

read_only_accounts_cache_eviction_hi/load/64
  time:   [5.5431 µs 5.7115 µs 5.9002 µs]
  change: [-82.871% -81.517% -80.064%] (p = 0.00 < 0.05) Performance has improved.
  Found 2 outliers among 100 measurements (2.00%): 2 (2.00%) high mild

read_only_accounts_cache_eviction_hi/store/64
  time:   [7.2374 µs 7.7026 µs 8.1520 µs]
  change: [-90.548% -90.069% -89.579%] (p = 0.00 < 0.05) Performance has improved.
  Found 26 outliers among 100 measurements (26.00%): 9 (9.00%) low severe, 3 (3.00%) low mild, 9 (9.00%) high mild, 5 (5.00%) high severe
```
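For illustration, here is a minimal sketch of the sampled-eviction idea described in the commit message. The type, field, and function names are hypothetical, the key is simplified to `u64`, and sampling assumes the `rand` crate; this is not the actual agave `ReadOnlyAccountsCache` implementation.

```rust
// Hypothetical sketch of sampled LRU eviction: no eviction queue is kept;
// each entry carries a "last used" stamp, and eviction samples K random
// entries and removes the least recently used one among them.
use rand::seq::IteratorRandom;
use std::collections::HashMap;

/// The K parameter; per the commit message, 8 performed best and is the default.
const EVICTION_SAMPLE_SIZE: usize = 8;

struct Entry {
    data: Vec<u8>,
    last_used: u64, // logical timestamp, bumped on every access
}

struct SampledLruCache {
    entries: HashMap<u64, Entry>, // key simplified to u64 for the sketch
    clock: u64,
    capacity: usize,
}

impl SampledLruCache {
    fn new(capacity: usize) -> Self {
        Self { entries: HashMap::new(), clock: 0, capacity }
    }

    fn load(&mut self, key: u64) -> Option<&[u8]> {
        self.clock += 1;
        let now = self.clock;
        self.entries.get_mut(&key).map(|entry| {
            entry.last_used = now;
            entry.data.as_slice()
        })
    }

    fn store(&mut self, key: u64, data: Vec<u8>) {
        if self.entries.len() >= self.capacity {
            self.evict_one();
        }
        self.clock += 1;
        let last_used = self.clock;
        self.entries.insert(key, Entry { data, last_used });
    }

    /// Sample K random entries and evict the one with the oldest stamp.
    /// The victim is not necessarily the globally oldest entry, but no
    /// shared queue has to be maintained or locked to find it.
    fn evict_one(&mut self) {
        let mut rng = rand::thread_rng();
        let victim = self
            .entries
            .iter()
            .choose_multiple(&mut rng, EVICTION_SAMPLE_SIZE)
            .into_iter()
            .min_by_key(|(_, entry)| entry.last_used)
            .map(|(key, _)| *key);
        if let Some(key) = victim {
            self.entries.remove(&key);
        }
    }
}
```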
The default number of shards in `DashMap` is `num_cpus * 4`. That means 256 on validators with 64 threads. The number of accounts held in caches on mainnet beta validators with the default configuration is around 44-56K, which results in around 200 accounts per shard. In other words, locking a shard blocks access to roughly 200 accounts. Fix that by increasing the number of shards to 65536, which in the best case keeps just one account per shard, so acquiring a lock shouldn't block access to any other account. A single `RwLock<RawTable<(K, V)>>`, without the key and value included, takes 48 bytes. Therefore, 65536 shards take 3145728 B, i.e. 3 MiB, which is an acceptable increase in memory usage.
Force-pushed from f881c83 to f16473d
Depends on #3943
Problem
The default number of shards in `DashMap` is `num_cpus * 4`. That means 256 on validators with 64 threads. The number of accounts held in caches on mainnet beta validators with the default configuration is around 44-56K, which results in around 200 accounts per shard. In other words, locking a shard blocks access to roughly 200 accounts.
Summary of Changes
Fix that by increasing the number of shards to 65536, which in the best case keeps just one account per shard. Acquiring a lock then shouldn't block access to any other account.
A single `RwLock<RawTable<(K, V)>>`, without the key and value included, takes 48 bytes. Therefore, 65536 shards take 3145728 B, i.e. 3 MiB, which is an acceptable increase in memory usage.
Ref #4272
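For illustration, a minimal sketch of constructing a `DashMap` with an explicit shard count, roughly the shape of the change described above. The constant and function names are hypothetical, the key and value types are simplified, and it assumes dashmap 5.x's `with_capacity_and_hasher_and_shard_amount` constructor (the shard amount must be a power of two); this is not the agave code itself.

```rust
use dashmap::DashMap;
use std::collections::hash_map::RandomState;

/// Hypothetical constant: one shard per cached account in the best case.
/// Must be a power of two.
const READ_ONLY_CACHE_SHARDS: usize = 65_536;

fn new_read_only_cache() -> DashMap<u64, Vec<u8>> {
    // dashmap's default shard amount is roughly num_cpus * 4 (256 shards on a
    // 64-thread validator); with ~44-56K cached accounts that is ~200 accounts
    // behind each shard lock. Passing the shard amount explicitly spreads them out.
    DashMap::with_capacity_and_hasher_and_shard_amount(
        0,
        RandomState::default(),
        READ_ONLY_CACHE_SHARDS,
    )
}

fn main() {
    let cache = new_read_only_cache();
    cache.insert(42, vec![0u8; 32]);
    assert!(cache.contains_key(&42));
    // Fixed overhead estimate from the PR description: each shard is an RwLock
    // around a hashbrown table, ~48 bytes when empty, so
    // 65_536 * 48 B = 3_145_728 B = 3 MiB.
}
```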