feat: enable receipt prefetching by default #7661

jakmeier · 2022-09-21T16:53:53Z

Prefetch receipt metadata (accounts and access keys) ahead of time.
This recent performance optimization has so far been disabled by
default; this change enables it by default.

Lab measurements confirm the performance improvement. The estimator
was used to measure the time it takes to process empty receipts on a
DB with 2 million accounts, on a local SSD, with shard caches enabled.
The results are as follows.

- sender = receiver: 737us -> 386us
- sender != receiver: 1014us -> 644us
- overhead per block: 6.9us -> 7.4us
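
To put the numbers above in relative terms, here is a small sketch that computes the percentage change for each measurement. The helper name `relative_change_percent` is made up for illustration; only the before/after values come from the measurements quoted above.

```rust
/// Relative change from `before_us` to `after_us`, in percent.
/// Negative values mean the measured time went down.
fn relative_change_percent(before_us: f64, after_us: f64) -> f64 {
    (after_us - before_us) / before_us * 100.0
}

fn main() {
    // Measurements quoted above, in microseconds.
    let cases = [
        ("sender = receiver", 737.0, 386.0),
        ("sender != receiver", 1014.0, 644.0),
        ("overhead per block", 6.9, 7.4),
    ];
    for (name, before, after) in cases {
        println!("{name}: {:+.1}%", relative_change_percent(before, after));
    }
}
```

This works out to roughly 48% and 36% less time per receipt in the two receipt cases, against a per-block overhead increase of about 7%.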

Note that this is with 100 empty receipts in the same block, all for
different accounts. In real traffic it is unusual for so many different
accounts to be accessed in the same block. But it is allowed, and we
must be able to process this case in reasonable time. So even if it
might not help in the average case, it makes sense to activate this
feature to speed up the worst case.

Currently we use 8 IO threads per shard. Repeated experiments with
more threads showed no difference. Decreasing to 4 threads performs
about the same as 8 threads, but going lower is significantly worse.
Thus, overall, 8 threads seems reasonable here.

Canary nodes in testnet and mainnet with the feature enabled show that
the feature also works as expected on real traffic. The memory impact is
minimal, usually less than 40MB of reserved capacity. Of that, 32MB is
reserved up front (8 threads reserve 4MB each ahead of actually fetching
the data), so the actual memory used is less than 8MB.
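
The memory figures can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are invented for illustration, and it assumes the 40MB observation refers to a single set of 8 prefetching threads.

```rust
// Values taken from the description above.
const IO_THREADS: u64 = 8;
const RESERVED_PER_THREAD_MB: u64 = 4;
const OBSERVED_RESERVED_UPPER_BOUND_MB: u64 = 40;

/// Capacity reserved up front, before any data is fetched.
fn reserved_upfront_mb(threads: u64, per_thread_mb: u64) -> u64 {
    threads * per_thread_mb
}

fn main() {
    let upfront = reserved_upfront_mb(IO_THREADS, RESERVED_PER_THREAD_MB);
    // Whatever is left of the observed bound is memory holding actual data.
    let actual_upper_bound = OBSERVED_RESERVED_UPPER_BOUND_MB - upfront;
    println!("reserved up front: {upfront}MB");
    println!("actual memory: less than {actual_upper_bound}MB");
}
```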

jakmeier · 2022-09-21T16:54:46Z

@mm-near nominating you as a reviewer since you already reviewed the feature. This one is just the decision to activate it by default from now on.

core/store/src/config.rs
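
For readers unfamiliar with the pattern, enabling a feature by default in a Rust config usually means changing a value in a `Default` implementation. The sketch below illustrates that idea only. `StoreConfigSketch` and its field name are hypothetical stand-ins and are not taken from the actual diff of this file.

```rust
/// Hypothetical, simplified stand-in for a store configuration struct.
#[derive(Debug, Clone, PartialEq)]
pub struct StoreConfigSketch {
    /// Whether receipt metadata is prefetched (field name is an assumption).
    pub enable_receipt_prefetching: bool,
}

impl Default for StoreConfigSketch {
    fn default() -> Self {
        // Previously this would have been `false`; the feature is now on
        // unless the operator opts out.
        Self { enable_receipt_prefetching: true }
    }
}

fn main() {
    let default_config = StoreConfigSketch::default();
    // Operators can still opt out explicitly.
    let opted_out = StoreConfigSketch { enable_receipt_prefetching: false };
    println!("default: {default_config:?}, opted out: {opted_out:?}");
}
```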

Explicitly stop and wait for prefetching background threads to terminate when a testbed is dropped. This prevents estimations from being influenced by background threads left over from previous estimations, which we have observed since merging near#7661.

Explicitly stop and wait for prefetching background threads to terminate when the `ShardTriesInner` is dropped. This prevents estimations from being influenced by background threads left over from previous estimations, which we have observed since merging near#7661.
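
The stop-and-wait behaviour described in this follow-up can be sketched roughly as below, using only the standard library. `PrefetchWorkers` is a hypothetical type invented for illustration. It is not the nearcore `ShardTriesInner`, and the real fix may be structured differently.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical owner of prefetching background threads.
struct PrefetchWorkers {
    stop: Arc<AtomicBool>,
    handles: Vec<JoinHandle<()>>,
}

impl PrefetchWorkers {
    fn start(num_threads: usize) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let handles = (0..num_threads)
            .map(|_| {
                let stop = Arc::clone(&stop);
                thread::spawn(move || {
                    while !stop.load(Ordering::Acquire) {
                        // Placeholder for real prefetch work.
                        thread::sleep(Duration::from_millis(1));
                    }
                })
            })
            .collect();
        Self { stop, handles }
    }
}

impl Drop for PrefetchWorkers {
    fn drop(&mut self) {
        // Signal all workers to stop, then block until each has exited.
        self.stop.store(true, Ordering::Release);
        for handle in self.handles.drain(..) {
            let _ = handle.join();
        }
    }
}

fn main() {
    let workers = PrefetchWorkers::start(8);
    drop(workers);
    // No worker thread is left running here, so a later measurement
    // cannot be disturbed by leftovers from this one.
    println!("all prefetch workers stopped");
}
```

Joining in `drop` trades a short blocking wait for the guarantee that no thread outlives its owner, which is the property the estimator needs between runs.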

jakmeier requested a review from mm-near September 21, 2022 16:53

jakmeier requested a review from a team as a code owner September 21, 2022 16:53

mm-near approved these changes Sep 22, 2022

core/store/src/config.rs (resolved)

jakmeier added 2 commits September 23, 2022 09:33

update changelog

d98d7b9

jakmeier force-pushed the enable-receipt-prefetch branch from 9e711b2 to d98d7b9 Compare September 23, 2022 07:35

jakmeier added the S-automerge label Sep 23, 2022

near-bulldozer bot merged commit 1def1e9 into near:master Sep 23, 2022

jakmeier mentioned this pull request Sep 23, 2022

Enable account and access key prefetching by default #7636

Closed

jakmeier mentioned this pull request Sep 26, 2022

fix: stop background threads between estimations #7689

Closed

jakmeier deleted the enable-receipt-prefetch branch September 26, 2022 12:32

jakmeier mentioned this pull request Sep 28, 2022

fix: properly stop prefetching background threads #7712

Merged
