[Merged by Bors] - Cache target attester balances for unrealized FFG progression calculation #4362

jimmygchen · 2023-06-01T13:16:29Z

Issue Addressed

#4118

Proposed Changes

This PR introduces a "progressive balances" cache on the BeaconState, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive).
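
To make the idea concrete, here is a minimal sketch of such a cache. It is not the Lighthouse implementation: the type name `TargetBalancesSketch`, its fields and its methods are assumptions made for illustration, balances are plain `u64` values, and the edge cases raised in review (slashings, end-of-epoch balance updates) are left out.

```rust
/// Illustrative only: names and layout are assumptions, not Lighthouse's types.
#[derive(Debug)]
pub struct TargetBalancesSketch {
    current_epoch: u64,
    previous_epoch_target_attesting_balance: u64,
    current_epoch_target_attesting_balance: u64,
}

#[derive(Debug, PartialEq)]
pub enum SketchError {
    /// The attestation targets an epoch other than the current or previous one.
    EpochOutOfRange { epoch: u64 },
    /// Adding the balance would overflow a u64.
    Overflow,
}

impl TargetBalancesSketch {
    pub fn new(current_epoch: u64) -> Self {
        Self {
            current_epoch,
            previous_epoch_target_attesting_balance: 0,
            current_epoch_target_attesting_balance: 0,
        }
    }

    /// Add the effective balance of a validator newly seen attesting to the
    /// target of `epoch`. Only the running totals are touched, so no pass over
    /// the participation arrays is needed.
    pub fn on_new_target_attestation(
        &mut self,
        epoch: u64,
        effective_balance: u64,
    ) -> Result<(), SketchError> {
        let total = if epoch == self.current_epoch {
            &mut self.current_epoch_target_attesting_balance
        } else if epoch.checked_add(1) == Some(self.current_epoch) {
            &mut self.previous_epoch_target_attesting_balance
        } else {
            return Err(SketchError::EpochOutOfRange { epoch });
        };
        *total = total
            .checked_add(effective_balance)
            .ok_or(SketchError::Overflow)?;
        Ok(())
    }

    /// At an epoch boundary the current total becomes the previous total.
    pub fn on_epoch_transition(&mut self) {
        self.current_epoch += 1;
        self.previous_epoch_target_attesting_balance =
            self.current_epoch_target_attesting_balance;
        self.current_epoch_target_attesting_balance = 0;
    }

    pub fn previous(&self) -> u64 {
        self.previous_epoch_target_attesting_balance
    }

    pub fn current(&self) -> u64 {
        self.current_epoch_target_attesting_balance
    }
}
```

With totals maintained this way, a consumer such as fork choice reads two integers per block instead of recomputing them from participation data.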

This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled|checked|strict|fast` flag is introduced to support this:

- `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. This is the default mode for now.
- `strict`: enabled with checks against participation cache, returns error if there is a mismatch. Used for testing only.
- `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release.
- `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs.
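
The four modes can be pictured as a small enum parsed from the flag value. The sketch below is an assumption about shape only: the `Checked` and `Strict` variant names appear in the review diff for this PR, while the `Fast` and `Disabled` variants, the `FromStr` impl and the two helper methods are hypothetical and written purely to restate the list above.

```rust
use std::str::FromStr;

/// Sketch of the flag values; `Fast`, `Disabled` and the helpers are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProgressiveBalancesMode {
    Disabled,
    Checked,
    Strict,
    Fast,
}

impl FromStr for ProgressiveBalancesMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "disabled" => Ok(Self::Disabled),
            "checked" => Ok(Self::Checked),
            "strict" => Ok(Self::Strict),
            "fast" => Ok(Self::Fast),
            other => Err(format!("invalid progressive balances mode: {other}")),
        }
    }
}

impl ProgressiveBalancesMode {
    /// Hypothetical helper: does this mode read the progressive balances cache?
    pub fn uses_progressive_cache(self) -> bool {
        !matches!(self, Self::Disabled)
    }

    /// Hypothetical helper: must the participation cache still be computed?
    /// Only `fast` skips it, which is where the performance gain comes from.
    pub fn needs_participation_cache(self) -> bool {
        !matches!(self, Self::Fast)
    }
}
```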

Tasks

- [x] Initial cache implementation in BeaconState
- [x] Perform checks in fork choice to compare the progressive balances cache against results from ParticipationCache
- [x] Add CLI flag, and disable the optimization by default
- [x] Testing on Goerli & Benchmarking
- [x] Move caching logic from state processing to the ProgressiveBalancesCache (see this comment)
- [x] Add attesting balance metrics


jimmygchen · 2023-06-09T11:55:43Z

Benchmarks from initial testing look pretty good: the beacon_fork_choice_process_block_seconds metric improved quite significantly with --progressive-balances fast.

In terms of correctness, it looks pretty good so far. I've been running a node with --progressive-balances checked for 24 hours and haven't seen any mismatch with the old method (ParticipationCache) yet. I will keep monitoring.

jimmygchen · 2023-06-26T07:17:07Z

Alternate name ideas?

build_most_caches

build_intrinsic_caches (where "intrinsic" = caches that don't need state processing). However, this is a bit inaccurate because the tree_hash_cache doesn't need state_processing...

??

Renamed to build_caches as discussed.


paulhauner

This is looking really good! There's quite a few edge-cases in here that have been covered well (e.g., slashings, balance updates at the end of an epoch).

I just have a few very minor suggestions/comments!

testing/ef_tests/src/cases/fork_choice.rs

consensus/state_processing/src/common/update_progressive_balances_cache.rs

paulhauner · 2023-06-29T06:51:56Z

consensus/fork_choice/src/fork_choice.rs

+    // Load cached balances
+    let progressive_balances_cache: &ProgressiveBalancesCache = state.progressive_balances_cache();
+    let previous_target_balance =
+        progressive_balances_cache.previous_epoch_target_attesting_balance()?;


I notice that we're expecting to have an initialized progressive_balances_cache and that we're going to error-out on on_block if we don't.

I understand that we are expecting an initialized cache since it's always called at the start of per_block_processing. This seems like a reasonable assumption to me, since I don't see any reason why we'd be processing a state in on_block that didn't just have a state applied to it.

Perhaps it would be more in the spirit of Checked mode for us just to default back to the non-optimized method if we happen to discover an uninitialized progressive_balances_cache?

> Perhaps it would be more in the spirit of Checked mode for us just to default back to the non-optimized method if we happen to discover an uninitialized progressive_balances_cache?

This is a great point! Checked mode should always fall back to the non-optimized method if there's anything wrong with the progressive_balances_cache. I've rearranged the code a bit so that, in Checked mode, it falls back to the original method if this function fails for any reason.

Updated in 72935e1

Oops, missed a check in the previous commit and broke some tests. Fixed in f49daa0.
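
As a rough illustration of the behaviour discussed in this thread, the sketch below models a cache whose getter errors when it has not been initialized, and a Checked-style caller that treats any error or mismatch as a reason to use the value from the old method. Only the getter name `previous_epoch_target_attesting_balance` comes from the diff above; `LazyBalances`, `CacheError` and `checked_balance` are hypothetical names, and the real code works with states and caches rather than bare integers.

```rust
#[derive(Debug, PartialEq)]
pub enum CacheError {
    NotInitialized,
}

/// Hypothetical stand-in for a cache that is only populated during block processing.
#[derive(Debug, Default)]
pub struct LazyBalances {
    inner: Option<u64>,
}

impl LazyBalances {
    pub fn initialize(&mut self, previous_epoch_target_balance: u64) {
        self.inner = Some(previous_epoch_target_balance);
    }

    /// Errors instead of panicking when the cache was never initialized.
    pub fn previous_epoch_target_attesting_balance(&self) -> Result<u64, CacheError> {
        self.inner.ok_or(CacheError::NotInitialized)
    }
}

/// Checked-style read: trust the cached value only if it exists and agrees
/// with `expected` (the value from the non-optimized method); otherwise use
/// `expected`.
pub fn checked_balance(cache: &LazyBalances, expected: u64) -> u64 {
    match cache.previous_epoch_target_attesting_balance() {
        Ok(cached) if cached == expected => cached,
        _ => expected,
    }
}
```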


paulhauner · 2023-06-29T23:25:16Z

consensus/fork_choice/src/fork_choice.rs

+                                if progressive_balances_mode == ProgressiveBalancesMode::Checked {
+                                    error!(
+                                        log,
+                                        "Processing with progressive balances cache failed in checked mode";
+                                        "info" => "falling back to the non-optimized processing method",
+                                        "error" => ?e,
+                                    );
+                                    per_epoch_processing::altair::process_justification_and_finalization(
+                                        state,
+                                        maybe_participation_cache.as_ref().ok_or(Error::ParticipationCacheMissing)?,
+                                    ).map_err(Error::from)


Suggested change

-    if progressive_balances_mode == ProgressiveBalancesMode::Checked {
-        error!(
-            log,
-            "Processing with progressive balances cache failed in checked mode";
-            "info" => "falling back to the non-optimized processing method",
-            "error" => ?e,
-        );
-        per_epoch_processing::altair::process_justification_and_finalization(
-            state,
-            maybe_participation_cache.as_ref().ok_or(Error::ParticipationCacheMissing)?,
-        ).map_err(Error::from)
+    if progressive_balances_mode != ProgressiveBalancesMode::Strict {
+        error!(
+            log,
+            "Processing with progressive balances cache failed in checked mode";
+            "info" => "falling back to the non-optimized processing method",
+            "error" => ?e,
+        );
+        let participation_cache = maybe_participation_cache
+            .map(Result::Ok)
+            .unwrap_or_else(|| ParticipationCache::new(state, spec))
+            .map_err(Error::ParticipationCacheBuild)?;
+        per_epoch_processing::altair::process_justification_and_finalization(
+            state,
+            &participation_cache,
+        ).map_err(Error::from)

We could be a little bit more paranoid here and always fall back to the old method if the new method fails (unless we're in Strict mode)?

Nice, being paranoid is good for us. I need more of this actually; it's a bit of a mindset shift from what I have been used to for a while!
This is definitely much better error handling (including the unwrap_or_else fallback). I will update to the suggested approach and remove the now-unnecessary ParticipationCacheMissing error variant.

Updated in 90acdc7. Thanks 🙏
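
For readers unfamiliar with the `map(Result::Ok).unwrap_or_else(..)` idiom in the suggestion, here is a self-contained sketch of the same control flow with simplified stand-in types. `Mode`, `SlowCache`, `build_slow_cache` and `target_balance` are hypothetical names, and the integer arguments replace the real state and caches; the point is only that an already-built cache is reused when present, built on demand when absent, and that the fallback is skipped in strict mode.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Checked,
    Strict,
    Fast,
}

#[derive(Debug, PartialEq)]
pub enum Error {
    ProgressiveCache(String),
    ParticipationCacheBuild(String),
}

/// Stand-in for the expensive cache used by the non-optimized method.
#[derive(Debug)]
pub struct SlowCache {
    pub target_balance: u64,
}

/// Stand-in for building the expensive cache from the state.
fn build_slow_cache(state_balance: u64) -> Result<SlowCache, String> {
    Ok(SlowCache { target_balance: state_balance })
}

pub fn target_balance(
    mode: Mode,
    fast_result: Result<u64, String>,
    maybe_slow_cache: Option<SlowCache>,
    state_balance: u64,
) -> Result<u64, Error> {
    match fast_result {
        Ok(balance) => Ok(balance),
        // Strict mode surfaces the failure so tests fail loudly.
        Err(e) if mode == Mode::Strict => Err(Error::ProgressiveCache(e)),
        // Every other mode falls back to the old method.
        Err(_) => {
            let slow_cache = maybe_slow_cache
                // Option<T> -> Option<Result<T, E>>: reuse a cache we already have...
                .map(Ok)
                // ...or build one lazily, which may itself fail.
                .unwrap_or_else(|| build_slow_cache(state_balance))
                .map_err(Error::ParticipationCacheBuild)?;
            Ok(slow_cache.target_balance)
        }
    }
}
```

Compared with `as_ref().ok_or(..)?`, this shape removes the "cache missing" failure path: a missing cache becomes a reason to build one rather than an error.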


paulhauner

This looks great! I have full confidence that the new implementation is correct, and it doesn't even matter if it's not! 🎉

paulhauner · 2023-06-30T01:12:45Z

bors r+

…tion (#4362)

## Issue Addressed

#4118

## Proposed Changes

This PR introduces a "progressive balances" cache on the `BeaconState`, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive).

This optimization will be rolled out gradually to allow for more testing. A new `--progressive-balances disabled|checked|strict|fast` flag is introduced to support this:

- `checked`: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. **This is the default mode for now.**
- `strict`: enabled with checks against participation cache, returns error if there is a mismatch. **Used for testing only**.
- `fast`: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release.
- `disabled`: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs.

### Tasks

- [x] Initial cache implementation in `BeaconState`
- [x] Perform checks in fork choice to compare the progressive balances cache against results from `ParticipationCache`
- [x] Add CLI flag, and disable the optimization by default
- [x] Testing on Goerli & Benchmarking
- [x] Move caching logic from state processing to the `ProgressiveBalancesCache` (see [this comment](#4362 (comment)))
- [x] Add attesting balance metrics

Co-authored-by: Jimmy Chen <jimmy@sigmaprime.io>

bors · 2023-06-30T03:46:49Z

Pull request successfully merged into unstable.

Build succeeded!



Initial implementation of ProgressiveTotalBalances cache

9eb6f5b

jimmygchen added the work-in-progress and optimization labels Jun 1, 2023

jimmygchen added 6 commits June 1, 2023 23:40

Use ProgressiveTotalBalances cache to process justification and finalization if balances are available.

c650a71

Add ProgressiveTotalBalances cache to BeaconState clone config

36ca37a

Revert incorrect change in per_epoch_processing and check progressive balances cache in fork choice.

7e2212c

Initialize ProgressiveTotalBalances cache in per_block_processing

aeb086a

Passing fork choice tests \o/

0703bcd

Fix failing per_block_processing tests

d31b42e

jimmygchen added the consensus label Jun 2, 2023

jimmygchen force-pushed the unrealized-ffg-progressive branch 2 times, most recently from d3e25db to e84d432 Compare June 2, 2023 14:36

Fix ef tests

f7cf5dd

jimmygchen force-pushed the unrealized-ffg-progressive branch 3 times, most recently from 6dd579d to fef3e16 Compare June 6, 2023 15:58

Fix progressive balances mismatch. Perform checks only in debug mode. Some cleanups.

787f67d

jimmygchen force-pushed the unrealized-ffg-progressive branch from fef3e16 to 787f67d Compare June 6, 2023 16:00

Rename to progressive_balances_cache and add some documentation.

4dbb04c

jimmygchen changed the title ~~Optimise unrealized FFG progression calculation~~ Cache progressive balances for unrealized FFG progression calculation Jun 7, 2023

michaelsproul reviewed Jun 7, 2023

consensus/state_processing/src/common/initialize_progressive_balances_cache.rs

Add --progressive-balances flag and implement flag behaviour

17d7525

jimmygchen changed the title ~~Cache progressive balances for unrealized FFG progression calculation~~ Cache target attester balances for unrealized FFG progression calculation Jun 7, 2023

jimmygchen added 2 commits June 7, 2023 23:15

Use Balance type in ProgressiveBalancesCache and remove balance conversion workaround.

33e6b70

Error handling cleanup. Use automatic error conversion.

70f0fa1

jimmygchen self-assigned this Jun 9, 2023

jimmygchen marked this pull request as ready for review June 9, 2023 13:53

jimmygchen added 2 commits June 13, 2023 15:34

Merge branch 'unstable' into unrealized-ffg-progressive

3044a94

Add more logging to progressive cache mismatch

21642f4

Merge branch 'unstable' into unrealized-ffg-progressive

2350ca3

jimmygchen added 2 commits June 26, 2023 19:33

Fix compilation errors in tests

617bd3a

Rename build_all_caches as we're now only building 3/5 caches on `BeaconState`

1da56b6

paulhauner added the v4.3.0 label Jun 29, 2023

paulhauner requested changes Jun 29, 2023


jimmygchen added 4 commits June 29, 2023 21:52

Use "strict" progressive balances mode in EF tests so we start failing tests if there's a mismatch.

2c9519e

Update progressive balances cache metrics at the end of `per_block_processing` rather than per attestation / slashing.

964f0bd

Falls back to the non-optimized unrealized FFG processing method if `progressive_balances_cache` isn't initialized.

72935e1

Add missing checks when updating progressive balances metrics.

f49daa0

jimmygchen force-pushed the unrealized-ffg-progressive branch from 6ba7e45 to f49daa0 Compare June 29, 2023 13:32

jimmygchen requested a review from paulhauner June 29, 2023 13:37

Update num_caches to 5 in test.

b12c091

paulhauner reviewed Jun 29, 2023


Always fallback to the un-optimized processing method unless we're in strict mode. Improved error handling from review suggestions. Co-authored-by: Paul Hauner <6660660+paulhauner@users.noreply.github.com>

90acdc7

paulhauner approved these changes Jun 30, 2023


paulhauner added the ready-for-merge label and removed the ready-for-review label Jun 30, 2023

bors bot changed the title ~~Cache target attester balances for unrealized FFG progression calculation~~ [Merged by Bors] - Cache target attester balances for unrealized FFG progression calculation Jun 30, 2023

bors bot closed this Jun 30, 2023

paulhauner mentioned this pull request Jun 30, 2023

Optimise unrealized FFG progression calculation #4118

Closed

jimmygchen deleted the unrealized-ffg-progressive branch August 8, 2023 11:31

AgeManning mentioned this pull request Aug 22, 2023

Propose Jimmy Chen protocolguild/documentation#100

Merged

michaelsproul mentioned this pull request Dec 3, 2023

Enable progressive balances fast mode by default #4971

Merged
