[Merged by Bors] - Cache target attester balances for unrealized FFG progression calculation #4362

Closed
wants to merge 40 commits into from

jimmygchen
Member

@jimmygchen jimmygchen commented Jun 1, 2023

Issue Addressed

#4118

Proposed Changes

This PR introduces a "progressive balances" cache on the BeaconState, which keeps track of the accumulated target attestation balance for the current & previous epochs. The cached values are utilised by fork choice to calculate unrealized justification and finalization (instead of converting epoch participation arrays to balances for each block we receive).
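
As a rough illustration of the idea (not the code in this PR), the cache can be thought of as two running totals that are bumped as attestations are processed and rotated at the epoch boundary. The type `ProgressiveBalancesSketch`, its error enum, and all method names below are hypothetical. The sketch also assumes the caller only reports a validator the first time it gains the timely-target flag, and it leaves out slashings and effective balance changes, which a real implementation has to handle.

```rust
/// Hypothetical, simplified stand-in for the progressive balances cache.
/// Names are illustrative only and are not Lighthouse's actual API.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProgressiveBalancesSketch {
    current_epoch: u64,
    previous_epoch_target_attesting_balance: u64,
    current_epoch_target_attesting_balance: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    /// The attestation targets an epoch other than the current or previous one.
    EpochOutOfRange { cache_epoch: u64, target_epoch: u64 },
    /// An addition would overflow `u64`.
    Overflow,
}

impl ProgressiveBalancesSketch {
    pub fn new(current_epoch: u64) -> Self {
        Self {
            current_epoch,
            previous_epoch_target_attesting_balance: 0,
            current_epoch_target_attesting_balance: 0,
        }
    }

    /// Add a validator's effective balance to the running total for the
    /// epoch its attestation targets. Assumes the caller de-duplicates
    /// validators (each one is reported at most once per epoch).
    pub fn on_new_target_attestation(
        &mut self,
        target_epoch: u64,
        effective_balance: u64,
    ) -> Result<(), SketchError> {
        let cache_epoch = self.current_epoch;
        let total = if target_epoch == cache_epoch {
            &mut self.current_epoch_target_attesting_balance
        } else if target_epoch.checked_add(1) == Some(cache_epoch) {
            &mut self.previous_epoch_target_attesting_balance
        } else {
            return Err(SketchError::EpochOutOfRange { cache_epoch, target_epoch });
        };
        *total = (*total)
            .checked_add(effective_balance)
            .ok_or(SketchError::Overflow)?;
        Ok(())
    }

    /// Rotate the totals at an epoch boundary: "current" becomes "previous"
    /// and a fresh "current" total starts from zero.
    pub fn on_epoch_transition(&mut self) -> Result<(), SketchError> {
        self.current_epoch = self
            .current_epoch
            .checked_add(1)
            .ok_or(SketchError::Overflow)?;
        self.previous_epoch_target_attesting_balance =
            self.current_epoch_target_attesting_balance;
        self.current_epoch_target_attesting_balance = 0;
        Ok(())
    }

    pub fn previous_epoch_target_attesting_balance(&self) -> u64 {
        self.previous_epoch_target_attesting_balance
    }

    pub fn current_epoch_target_attesting_balance(&self) -> u64 {
        self.current_epoch_target_attesting_balance
    }
}
```

With totals maintained this way, reading the two balances is a constant-time lookup rather than a pass over the participation arrays for every block.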

This optimization will be rolled out gradually to allow for more testing. A new --progressive-balances disabled|checked|strict|fast flag is introduced to support this:

  • checked: enabled with checks against participation cache, and falls back to the existing epoch processing calculation if there is a total target attester balance mismatch. There is no performance gain from this as the participation cache still needs to be computed. This is the default mode for now.
  • strict: enabled with checks against participation cache, returns error if there is a mismatch. Used for testing only.
  • fast: enabled with no comparative checks and without computing the participation cache. This mode gives us the performance gains from the optimization. This is still experimental and not currently recommended for production usage, but will become the default mode in a future release.
  • disabled: disable the usage of progressive cache, and use the existing method for FFG progression calculation. This mode may be useful if we find a bug and want to stop the frequent error logs.
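
The four modes above can be summarised in a small sketch. This is a simplification under stated assumptions, not the PR's code: the `Disabled` and `Fast` variant names are inferred from the flag values, and `select_target_balance` is a hypothetical helper that takes the cached total plus a closure standing in for the existing (slower) calculation.

```rust
use std::str::FromStr;

/// Mirrors the four values of `--progressive-balances`.
/// `Disabled` and `Fast` are assumed variant names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProgressiveBalancesMode {
    Disabled,
    Checked,
    Strict,
    Fast,
}

impl FromStr for ProgressiveBalancesMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "disabled" => Ok(Self::Disabled),
            "checked" => Ok(Self::Checked),
            "strict" => Ok(Self::Strict),
            "fast" => Ok(Self::Fast),
            other => Err(format!("unknown progressive balances mode: {other}")),
        }
    }
}

/// Hypothetical helper: choose the total target attesting balance to use.
/// `recompute` stands in for the existing participation-based calculation
/// and is only invoked in the modes that need it.
pub fn select_target_balance(
    mode: ProgressiveBalancesMode,
    cached: u64,
    recompute: impl FnOnce() -> u64,
) -> Result<u64, String> {
    match mode {
        // Ignore the cache entirely.
        ProgressiveBalancesMode::Disabled => Ok(recompute()),
        // Trust the cache and skip the expensive calculation.
        ProgressiveBalancesMode::Fast => Ok(cached),
        // Compare the two; on mismatch, fall back (checked) or fail (strict).
        ProgressiveBalancesMode::Checked | ProgressiveBalancesMode::Strict => {
            let expected = recompute();
            if cached == expected {
                Ok(cached)
            } else if mode == ProgressiveBalancesMode::Checked {
                Ok(expected)
            } else {
                Err(format!("balance mismatch: cached {cached}, expected {expected}"))
            }
        }
    }
}
```

Only `fast` avoids calling `recompute`, which is why it is the only mode that yields a performance gain.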

Tasks

  • Initial cache implementation in BeaconState
  • Perform checks in fork choice to compare the progressive balances cache against results from ParticipationCache
  • Add CLI flag, and disable the optimization by default
  • Testing on Goerli & Benchmarking
  • Move caching logic from state processing to the ProgressiveBalancesCache (see this comment)
  • Add attesting balance metrics

@jimmygchen jimmygchen added work-in-progress PR is a work-in-progress optimization Something to make Lighthouse run more efficiently. labels Jun 1, 2023
@jimmygchen jimmygchen added the consensus An issue/PR that touches consensus code, such as state_processing or block verification. label Jun 2, 2023
@jimmygchen jimmygchen force-pushed the unrealized-ffg-progressive branch 2 times, most recently from d3e25db to e84d432 Compare June 2, 2023 14:36
@jimmygchen jimmygchen force-pushed the unrealized-ffg-progressive branch 3 times, most recently from 6dd579d to fef3e16 Compare June 6, 2023 15:58
@jimmygchen jimmygchen force-pushed the unrealized-ffg-progressive branch from fef3e16 to 787f67d Compare June 6, 2023 16:00
@jimmygchen jimmygchen changed the title Optimise unrealized FFG progression calculation Cache progressive balances for unrealized FFG progression calculation Jun 7, 2023
@jimmygchen jimmygchen changed the title Cache progressive balances for unrealized FFG progression calculation Cache target attester balances for unrealized FFG progression calculation Jun 7, 2023
@jimmygchen
Member Author

Benchmarks from initial testing look pretty good: the beacon_fork_choice_process_block_seconds metric improved quite significantly with --progressive-balances fast.

image

In terms of correctness, it looks pretty good so far: I've been running a node with --progressive-balances checked for 24 hours and haven't seen any mismatch with the old method (ParticipationCache) yet. Will keep monitoring.

@jimmygchen jimmygchen self-assigned this Jun 9, 2023
@jimmygchen jimmygchen marked this pull request as ready for review June 9, 2023 13:53
@jimmygchen
Member Author

Alternate name ideas?

  • build_most_caches
  • build_intrinsic_caches (where "intrinsic" = caches that don't need state processing). However, this is a bit inaccurate because the tree_hash_cache doesn't need state_processing...
  • ??

Renamed to build_caches as discussed.

@paulhauner paulhauner added the v4.3.0 Estimated Q2 2023 label Jun 29, 2023
Member

@paulhauner paulhauner left a comment

This is looking really good! There are quite a few edge cases in here that have been covered well (e.g., slashings, balance updates at the end of an epoch).

I just have a few very minor suggestions/comments!

testing/ef_tests/src/cases/fork_choice.rs (outdated)
// Load cached balances
let progressive_balances_cache: &ProgressiveBalancesCache = state.progressive_balances_cache();
let previous_target_balance =
    progressive_balances_cache.previous_epoch_target_attesting_balance()?;
Member

I notice that we're expecting to have an initialized progressive_balances_cache and that we're going to error-out on on_block if we don't.

I understand that we are expecting an initialized cache since it's always called at the start of per_block_processing. This seems like a reasonable assumption to me, since I don't see any reason why we'd be processing a state in on_block that didn't just have a block applied to it.

Perhaps it would be more in the spirit of Checked mode for us just to default back to the non-optimized method if we happen to discover an uninitialized progressive_balances_cache?

Member Author

> Perhaps it would be more in the spirit of Checked mode for us just to default back to the non-optimized method if we happen to discover an uninitialized progressive_balances_cache?

This is a great point! Checked mode should always fall back to the non-optimized method if there's anything wrong with the progressive_balances_cache. I've re-arranged the code a bit so that, in Checked mode, it falls back to the original method if this function fails for whatever reason.

Updated in 72935e1

Member Author

Oops, missed a check in the previous commit and broke some tests. Fixed in f49daa0.

@jimmygchen jimmygchen force-pushed the unrealized-ffg-progressive branch from 6ba7e45 to f49daa0 Compare June 29, 2023 13:32
@jimmygchen jimmygchen requested a review from paulhauner June 29, 2023 13:37
Comment on lines 799 to 809
if progressive_balances_mode == ProgressiveBalancesMode::Checked {
    error!(
        log,
        "Processing with progressive balances cache failed in checked mode";
        "info" => "falling back to the non-optimized processing method",
        "error" => ?e,
    );
    per_epoch_processing::altair::process_justification_and_finalization(
        state,
        maybe_participation_cache.as_ref().ok_or(Error::ParticipationCacheMissing)?,
    ).map_err(Error::from)
Member

Suggested change
- if progressive_balances_mode == ProgressiveBalancesMode::Checked {
-     error!(
-         log,
-         "Processing with progressive balances cache failed in checked mode";
-         "info" => "falling back to the non-optimized processing method",
-         "error" => ?e,
-     );
-     per_epoch_processing::altair::process_justification_and_finalization(
-         state,
-         maybe_participation_cache.as_ref().ok_or(Error::ParticipationCacheMissing)?,
-     ).map_err(Error::from)
+ if progressive_balances_mode != ProgressiveBalancesMode::Strict {
+     error!(
+         log,
+         "Processing with progressive balances cache failed in checked mode";
+         "info" => "falling back to the non-optimized processing method",
+         "error" => ?e,
+     );
+     let participation_cache = maybe_participation_cache
+         .map(Result::Ok)
+         .unwrap_or_else(|| ParticipationCache::new(state, spec))
+         .map_err(Error::ParticipationCacheBuild)?;
+     per_epoch_processing::altair::process_justification_and_finalization(
+         state,
+         &participation_cache,
+     ).map_err(Error::from)

We could be a little bit more paranoid here and always fall back to the old method if the new method fails (unless we're in Strict mode)?
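
The `.map(Result::Ok).unwrap_or_else(...)` chain in the suggestion is a general Rust idiom for "use the precomputed value if we have one, otherwise build it now and propagate any build error". A standalone sketch of that idiom follows; `ParticipationSummary`, `BuildError`, `build_summary` and `summary_or_build` are hypothetical stand-ins and not Lighthouse types.

```rust
/// Hypothetical stand-in for an expensive-to-build cache.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParticipationSummary {
    pub total_target_balance: u64,
}

/// Hypothetical error returned when building the summary fails.
#[derive(Debug, PartialEq, Eq)]
pub struct BuildError(pub String);

/// Fallible constructor: sums the balances, failing on `u64` overflow.
pub fn build_summary(balances: &[u64]) -> Result<ParticipationSummary, BuildError> {
    let mut total: u64 = 0;
    for balance in balances {
        total = total
            .checked_add(*balance)
            .ok_or_else(|| BuildError("balance overflow".to_string()))?;
    }
    Ok(ParticipationSummary { total_target_balance: total })
}

/// Reuse `maybe_summary` when present; otherwise build one on demand.
/// `Option<T>` is lifted to `Option<Result<T, E>>` so that both branches
/// produce the same `Result` type.
pub fn summary_or_build(
    maybe_summary: Option<ParticipationSummary>,
    balances: &[u64],
) -> Result<ParticipationSummary, BuildError> {
    maybe_summary
        .map(Result::Ok)
        .unwrap_or_else(|| build_summary(balances))
}
```

The same behaviour can be written as `maybe_summary.map_or_else(|| build_summary(balances), Ok)`; which form reads better is a matter of taste. Either way, a missing value no longer needs its own error variant, because the fallback constructs the value instead of failing.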

Member Author

@jimmygchen jimmygchen Jun 30, 2023

Nice, being paranoid is good for us - I need more of this actually, it's a bit of a mindset shift from what I have been used to for a while!
This is definitely much better error handling (including the unwrap_or_else fallback)! I'll update to the suggested approach and remove the now-unnecessary ParticipationCacheMissing error variant.

Member Author

Updated in 90acdc7. Thanks 🙏

… strict mode. Improved error handling from review suggestions.

Co-authored-by: Paul Hauner <6660660+paulhauner@users.noreply.github.com>
Member

@paulhauner paulhauner left a comment

This looks great! I have full confidence that the new implementation is correct, and it doesn't even matter if it's not! 🎉

@paulhauner paulhauner added ready-for-merge This PR is ready to merge. and removed ready-for-review The code is ready for review labels Jun 30, 2023
@paulhauner
Member

bors r+

bors bot pushed a commit that referenced this pull request Jun 30, 2023
@bors

bors bot commented Jun 30, 2023

@bors bors bot changed the title Cache target attester balances for unrealized FFG progression calculation [Merged by Bors] - Cache target attester balances for unrealized FFG progression calculation Jun 30, 2023
@bors bors bot closed this Jun 30, 2023
ghost pushed a commit to oone-world/lighthouse that referenced this pull request Jul 13, 2023
@jimmygchen jimmygchen deleted the unrealized-ffg-progressive branch August 8, 2023 11:31
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
Woodpile37 pushed a commit to Woodpile37/lighthouse that referenced this pull request Jan 6, 2024
Labels
consensus An issue/PR that touches consensus code, such as state_processing or block verification. optimization Something to make Lighthouse run more efficiently. ready-for-merge This PR is ready to merge. v4.3.0 Estimated Q2 2023
3 participants