Improve next epoch shuffling computation #6268

twoeths · 2024-01-09T04:12:16Z

Problem description

Currently, the next epoch shuffling computation in afterProcessEpoch takes the largest share of epoch transition time under normal conditions. Under GC-intensive conditions, processRewardsAndPenalties takes longer (I have seen it take up to 1.8s), see #6229.

this is a holesky profile:

Solution description

As discussed with @dapplion, we can take these 2 approaches:

Right after beforeProcessEpoch, we have nextEpochShufflingActiveValidatorIndices and a finalized randaoMixes, so we should be able to offload the next epoch shuffling computation to a worker thread; the epoch transition then becomes async.
unshuffleList is the main part of the shuffling computation; we should be able to implement it in AssemblyScript, importing as-sha256 from there.

Additional context

No response


dapplion · 2024-01-09T04:17:56Z

epoch transition becomes async

We can have a flag to skip the unshuffle computation and delegate it to the caller of the epoch transition, similar to what we do with execution payload verification.
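A minimal sketch of what such a flag could look like. The names `skipNextShuffling`, `processEpochSketch`, and `placeholderShuffle` are hypothetical and are not Lodestar APIs; the placeholder simply reverses the indices so the example stays deterministic, and it stands in for the real unshuffle computation.

```typescript
type Epoch = number;

// Hypothetical option: when true, the caller is responsible for computing the next shuffling.
interface EpochTransitionOpts {
  skipNextShuffling?: boolean;
}

interface EpochTransitionResult {
  nextEpoch: Epoch;
  // Inputs the caller needs in order to compute the shuffling later.
  activeIndices: Uint32Array;
  // Only set when the shuffling was computed inline.
  nextShuffling?: Uint32Array;
}

// Stand-in for the expensive unshuffle step (assumption: NOT the real algorithm).
function placeholderShuffle(activeIndices: Uint32Array): Uint32Array {
  return Uint32Array.from(activeIndices).reverse();
}

function processEpochSketch(
  currentEpoch: Epoch,
  activeIndices: Uint32Array,
  opts: EpochTransitionOpts = {}
): EpochTransitionResult {
  const result: EpochTransitionResult = {nextEpoch: currentEpoch + 1, activeIndices};
  if (!opts.skipNextShuffling) {
    // Default behaviour: compute synchronously as part of the transition.
    result.nextShuffling = placeholderShuffle(activeIndices);
  }
  return result;
}
```

With the flag set, the caller receives `activeIndices` and can decide when and where (for example off the main thread) to run the computation.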

twoeths · 2024-04-03T07:55:39Z

One proposal for this work, with some criteria:

keep state-transition async, no need to change it, just add some simple apis
ShufflingCache is only managed by beacon-node

Given that the epoch transition is mostly triggered by prepareNextSlot, and prepareNextSlot only needs BeaconStateAllForks, we don't need to compute the shuffling synchronously there.

twoeths · 2024-04-04T07:43:11Z

Another idea is to model a ShufflingSource for the next epoch. We always need the previous/current shuffling, but not the next shuffling, so it could be computed lazily on the beacon-node side after an epoch transition.

export type ShufflingSource = {
  epoch: Epoch;
  activeIndices: Uint32Array;

  // TODO: consider if we need this or not, in that case make it ShufflingSummary
  // committeesPerSlot: number;
  // committeeLens: number[][];
};

export type Shuffling = {
  shuffling: Uint32Array;
  committees: Uint32Array[][];
  committeesPerSlot: number;
};

export type EpochShuffling = ShufflingSource & Shuffling;

export function isEpochShuffling(shufflingOrIndices: ShufflingSource): shufflingOrIndices is EpochShuffling {
  return (shufflingOrIndices as EpochShuffling).shuffling !== undefined;
}

Then type nextShuffling as ShufflingSource:

  previousShuffling: EpochShuffling;
  currentShuffling: EpochShuffling;
  nextShuffling: ShufflingSource;

Then in afterProcessEpoch we don't need to compute the next shuffling:

afterProcessEpoch(
    state: BeaconStateAllForks,
    epochTransitionCache: {
      nextEpochShufflingActiveValidatorIndices: ValidatorIndex[];
      nextEpochTotalActiveBalanceByIncrement: number;
    }
  ): void {
    this.previousShuffling = this.currentShuffling;
    this.currentShuffling = isEpochShuffling(this.nextShuffling)
      ? this.nextShuffling
      // should rarely/never happen
      // TODO: track this as a metric
      : computeEpochShuffling(state, this.nextShuffling.activeIndices, this.nextShuffling.epoch);
    const currEpoch = this.currentShuffling.epoch;
    const nextEpoch = currEpoch + 1;
    this.nextShuffling = {
      epoch: nextEpoch,
      activeIndices: new Uint32Array(epochTransitionCache.nextEpochShufflingActiveValidatorIndices),
    };

Right after an epoch transition, beacon-node should compute nextShuffling and update CachedBeaconState, as well as ShufflingCache and the state caches.
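As a rough illustration of that last step, the sketch below shows how the beacon-node side could upgrade a ShufflingSource into a full EpochShuffling once, and reuse it afterwards. `EpochCacheSketch`, `resolveNextShuffling`, and `computeShufflingPlaceholder` are hypothetical names; the placeholder is not the real shuffling algorithm and produces a single committee only to keep the example small.

```typescript
type Epoch = number;
type ShufflingSource = {epoch: Epoch; activeIndices: Uint32Array};
type Shuffling = {shuffling: Uint32Array; committees: Uint32Array[][]; committeesPerSlot: number};
type EpochShuffling = ShufflingSource & Shuffling;

function isEpochShuffling(s: ShufflingSource): s is EpochShuffling {
  return (s as EpochShuffling).shuffling !== undefined;
}

// Stand-in for computeEpochShuffling (assumption: reverses indices, one committee).
function computeShufflingPlaceholder(source: ShufflingSource): EpochShuffling {
  const shuffling = Uint32Array.from(source.activeIndices).reverse();
  return {
    epoch: source.epoch,
    activeIndices: source.activeIndices,
    shuffling,
    committees: [[shuffling]],
    committeesPerSlot: 1,
  };
}

class EpochCacheSketch {
  nextShuffling: ShufflingSource;

  constructor(nextShuffling: ShufflingSource) {
    this.nextShuffling = nextShuffling;
  }

  // Intended to be called by the beacon-node after an epoch transition.
  resolveNextShuffling(): EpochShuffling {
    const next = this.nextShuffling;
    if (isEpochShuffling(next)) {
      return next; // already computed, nothing to do
    }
    const computed = computeShufflingPlaceholder(next);
    this.nextShuffling = computed; // later readers see the full shuffling
    return computed;
  }
}
```

If the beacon-node has done this before the following epoch transition, the `isEpochShuffling(this.nextShuffling)` branch in afterProcessEpoch takes the cheap path.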

twoeths · 2024-09-24T02:14:34Z

After #6938 was merged:

Epoch transition time is reduced because we moved the code to the next event loop.

But Prepare Next Epoch Duration is the same.

need to implement

lodestar/packages/beacon-node/src/chain/shufflingCache.ts

Line 183 in d0ba6bc

callInNextEventLoop(() => {
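To illustrate why moving work to the next event loop shortens the measured epoch transition without reducing the total work, here is a small sketch. `deferToNextEventLoop` and `epochTransitionSketch` are hypothetical names built on `setTimeout`; this is not the Lodestar `callInNextEventLoop` implementation.

```typescript
// Hypothetical helper: schedule fn to run after the current call stack finishes.
function deferToNextEventLoop(fn: () => void): void {
  setTimeout(fn, 0);
}

// Records the order in which the steps run.
function epochTransitionSketch(log: string[]): void {
  log.push("epoch transition start");
  deferToNextEventLoop(() => {
    // The expensive shuffling work would run here, on the same thread, just later.
    log.push("next shuffling computed");
  });
  log.push("epoch transition end");
}
```

The deferred callback still runs on the main thread, so anything that waits for the shuffling pays the same cost. That is consistent with the epoch transition time dropping while Prepare Next Epoch Duration stays the same.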

wemeetagain · 2024-10-22T19:37:51Z

fixed by feat: use rust shuffle #7120

twoeths added the meta-feature-request Issues to track feature requests. label Jan 9, 2024

philknows added prio-high Resolve issues as soon as possible. scope-performance Performance issue and ideas to improve performance. labels Jan 9, 2024

twoeths assigned matthewkeil Jan 23, 2024

philknows added this to the v1.16.0 milestone Jan 23, 2024

wemeetagain mentioned this issue Feb 5, 2024

Proposal for reorganizing shuffling code #6386

Closed

philknows modified the milestones: v1.16.0, Short-Term Features/Issues Mar 8, 2024

twoeths mentioned this issue Jul 2, 2024

chore: track Prepare Next Epoch heatmap #6928

Merged

philknows modified the milestones: Short-Term Features/Issues, v1.23.0 Sep 24, 2024

wemeetagain closed this as completed Oct 22, 2024
