Verify signatures on both main thread and worker threads #3793

Merged: twoeths merged 11 commits into master from tuyen/mixed-bls on Mar 2, 2022

Conversation

Contributor

@twoeths twoeths commented Feb 26, 2022

Motivation

We want a stable "BLS - Job Wait Time" metric; ideally it should stay below 100 ms consistently.

Description

  • Write a BlsMixedVerifier that sends batchable BLS jobs to worker threads and non-batchable ones to the main thread
  • We are unlikely to want singleThreadVerifier, so change the flag to useMultiThreadVerifier, defaulting to false
  • More explanation in "Improve BLS thread pool - job wait time" (#3792)

Closes #3792
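
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the types `SignatureSet`, `VerifyOpts`, `Target`, and the helpers `pickTarget` and `createMixedVerifier` are all hypothetical names, and the two underlying verifiers are stubs.

```typescript
// Hypothetical shapes for illustration; these are not Lodestar's real types.
type SignatureSet = {id: string};
type VerifyOpts = {batchable?: boolean};
type Target = "worker" | "main";

// Batchable jobs go to worker threads; everything else runs on the main thread.
function pickTarget(opts: VerifyOpts = {}): Target {
  return opts.batchable === true ? "worker" : "main";
}

type Verifier = (sets: SignatureSet[]) => Promise<boolean>;

// Builds a mixed verifier from two injected verifiers (both hypothetical).
function createMixedVerifier(verifiers: Record<Target, Verifier>) {
  return (sets: SignatureSet[], opts: VerifyOpts = {}): Promise<boolean> =>
    verifiers[pickTarget(opts)](sets);
}

// Usage with stub verifiers that only record where each job was sent.
const routed: Target[] = [];
const mixed = createMixedVerifier({
  worker: async () => {
    routed.push("worker");
    return true;
  },
  main: async () => {
    routed.push("main");
    return true;
  },
});
void mixed([{id: "job-a"}], {batchable: true});
void mixed([{id: "job-b"}]);
```

Keeping the routing decision in a small pure function makes it easy to test separately from the actual verification.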

@codecov

codecov bot commented Feb 26, 2022

Codecov Report

Merging #3793 (dfa6731) into master (060918e) will decrease coverage by 0.83%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #3793      +/-   ##
==========================================
- Coverage   36.83%   35.99%   -0.84%     
==========================================
  Files         322      324       +2     
  Lines        8827     9139     +312     
  Branches     1372     1481     +109     
==========================================
+ Hits         3251     3290      +39     
- Misses       5434     5706     +272     
- Partials      142      143       +1     

@github-actions
Contributor

github-actions bot commented Feb 26, 2022

Performance Report

✔️ no performance regression detected

Full benchmark results
| Benchmark suite | Current: 9748c59 | Previous: de66f0b | Ratio |
| --- | --- | --- | --- |
| BeaconState.hashTreeRoot - No change | 615.00 ns/op | 565.00 ns/op | 1.09 |
| BeaconState.hashTreeRoot - 1 full validator | 99.582 us/op | 81.267 us/op | 1.23 |
| BeaconState.hashTreeRoot - 32 full validator | 1.4664 ms/op | 1.1877 ms/op | 1.23 |
| BeaconState.hashTreeRoot - 512 full validator | 18.608 ms/op | 15.514 ms/op | 1.20 |
| BeaconState.hashTreeRoot - 1 validator.effectiveBalance | 104.28 us/op | 80.699 us/op | 1.29 |
| BeaconState.hashTreeRoot - 32 validator.effectiveBalance | 1.6984 ms/op | 1.5535 ms/op | 1.09 |
| BeaconState.hashTreeRoot - 512 validator.effectiveBalance | 23.064 ms/op | 18.684 ms/op | 1.23 |
| BeaconState.hashTreeRoot - 1 balances | 75.086 us/op | 58.446 us/op | 1.28 |
| BeaconState.hashTreeRoot - 32 balances | 635.68 us/op | 496.66 us/op | 1.28 |
| BeaconState.hashTreeRoot - 512 balances | 5.7713 ms/op | 4.8987 ms/op | 1.18 |
| BeaconState.hashTreeRoot - 250000 balances | 116.02 ms/op | 89.623 ms/op | 1.29 |
| processSlot - 1 slots | 60.045 us/op | 41.564 us/op | 1.44 |
| processSlot - 32 slots | 2.8272 ms/op | 2.1441 ms/op | 1.32 |
| getCommitteeAssignments - req 1 vs - 250000 vc | 5.3832 ms/op | 5.3005 ms/op | 1.02 |
| getCommitteeAssignments - req 100 vs - 250000 vc | 7.5493 ms/op | 7.4402 ms/op | 1.01 |
| getCommitteeAssignments - req 1000 vs - 250000 vc | 8.3457 ms/op | 7.9737 ms/op | 1.05 |
| computeProposers - vc 250000 | 25.712 ms/op | 20.919 ms/op | 1.23 |
| computeEpochShuffling - vc 250000 | 199.60 ms/op | 189.15 ms/op | 1.06 |
| getNextSyncCommittee - vc 250000 | 406.37 ms/op | 345.63 ms/op | 1.18 |
| altair processAttestation - 250000 vs - 7PWei normalcase | 35.535 ms/op | 33.136 ms/op | 1.07 |
| altair processAttestation - 250000 vs - 7PWei worstcase | 42.196 ms/op | 32.058 ms/op | 1.32 |
| altair processAttestation - setStatus - 1/6 committees join | 13.498 ms/op | 11.058 ms/op | 1.22 |
| altair processAttestation - setStatus - 1/3 committees join | 29.839 ms/op | 23.300 ms/op | 1.28 |
| altair processAttestation - setStatus - 1/2 committees join | 44.836 ms/op | 35.591 ms/op | 1.26 |
| altair processAttestation - setStatus - 2/3 committees join | 60.987 ms/op | 48.260 ms/op | 1.26 |
| altair processAttestation - setStatus - 4/5 committees join | 72.732 ms/op | 56.790 ms/op | 1.28 |
| altair processAttestation - setStatus - 100% committees join | 82.628 ms/op | 71.860 ms/op | 1.15 |
| altair processAttestation - updateEpochParticipants - 1/6 committees join | 15.309 ms/op | 12.808 ms/op | 1.20 |
| altair processAttestation - updateEpochParticipants - 1/3 committees join | 30.537 ms/op | 21.637 ms/op | 1.41 |
| altair processAttestation - updateEpochParticipants - 1/2 committees join | 22.754 ms/op | 17.765 ms/op | 1.28 |
| altair processAttestation - updateEpochParticipants - 2/3 committees join | 24.761 ms/op | 20.327 ms/op | 1.22 |
| altair processAttestation - updateEpochParticipants - 4/5 committees join | 25.212 ms/op | 21.923 ms/op | 1.15 |
| altair processAttestation - updateEpochParticipants - 100% committees join | 27.140 ms/op | 25.384 ms/op | 1.07 |
| altair processAttestation - updateAllStatus | 21.426 ms/op | 18.833 ms/op | 1.14 |
| altair processBlock - 250000 vs - 7PWei normalcase | 44.921 ms/op | 33.166 ms/op | 1.35 |
| altair processBlock - 250000 vs - 7PWei worstcase | 125.47 ms/op | 117.24 ms/op | 1.07 |
| altair processEpoch - mainnet_e81889 | 903.52 ms/op | 797.34 ms/op | 1.13 |
| mainnet_e81889 - altair beforeProcessEpoch | 393.59 ms/op | 293.21 ms/op | 1.34 |
| mainnet_e81889 - altair processJustificationAndFinalization | 125.08 us/op | 51.250 us/op | 2.44 |
| mainnet_e81889 - altair processInactivityUpdates | 18.910 ms/op | 16.499 ms/op | 1.15 |
| mainnet_e81889 - altair processRewardsAndPenalties | 107.38 ms/op | 115.59 ms/op | 0.93 |
| mainnet_e81889 - altair processRegistryUpdates | 23.621 us/op | 7.9060 us/op | 2.99 |
| mainnet_e81889 - altair processSlashings | 6.6930 us/op | 1.3990 us/op | 4.78 |
| mainnet_e81889 - altair processEth1DataReset | 6.4500 us/op | 1.3600 us/op | 4.74 |
| mainnet_e81889 - altair processEffectiveBalanceUpdates | 6.7388 ms/op | 6.3713 ms/op | 1.06 |
| mainnet_e81889 - altair processSlashingsReset | 37.438 us/op | 9.0410 us/op | 4.14 |
| mainnet_e81889 - altair processRandaoMixesReset | 45.140 us/op | 13.519 us/op | 3.34 |
| mainnet_e81889 - altair processHistoricalRootsUpdate | 8.0860 us/op | 1.4620 us/op | 5.53 |
| mainnet_e81889 - altair processParticipationFlagUpdates | 78.028 ms/op | 64.670 ms/op | 1.21 |
| mainnet_e81889 - altair processSyncCommitteeUpdates | 5.2450 us/op | 1.1740 us/op | 4.47 |
| mainnet_e81889 - altair afterProcessEpoch | 244.27 ms/op | 218.24 ms/op | 1.12 |
| altair processInactivityUpdates - 250000 normalcase | 88.270 ms/op | 72.702 ms/op | 1.21 |
| altair processInactivityUpdates - 250000 worstcase | 76.217 ms/op | 74.462 ms/op | 1.02 |
| altair processParticipationFlagUpdates - 250000 anycase | 63.201 ms/op | 58.305 ms/op | 1.08 |
| altair processRewardsAndPenalties - 250000 normalcase | 102.26 ms/op | 86.615 ms/op | 1.18 |
| altair processRewardsAndPenalties - 250000 worstcase | 115.31 ms/op | 86.488 ms/op | 1.33 |
| altair processSyncCommitteeUpdates - 250000 | 422.76 ms/op | 358.84 ms/op | 1.18 |
| Tree 40 250000 create | 849.72 ms/op | 649.72 ms/op | 1.31 |
| Tree 40 250000 get(125000) | 345.81 ns/op | 342.42 ns/op | 1.01 |
| Tree 40 250000 set(125000) | 2.6565 us/op | 1.9974 us/op | 1.33 |
| Tree 40 250000 toArray() | 49.888 ms/op | 39.545 ms/op | 1.26 |
| Tree 40 250000 iterate all - toArray() + loop | 49.823 ms/op | 38.595 ms/op | 1.29 |
| Tree 40 250000 iterate all - get(i) | 136.11 ms/op | 122.25 ms/op | 1.11 |
| MutableVector 250000 create | 24.881 ms/op | 20.788 ms/op | 1.20 |
| MutableVector 250000 get(125000) | 15.663 ns/op | 11.007 ns/op | 1.42 |
| MutableVector 250000 set(125000) | 720.66 ns/op | 478.15 ns/op | 1.51 |
| MutableVector 250000 toArray() | 9.2942 ms/op | 8.5550 ms/op | 1.09 |
| MutableVector 250000 iterate all - toArray() + loop | 10.076 ms/op | 8.5623 ms/op | 1.18 |
| MutableVector 250000 iterate all - get(i) | 3.7493 ms/op | 3.3129 ms/op | 1.13 |
| Array 250000 create | 5.7562 ms/op | 5.4092 ms/op | 1.06 |
| Array 250000 clone - spread | 2.5053 ms/op | 2.2194 ms/op | 1.13 |
| Array 250000 get(125000) | 1.2180 ns/op | 1.0770 ns/op | 1.13 |
| Array 250000 set(125000) | 1.2400 ns/op | 1.0820 ns/op | 1.15 |
| Array 250000 iterate all - loop | 144.23 us/op | 169.03 us/op | 0.85 |
| effectiveBalanceIncrements clone Uint8Array 300000 | 599.99 us/op | 71.495 us/op | 8.39 |
| effectiveBalanceIncrements clone MutableVector 300000 | 595.00 ns/op | 626.00 ns/op | 0.95 |
| effectiveBalanceIncrements rw all Uint8Array 300000 | 190.67 us/op | 301.91 us/op | 0.63 |
| effectiveBalanceIncrements rw all MutableVector 300000 | 205.50 ms/op | 186.95 ms/op | 1.10 |
| aggregationBits - 2048 els - readonlyValues | 226.60 us/op | 187.12 us/op | 1.21 |
| aggregationBits - 2048 els - zipIndexesInBitList | 37.775 us/op | 33.282 us/op | 1.13 |
| regular array get 100000 times | 58.480 us/op | 67.438 us/op | 0.87 |
| wrappedArray get 100000 times | 59.436 us/op | 67.467 us/op | 0.88 |
| arrayWithProxy get 100000 times | 39.490 ms/op | 32.151 ms/op | 1.23 |
| ssz.Root.equals | 1.5970 us/op | 1.1360 us/op | 1.41 |
| ssz.Root.equals with valueOf() | 1.4970 us/op | 1.2800 us/op | 1.17 |
| byteArrayEquals with valueOf() | 1.4510 us/op | 1.2640 us/op | 1.15 |
| phase0 processBlock - 250000 vs - 7PWei normalcase | 10.469 ms/op | 7.7892 ms/op | 1.34 |
| phase0 processBlock - 250000 vs - 7PWei worstcase | 94.008 ms/op | 68.644 ms/op | 1.37 |
| phase0 afterProcessEpoch - 250000 vs - 7PWei | 231.63 ms/op | 206.47 ms/op | 1.12 |
| phase0 beforeProcessEpoch - 250000 vs - 7PWei | 708.96 ms/op | 504.03 ms/op | 1.41 |
| phase0 processEpoch - mainnet_e58758 | 940.29 ms/op | 727.13 ms/op | 1.29 |
| mainnet_e58758 - phase0 beforeProcessEpoch | 561.31 ms/op | 445.44 ms/op | 1.26 |
| mainnet_e58758 - phase0 processJustificationAndFinalization | 124.31 us/op | 47.426 us/op | 2.62 |
| mainnet_e58758 - phase0 processRewardsAndPenalties | 128.82 ms/op | 78.457 ms/op | 1.64 |
| mainnet_e58758 - phase0 processRegistryUpdates | 83.352 us/op | 34.721 us/op | 2.40 |
| mainnet_e58758 - phase0 processSlashings | 6.2400 us/op | 1.0840 us/op | 5.76 |
| mainnet_e58758 - phase0 processEth1DataReset | 5.3960 us/op | 1.0280 us/op | 5.25 |
| mainnet_e58758 - phase0 processEffectiveBalanceUpdates | 5.5734 ms/op | 5.1843 ms/op | 1.08 |
| mainnet_e58758 - phase0 processSlashingsReset | 32.562 us/op | 7.6240 us/op | 4.27 |
| mainnet_e58758 - phase0 processRandaoMixesReset | 41.770 us/op | 10.624 us/op | 3.93 |
| mainnet_e58758 - phase0 processHistoricalRootsUpdate | 8.0950 us/op | 1.2680 us/op | 6.38 |
| mainnet_e58758 - phase0 processParticipationRecordUpdates | 31.798 us/op | 8.2900 us/op | 3.84 |
| mainnet_e58758 - phase0 afterProcessEpoch | 193.50 ms/op | 163.94 ms/op | 1.18 |
| phase0 processEffectiveBalanceUpdates - 250000 normalcase | 6.5695 ms/op | 5.3186 ms/op | 1.24 |
| phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 | 6.9324 ms/op | 6.4448 ms/op | 1.08 |
| phase0 processRegistryUpdates - 250000 normalcase | 88.676 us/op | 35.562 us/op | 2.49 |
| phase0 processRegistryUpdates - 250000 badcase_full_deposits | 3.9603 ms/op | 2.5896 ms/op | 1.53 |
| phase0 processRegistryUpdates - 250000 worstcase 0.5 | 2.2870 s/op | 1.5733 s/op | 1.45 |
| phase0 getAttestationDeltas - 250000 normalcase | 13.902 ms/op | 11.624 ms/op | 1.20 |
| phase0 getAttestationDeltas - 250000 worstcase | 14.308 ms/op | 11.513 ms/op | 1.24 |
| phase0 processSlashings - 250000 worstcase | 38.413 ms/op | 30.467 ms/op | 1.26 |
| shuffle list - 16384 els | 13.944 ms/op | 11.527 ms/op | 1.21 |
| shuffle list - 250000 els | 194.57 ms/op | 165.46 ms/op | 1.18 |
| getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei | 663.80 us/op | 468.10 us/op | 1.42 |
| pass gossip attestations to forkchoice per slot | 19.582 ms/op | 17.996 ms/op | 1.09 |
| computeDeltas | 3.6535 ms/op | 3.1803 ms/op | 1.15 |
| computeProposerBoostScoreFromBalances | 470.25 us/op | 503.04 us/op | 0.93 |
| getPubkeys - index2pubkey - req 1000 vs - 250000 vc | 1.9638 ms/op | 2.0167 ms/op | 0.97 |
| getPubkeys - validatorsArr - req 1000 vs - 250000 vc | 775.81 us/op | 787.93 us/op | 0.98 |
| BLS verify - blst-native | 2.1253 ms/op | 1.8603 ms/op | 1.14 |
| BLS verifyMultipleSignatures 3 - blst-native | 4.4087 ms/op | 3.8406 ms/op | 1.15 |
| BLS verifyMultipleSignatures 8 - blst-native | 9.4861 ms/op | 8.2065 ms/op | 1.16 |
| BLS verifyMultipleSignatures 32 - blst-native | 35.251 ms/op | 29.866 ms/op | 1.18 |
| BLS aggregatePubkeys 32 - blst-native | 46.369 us/op | 36.788 us/op | 1.26 |
| BLS aggregatePubkeys 128 - blst-native | 179.12 us/op | 136.16 us/op | 1.32 |
| getAttestationsForBlock | 68.182 ms/op | 58.933 ms/op | 1.16 |
| CheckpointStateCache - add get delete | 21.848 us/op | 17.270 us/op | 1.27 |
| validate gossip signedAggregateAndProof - struct | 5.4879 ms/op | 3.9930 ms/op | 1.37 |
| validate gossip signedAggregateAndProof - treeBacked | 5.3273 ms/op | 3.8878 ms/op | 1.37 |
| validate gossip attestation - struct | 2.6071 ms/op | 1.8415 ms/op | 1.42 |
| validate gossip attestation - treeBacked | 2.5733 ms/op | 1.8544 ms/op | 1.39 |
| pickEth1Vote - no votes | 10.080 ms/op | 9.2665 ms/op | 1.09 |
| pickEth1Vote - max votes | 59.628 ms/op | 50.101 ms/op | 1.19 |
| pickEth1Vote - Eth1Data hashTreeRoot value x2048 | 29.267 ms/op | 22.327 ms/op | 1.31 |
| pickEth1Vote - Eth1Data hashTreeRoot tree x2048 | 10.675 ms/op | 8.7705 ms/op | 1.22 |
| pickEth1Vote - Eth1Data fastSerialize value x2048 | 5.1969 ms/op | 4.7022 ms/op | 1.11 |
| pickEth1Vote - Eth1Data fastSerialize tree x2048 | 28.872 ms/op | 23.802 ms/op | 1.21 |
| bytes32 toHexString | 1.8460 us/op | 1.7330 us/op | 1.07 |
| bytes32 Buffer.toString(hex) | 920.00 ns/op | 687.00 ns/op | 1.34 |
| bytes32 Buffer.toString(hex) from Uint8Array | 1.2320 us/op | 860.00 ns/op | 1.43 |
| bytes32 Buffer.toString(hex) + 0x | 880.00 ns/op | 618.00 ns/op | 1.42 |
| Object access 1 prop | 0.43700 ns/op | 0.30700 ns/op | 1.42 |
| Map access 1 prop | 0.38700 ns/op | 0.26300 ns/op | 1.47 |
| Object get x1000 | 17.371 ns/op | 15.452 ns/op | 1.12 |
| Map get x1000 | 1.0110 ns/op | 0.87300 ns/op | 1.16 |
| Object set x1000 | 110.67 ns/op | 109.67 ns/op | 1.01 |
| Map set x1000 | 76.082 ns/op | 67.575 ns/op | 1.13 |
| Return object 10000 times | 0.44730 ns/op | 0.37380 ns/op | 1.20 |
| Throw Error 10000 times | 6.6807 us/op | 5.7474 us/op | 1.16 |
| enrSubnets - fastDeserialize 64 bits | 1.4240 us/op | 1.2560 us/op | 1.13 |
| enrSubnets - ssz BitVector 64 bits | 18.291 us/op | 16.425 us/op | 1.11 |
| enrSubnets - fastDeserialize 4 bits | 560.00 ns/op | 471.00 ns/op | 1.19 |
| enrSubnets - ssz BitVector 4 bits | 3.3710 us/op | 2.9040 us/op | 1.16 |
| RateTracker 1000000 limit, 1 obj count per request | 192.58 ns/op | 182.17 ns/op | 1.06 |
| RateTracker 1000000 limit, 2 obj count per request | 146.11 ns/op | 135.93 ns/op | 1.07 |
| RateTracker 1000000 limit, 4 obj count per request | 116.66 ns/op | 113.00 ns/op | 1.03 |
| RateTracker 1000000 limit, 8 obj count per request | 102.43 ns/op | 101.63 ns/op | 1.01 |
| RateTracker with prune | 4.3000 us/op | 3.9120 us/op | 1.10 |
| array of 16000 items push then shift | 5.0245 us/op | 3.0867 us/op | 1.63 |
| LinkedList of 16000 items push then shift | 17.988 ns/op | 16.716 ns/op | 1.08 |
| array of 16000 items push then pop | 236.03 ns/op | 198.74 ns/op | 1.19 |
| LinkedList of 16000 items push then pop | 16.850 ns/op | 15.365 ns/op | 1.10 |
| array of 24000 items push then shift | 7.7365 us/op | 4.5596 us/op | 1.70 |
| LinkedList of 24000 items push then shift | 18.710 ns/op | 19.544 ns/op | 0.96 |
| array of 24000 items push then pop | 223.31 ns/op | 176.63 ns/op | 1.26 |
| LinkedList of 24000 items push then pop | 18.574 ns/op | 15.453 ns/op | 1.20 |

by benchmarkbot/action

@twoeths twoeths marked this pull request as ready for review February 26, 2022 08:22
@dapplion
Contributor

I would rather add another flag, passed when sending jobs to the BLS verifier, that lets the caller opt out of multi-threading.

Opting out and using more resources on the main thread should be done very carefully, and with good backing data showing that it's a good trade-off.

Contributor

@dapplion dapplion left a comment

See above

@twoeths twoeths marked this pull request as draft February 27, 2022 08:18
@twoeths
Contributor Author

twoeths commented Feb 28, 2022

Test results on contabo-18 (250 validators) over ~16h (the node previously ran the tuyen/gossip_penalize_peer branch)

  • BLS job wait time is improved

Screen Shot 2022-02-28 at 08 20 21

  • Thanks to that, gossip validation time is improved

Screen Shot 2022-02-28 at 08 22 19

  • Gossip Block Processed delay has been great since we deployed this branch, even though we verify gossip blocks' signatures on the main thread

Screen Shot 2022-02-28 at 08 24 15

Looks like it's a good trade-off. Also, @dapplion, I noticed the "gossip validation gap" issue in the previous branch (tuyen/gossip_penalize_peer), which hasn't happened in this branch so far. At least we know that the issue is not specific to the ssz-v2 branch.

@twoeths twoeths marked this pull request as ready for review February 28, 2022 01:31
@dapplion
Contributor

dapplion commented Feb 28, 2022

Some feedback items

  1. Remove the BlsMixed class; just add an if inside the BlsMultiThread class:

  async verifySignatureSets(sets: ISignatureSet[], opts: VerifySignatureOpts = {}): Promise<boolean> {
    if (opts.useMainThread) {
      return verifySignatureSetsMaybeBatch(
        sets.map((set) => ({
          publicKey: getAggregatedPubkey(set),
          message: set.signingRoot.valueOf() as Uint8Array,
          signature: set.signature,
        }))
      );
    }
    // ...otherwise continue with the existing worker-pool path
  }

  2. Verifying blocks on the main thread during sync will be a performance issue. Add a flag in the block processor, {blsVerifyMainThread: true}, and set it to true only in the gossip handler and unknown block sync.

  3. To clean up the flags, I think we should have:

  • --blsVerifyAllMainThread, default false. The BeaconChain constructor then uses the BlsSingleThread class
  • --blsVerifyAllMultiThread, default false. Ignores the useMainThread flag in the BlsMultiThreadWorkerPool

  4. Add a metric that counts total CPU time spent verifying signatures on the main thread. Just an incremental gauge of seconds using process.hrtime.bigint(). There will now be two places potentially verifying signatures:

  • verifySignatureSetsMaybeBatch inside BlsMultiThreadWorkerPool.verifySignatureSets: histogram lodestar_bls_thread_pool_main_thread_time_seconds with tight bands [0.1, 1]
  • verifySignatureSetsMaybeBatch inside BlsSingleThreadVerifier.verifySignatureSets: histogram lodestar_bls_single_thread_time_seconds with tight bands [0.1, 1]
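
One way to read how the two node-level flags and the per-call option proposed above interact is sketched below. This is only an interpretation under stated assumptions: `resolveThread`, `NodeFlags`, and `CallOpts` are hypothetical names, and checking blsVerifyAllMainThread before blsVerifyAllMultiThread is assumed from the order in which the flags are listed.

```typescript
// Hypothetical option shapes; the field names follow the flags proposed in this review.
type NodeFlags = {
  blsVerifyAllMainThread?: boolean;
  blsVerifyAllMultiThread?: boolean;
};
type CallOpts = {useMainThread?: boolean};

// Decides where a verification job runs.
// Assumed precedence: all-main-thread, then all-multi-thread, then the per-call option.
function resolveThread(flags: NodeFlags, opts: CallOpts = {}): "main" | "worker" {
  if (flags.blsVerifyAllMainThread) return "main";
  if (flags.blsVerifyAllMultiThread) return "worker"; // ignores opts.useMainThread
  return opts.useMainThread ? "main" : "worker";
}

// With default flags, only callers that opt in run on the main thread.
const gossipBlock = resolveThread({}, {useMainThread: true});
const rangeSyncBlock = resolveThread({}, {useMainThread: false});
const forcedWorker = resolveThread({blsVerifyAllMultiThread: true}, {useMainThread: true});
```

Under these assumptions the per-call option only matters when neither node-level flag is set, which keeps the default behaviour on worker threads.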

/**
* Verify signatures on main thread or not.
*/
blsVerifyMainThread?: boolean;
Contributor

Maybe better name blsVerifyOnMainThread?

* Use main thread to verify signatures, use this with care.
* Ignore the batchable option if this is true.
*/
useMainThread?: boolean;
Contributor

Should we name this verifyOnMainThread for consistency with blsVerifyOnMainThread?

);

const endNs = process.hrtime.bigint();
this.metrics?.blsTime.mainThreadDurationInThreadPool.observe(Number(endNs - startNs) / 1e9);
Contributor

You can use the timer function, which reads the process time for you. Also wrap the call in try {} finally {} in case verifySignatureSetsMaybeBatch throws.

const timer = metrics.startTimer()
try {
  return fn()
} finally {
  timer()
}
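
The suggestion above can be sketched in a self-contained form as follows. This is an illustration only: `Observer`, `Clock`, `startTimer`, and `timed` are hypothetical helpers rather than the metrics API used in the codebase, and the clock is injected so the sketch runs without Node.js typings (in Node.js one could pass `() => Number(process.hrtime.bigint())`).

```typescript
// Hypothetical histogram-like sink; real metrics libraries expose richer APIs.
type Observer = {observe(seconds: number): void};
// Returns the current time in nanoseconds.
type Clock = () => number;

// Starts a timer and returns a function that records the elapsed seconds when called.
function startTimer(observer: Observer, now: Clock): () => void {
  const startNs = now();
  return () => observer.observe((now() - startNs) / 1e9);
}

// Runs a synchronous fn and records its duration even if fn throws.
function timed<T>(observer: Observer, now: Clock, fn: () => T): T {
  const stop = startTimer(observer, now);
  try {
    return fn();
  } finally {
    stop();
  }
}

// Usage with a fake clock so the observed value is deterministic.
let fakeNs = 0;
const observed: number[] = [];
const sink: Observer = {
  observe: (seconds) => {
    observed.push(seconds);
  },
};
const result = timed(sink, () => fakeNs, () => {
  fakeNs += 2.5e8; // pretend the work took 0.25 s
  return true;
});
```

Note that this helper times synchronous work only; timing a promise-returning function this way would require awaiting the result inside the try block.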

const bls = opts.blsVerifyAllMainThread
? new BlsSingleThreadVerifier({metrics})
: opts.blsVerifyAllMultiThread
? new BlsMultiThreadWorkerPool({blsVerifyAllMultiThread: true}, blsModules)
Contributor

Just do

new BlsMultiThreadWorkerPool(opts, blsModules)

import {IBlsVerifier} from "./interface";
import {verifySignatureSetsMaybeBatch} from "./maybeBatch";
import {getAggregatedPubkey} from "./utils";

export class BlsSingleThreadVerifier implements IBlsVerifier {
private readonly metrics: IMetrics | null;

constructor({metrics = null}: {metrics: IMetrics | null}) {
Contributor

Equivalent but constructor arguments like this look much cleaner

constructor(private readonly metrics: IMetrics | null) {}
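
For readers less familiar with TypeScript parameter properties, the two constructor styles being compared can be sketched side by side. The `Metrics` type and the class names `Verbose` and `Compact` are hypothetical stand-ins, not the classes in this PR.

```typescript
// Hypothetical stand-in for IMetrics.
type Metrics = {name: string};

// Style in the diff: explicit field plus a destructured options object.
class Verbose {
  private readonly metrics: Metrics | null;

  constructor({metrics = null}: {metrics: Metrics | null}) {
    this.metrics = metrics;
  }

  hasMetrics(): boolean {
    return this.metrics !== null;
  }
}

// Suggested style: a parameter property declares and assigns the field in one place.
class Compact {
  constructor(private readonly metrics: Metrics | null) {}

  hasMetrics(): boolean {
    return this.metrics !== null;
  }
}

const verbose = new Verbose({metrics: null});
const compact = new Compact({name: "lodestar"});
```

The resulting field is the same in both classes, but the call sites differ: the first takes an options object, the second a positional argument.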

sets.map((set) => ({
publicKey: getAggregatedPubkey(set),
message: set.signingRoot.valueOf() as Uint8Array,
signature: set.signature,
}))
);

const endNs = process.hrtime.bigint();
this.metrics?.blsTime.singleThreadDuration.observe(Number(endNs - startNs) / 1e9);
Contributor

Use metrics.startTimer() and try {} finally {}

help: "Time to verify signatures with single thread mode",
buckets: [0.1, 1],
}),
mainThreadDurationInThreadPool: register.histogram({
Contributor

There's already a group blsThreadPool, add mainThreadDurationInThreadPool to there

@@ -367,6 +367,20 @@ export function createLodestarMetrics(
}),
},

// BLS time
blsTime: {
Contributor

Put right after blsThreadPool, with group blsSingleThread or no group at all

@@ -111,7 +111,7 @@ export function getGossipHandlers(modules: ValidatorFnsModules, options: GossipH

// `validProposerSignature = true`, in gossip validation the proposer signature is checked
chain
- .processBlock(signedBlock, {validProposerSignature: true})
+ .processBlock(signedBlock, {validProposerSignature: true, blsVerifyMainThread: true})
Contributor

Add explanation above the blsVerifyMainThread option linking to the issue and investigation that motivated this change

@@ -164,7 +164,9 @@ export class UnknownBlockSync {
}

pendingBlock.status = PendingBlockStatus.processing;
- const res = await wrapError(this.chain.processBlock(pendingBlock.signedBlock, {ignoreIfKnown: true}));
+ const res = await wrapError(
+ this.chain.processBlock(pendingBlock.signedBlock, {ignoreIfKnown: true, blsVerifyMainThread: true})
Contributor

Add explanation above the blsVerifyMainThread option linking to the issue and investigation that motivated this change. Copy paste from packages/lodestar/src/network/gossip/handlers/index.ts.

Then please add a note why unknown block sync should use true and range sync false

@@ -194,6 +194,8 @@ export class RangeSync extends (EventEmitter as {new (): RangeSyncEmitter}) {
ignoreIfFinalized: true,
// We won't attest to this block so it's okay to ignore a SYNCING message from execution layer
fromRangeSync: true,
+ // When syncing, we should verify signatures in worker threads
+ blsVerifyMainThread: false,
Contributor

Add a note explaining why unknown block sync should use true and range sync false

@twoeths twoeths changed the title from "Mixed BLS" to "Verify signatures on both main thread and worker threads" Feb 28, 2022
@twoeths
Contributor Author

twoeths commented Mar 1, 2022

with #3812, gossip p7 penalty (broken IWANT promises issue) is avoided most of the time so this PR is less important

Also, on hetzner-test0 (with >900 validators), BLS job wait time is consistently low
Screen Shot 2022-03-01 at 17 34 21

@dapplion
Contributor

dapplion commented Mar 1, 2022

> with #3812, gossip p7 penalty (broken IWANT promises issue) is avoided most of the time so this PR is less important

I want to revert #3812 (see #3815), so I think we should merge this PR.

@twoeths twoeths merged commit bbe818d into master Mar 2, 2022
@twoeths twoeths deleted the tuyen/mixed-bls branch March 2, 2022 06:33

Successfully merging this pull request may close these issues.

Improve BLS thread pool - job wait time
2 participants