-
-
Notifications
You must be signed in to change notification settings - Fork 318
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Verify signatures on both main thread and worker threads #3793
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3793 +/- ##
==========================================
- Coverage 36.83% 35.99% -0.84%
==========================================
Files 322 324 +2
Lines 8827 9139 +312
Branches 1372 1481 +109
==========================================
+ Hits 3251 3290 +39
- Misses 5434 5706 +272
- Partials 142 143 +1 |
Performance Report✔️ no performance regression detected Full benchmark results
|
I would rather add another flag when sending jobs to the BLS verifier where you opt-out of multi-thread
Opting-out and using more resources on the main thread should be done very carefully and which good backing data that's it's a good trade-off |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
Test result in contabo-18 (250 validators) for ~16h (previously it's
Looks like it's a good trade-off. Also @dapplion I notice the "gossip validation gap" issue in previous branch ( |
Some feedback items
async verifySignatureSets(sets: ISignatureSet[], opts: VerifySignatureOpts = {}): Promise<boolean> {
if (opts.useMainThread) {
return verifySignatureSetsMaybeBatch(
sets.map((set) => ({
publicKey: getAggregatedPubkey(set),
message: set.signingRoot.valueOf() as Uint8Array,
signature: set.signature,
}))
);
}
|
/** | ||
* Verify signatures on main thread or not. | ||
*/ | ||
blsVerifyMainThread?: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe better name blsVerifyOnMainThread
?
* Use main thread to verify signatures, use this with care. | ||
* Ignore the batchable option if this is true. | ||
*/ | ||
useMainThread?: boolean; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we name verifyOnMainThread
for consistency with blsVerifyOnMainThread
?
); | ||
|
||
const endNs = process.hrtime.bigint(); | ||
this.metrics?.blsTime.mainThreadDurationInThreadPool.observe(Number(endNs - startNs) / 1e9); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the timer function which does calls the process time for you. Also wrap in try {} finally {} in case verifySignatureSetsMaybeBatch rejects
const timer = metrics.startTimer()
try {
return fn()
} finally {
timer()
}
packages/lodestar/src/chain/chain.ts
Outdated
const bls = opts.blsVerifyAllMainThread | ||
? new BlsSingleThreadVerifier({metrics}) | ||
: opts.blsVerifyAllMultiThread | ||
? new BlsMultiThreadWorkerPool({blsVerifyAllMultiThread: true}, blsModules) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just do
new BlsMultiThreadWorkerPool(opts, blsModules)
import {IBlsVerifier} from "./interface"; | ||
import {verifySignatureSetsMaybeBatch} from "./maybeBatch"; | ||
import {getAggregatedPubkey} from "./utils"; | ||
|
||
export class BlsSingleThreadVerifier implements IBlsVerifier { | ||
private readonly metrics: IMetrics | null; | ||
|
||
constructor({metrics = null}: {metrics: IMetrics | null}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Equivalent but constructor arguments like this look much cleaner
constructor(private readonly metrics: IMetrics | null) {}
sets.map((set) => ({ | ||
publicKey: getAggregatedPubkey(set), | ||
message: set.signingRoot.valueOf() as Uint8Array, | ||
signature: set.signature, | ||
})) | ||
); | ||
|
||
const endNs = process.hrtime.bigint(); | ||
this.metrics?.blsTime.singleThreadDuration.observe(Number(endNs - startNs) / 1e9); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use metrics.startTimer() and try {} finally {}
help: "Time to verify signatures with single thread mode", | ||
buckets: [0.1, 1], | ||
}), | ||
mainThreadDurationInThreadPool: register.histogram({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's already a group blsThreadPool
, add mainThreadDurationInThreadPool
to there
@@ -367,6 +367,20 @@ export function createLodestarMetrics( | |||
}), | |||
}, | |||
|
|||
// BLS time | |||
blsTime: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put right after blsThreadPool, with group blsSingleThread
or no group at all
@@ -111,7 +111,7 @@ export function getGossipHandlers(modules: ValidatorFnsModules, options: GossipH | |||
|
|||
// `validProposerSignature = true`, in gossip validation the proposer signature is checked | |||
chain | |||
.processBlock(signedBlock, {validProposerSignature: true}) | |||
.processBlock(signedBlock, {validProposerSignature: true, blsVerifyMainThread: true}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add explanation above the blsVerifyMainThread
option linking to the issue and investigation that motivated this change
@@ -164,7 +164,9 @@ export class UnknownBlockSync { | |||
} | |||
|
|||
pendingBlock.status = PendingBlockStatus.processing; | |||
const res = await wrapError(this.chain.processBlock(pendingBlock.signedBlock, {ignoreIfKnown: true})); | |||
const res = await wrapError( | |||
this.chain.processBlock(pendingBlock.signedBlock, {ignoreIfKnown: true, blsVerifyMainThread: true}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add explanation above the blsVerifyMainThread option linking to the issue and investigation that motivated this change. Copy paste from packages/lodestar/src/network/gossip/handlers/index.ts
.
Then please add a note why unknown block sync should use true and range sync false
@@ -194,6 +194,8 @@ export class RangeSync extends (EventEmitter as {new (): RangeSyncEmitter}) { | |||
ignoreIfFinalized: true, | |||
// We won't attest to this block so it's okay to ignore a SYNCING message from execution layer | |||
fromRangeSync: true, | |||
// When syncing, we should verify signatures in worker threads | |||
blsVerifyMainThread: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a note why unknown block sync should use true and range sync false
with #3812, gossip p7 penalty (broken IWANT promises issue) is avoided most of the time so this PR is less important also in hetzner-test0 (with >900 validators), BLS Job wait time is consistently low |
Motivation
We want a stable BLS - Job Wait Time metric, ideally it should be < 100ms consistently
Description
BlsMixedVerifier
to send batchable bls jobs to worker threads and non-batchable ones to main threaduseMultiThreadVerifier
, default to falseCloses #3792