Serialize calls to the engine api #4651
Conversation
Performance Report: ✔️ no performance regression detected. Full benchmark results
```ts
jobWaitTime: register.histogram({
  name: "lodestar_engine_http_processor_queue_job_wait_time_seconds",
  help: "Time from job added to the engine http processor queue to starting in seconds",
  buckets: [0.1, 1, 10, 100],
```
Bucket values must be realistic for the expected values, else the metric is not useful at all. If this sort of x10-spaced bucket list is still in Lodestar, it's because we added it a long time ago or haven't figured out the right values yet. The wait time metric is crucial, so the buckets must be good. See below for an example of bucket values tailored to a specific metric given its expected values:
lodestar/packages/beacon-node/src/metrics/metrics/lodestar.ts (lines 369 to 370 in 1cba836):

```ts
// Epoch transitions are 100ms on very fast clients, and average 800ms on heavy networks
buckets: [0.01, 0.05, 0.1, 0.2, 0.5, 0.75, 1, 1.25, 1.5, 3, 10],
```
hmmm right, most of the time will be taken up by newPayload, and it can vary from 100s of ms to 2-4 seconds (if the EL is really slow), averaging maybe 300-400ms, so should the buckets be [0.1, 0.5, 2, 5]?
Then do, for example:
[0.05, 0.1, 0.2, 0.3, 0.5, 0.75, 1, 2, 5, 10, 25]
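For concreteness, a sketch of the registration with those buckets (the metric name and help text are taken from the diff above; the bucket list is the suggestion here, not necessarily the final code):

```ts
jobWaitTime: register.histogram({
  name: "lodestar_engine_http_processor_queue_job_wait_time_seconds",
  help: "Time from job added to the engine http processor queue to starting in seconds",
  // newPayload dominates the wait: ~300-400ms on average, 100s of ms on fast
  // ELs, up to 2-4s on slow ones, so the buckets concentrate below 1s
  buckets: [0.05, 0.1, 0.2, 0.3, 0.5, 0.75, 1, 2, 5, 10, 25],
}),
```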
```ts
  method,
  params: [payloadId],
  methodOpts: getPayloadOpts,
}) as Promise<EngineApiRpcReturnTypes[typeof method]>);
```
getPayload doesn't need to be sequential, right? I don't feel it needs to be queued.
right, we can keep it out of the queue, but ideally there shouldn't be anything else in the queue at that moment anyway. Still, we can keep it out.
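A minimal self-contained sketch of the resulting call pattern. SerialQueue and every name below are hypothetical stand-ins (Lodestar's actual JobItemQueue API differs); the point is only which calls go through the queue and which bypass it:

```ts
// Hypothetical stand-in for a serializing job queue: runs jobs one at a time, in order
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  push<T>(job: () => Promise<T>): Promise<T> {
    // Chain the job after the previous one, whether it succeeded or failed
    const next = this.tail.then(job, job);
    // Swallow errors on the internal chain so one failure doesn't block later jobs
    this.tail = next.catch(() => undefined);
    return next;
  }
}

type JsonRpc = {fetch(method: string, params: unknown[]): Promise<unknown>};

class ExecutionEngineSketch {
  private readonly rpcFetchQueue = new SerialQueue();

  constructor(private readonly rpc: JsonRpc) {}

  // newPayload and forkchoiceUpdated must reach the EL in block-import order,
  // so they are pushed through the serializing queue
  notifyNewPayload(payload: unknown): Promise<unknown> {
    return this.rpcFetchQueue.push(() => this.rpc.fetch("engine_newPayloadV1", [payload]));
  }

  notifyForkchoiceUpdate(headBlockHash: string): Promise<unknown> {
    return this.rpcFetchQueue.push(() =>
      this.rpc.fetch("engine_forkchoiceUpdatedV1", [{headBlockHash}])
    );
  }

  // getPayload has no ordering dependency on block import and is
  // latency-sensitive at proposal time, so it bypasses the queue
  getPayload(payloadId: string): Promise<unknown> {
    return this.rpc.fetch("engine_getPayloadV1", [payloadId]);
  }
}
```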
```diff
@@ -75,6 +78,14 @@ const exchageTransitionConfigOpts: ReqOpts = {routeId: "exchangeTransitionConfig
 export class ExecutionEngineHttp implements IExecutionEngine {
   readonly payloadIdCache = new PayloadIdCache();
   private readonly rpc: IJsonRpcHttpClient;
+  private readonly jobQueue: JobItemQueue<[EngineRequest], EngineResponse>;
```
Since it's not generic to multiple jobs I would call it rpcFetchQueue or something like that. Also, add a comment explaining why this is necessary and which methods need it and which do not.
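A sketch of what that could look like, reusing the field type from the diff above (the name and comment wording are suggestions, not the final code):

```ts
/**
 * Serializes ordering-sensitive engine API calls so they reach the EL in
 * block-import order. newPayload and forkchoiceUpdated go through this queue;
 * getPayload has no ordering dependency and is sent directly.
 */
private readonly rpcFetchQueue: JobItemQueue<[EngineRequest], EngineResponse>;
```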
```diff
@@ -57,6 +58,8 @@ export const defaultExecutionEngineHttpOpts: ExecutionEngineHttpOpts = {
   timeout: 12000,
 };

+const QUEUE_MAX_LENGHT = 256;
```
Write a JSDoc comment explaining the choice of this number. What's the tradeoff here? A job queue holds items in memory, which in this case are fully de-serialized blocks. Are there any memory concerns if the queue fills? What's a reasonable length given the usage of this queue? Since blocks are imported strictly sequentially, the queue length should always be pretty short.
Will check what limit the blocks processor queue has; ideally the queue could hold the fcUs corresponding to the last batch of blocks plus a new payload.
```ts
/**
 * Size for the serializing queue for fcUs and new payloads, the max length could be equal to
 * EPOCHS_PER_BATCH * 2 in case new payloads are also not awaited serially
 */
const QUEUE_MAX_LENGHT = EPOCHS_PER_BATCH * SLOTS_PER_EPOCH * 2;
```
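For a sense of scale (illustrative only; assuming EPOCHS_PER_BATCH = 32 and SLOTS_PER_EPOCH = 32, which should be checked against the actual constants), this works out to a maximum of 32 × 32 × 2 = 2048 queued jobs.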
lenght -> length
🤦
LGTM, appreciate the metrics.
Serialize calls to the engine api
Follow-up of #4595
Metrics for engine job processor: