Engine API: add `getPayloadBodiesByRangeV1` to #146 #218

arnetheduck · 2022-05-06T06:19:39Z

This PR extends #146 to also specify a by-range request.

Similar to the consensus p2p spec, a by-range request allows execution
clients to store finalized blocks in linear by-number storage instead of
relying on by-hash indices, significantly increasing efficiency in
fetching them from cold storage.

Clients whose database design does not permit efficient by-number
lookups may opt not to implement this call, but must then return a
well-known error code, allowing consensus layer clients to fall back to a
less efficient method of fetching the blocks.

This specification assumes that execution clients know nothing of slot
numbers as seen on the consensus layer. Should execution clients later
learn about these, the specification may be amended to work with slot
numbers instead.

Until then, consensus clients must be careful to compute block numbers
correctly.

Consensus clients must also be careful when this request is used to
fetch non-finalized blocks, reverting to by-root requests if an
unexpected chain is returned.
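
The fallback described above can be sketched from the consensus client's side. This is a minimal illustration, not part of the proposed spec text: `RpcError`, `fetch_bodies`, and the two callables standing in for the by-range and by-hash engine calls are hypothetical names, and the only value taken from the discussion is the JSON-RPC "Method not found" code `-32601`.

```python
from typing import Callable, List, Optional, Sequence

METHOD_NOT_FOUND = -32601  # JSON-RPC 2.0 "Method not found"


class RpcError(Exception):
    """Hypothetical wrapper around a JSON-RPC error object."""

    def __init__(self, code: int, message: str) -> None:
        super().__init__(message)
        self.code = code


Body = Optional[dict]  # stand-in for ExecutionPayloadBodyV1, or None if missing


def fetch_bodies(
    start: int,
    hashes: Sequence[str],
    by_range: Callable[[int, int], List[Body]],
    by_hash: Callable[[Sequence[str]], List[Body]],
) -> List[Body]:
    """Try the by-range call first; fall back to by-hash if it is unsupported."""
    try:
        return by_range(start, len(hashes))
    except RpcError as err:
        if err.code != METHOD_NOT_FOUND:
            raise  # any other error is not a signal to fall back
        return by_hash(hashes)
```

A caller that also wants to guard against an unexpected chain would additionally compare the returned bodies against the block hashes it expects before accepting the by-range result.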


mkalinin

I left a few comments.

One recent addition worth considering is timeouts. I think we should modify the Timeouts section to say that if the timeout parameter isn't specified, the CL is at liberty to decide how long it should wait for a response. But in this particular case I think that timeouts matter, as these requests may be time-consuming because they require disk access.

Also, we may consider a limit on the number of block bodies in the response. Say something like 1024 as a hard cap, to prevent a buggy CL from making the EL read and send back a million blocks.

src/engine/specification.md

mkalinin · 2022-05-07T13:17:09Z

src/engine/specification.md

+* result: `Array of ExecutionPayloadBodyV1` - Array of [`ExecutionPayloadBodyV1`](#ExecutionPayloadBodyV1) objects.
+* error: code and message set in case an exception happens while processing the method call.
+  * Clients that don't support fetching bodies by range **MUST** return the error code `-32601 	Method not found 	The method does not exist / is not available.`. Callers may then revert to `engine_getPayloadBodiesByRootV1`.


Consider moving this statement to the Specification section of this method

arnetheduck · 2022-05-10T14:44:49Z

Thanks!

> But in this particular case I think that timeouts matter as these requests may be time consuming as they require disk access.

In the consensus layer, we have to start responding within 5s, and the entire response must take no longer than 10s in total, which effectively caps request sizes: if we can't serve a response within this time, it's likely useless. I haven't followed the rationale for adding timeouts to this spec in particular, but if we're going to add them, these are the values to take into consideration.

> Say something like 1024 as a hard cap to prevent buggy CL making EL read and send back a million of blocks.

On the CL side, we "soft limit" requests to 1024 slots, meaning that anyone who requests more will not get a "full" answer.

Since the EL/CL link is more of a trusted connection, we could also opt for a hard limit where the server returns an error if more is requested. I agree it's useful to add this as a coordination point so that clients don't make outrageous requests.

I'd probably go with a soft limit - I'll put that and 1024 in the next edit unless someone comes along with an opinion :)
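
To make the soft-limit idea concrete, here is a minimal sketch of how a server might clamp a by-range request instead of rejecting it. The function name `clamp_range`, the `head` parameter, and the constant name are assumptions for illustration; only the value 1024 comes from this discussion, and it was still a proposal at this point.

```python
MAX_BODIES = 1024  # value proposed in this thread, not a finalized constant


def clamp_range(start: int, count: int, head: int, limit: int = MAX_BODIES) -> range:
    """Return the block numbers a soft-limited server would actually serve.

    Requests for more than `limit` bodies, or reaching past the current
    head, are truncated rather than answered with an error.
    """
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    end = min(start + count, start + limit, head + 1)
    return range(start, max(start, end))
```

With a hard limit, the `min` against `start + limit` would instead become a check that returns an error when `count > limit`.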

djrtwo

Looks good! Agree with @mkalinin's comments

Also, summoning @paulhauner to ensure that this fits with the deduplication engineering they've been working on in lighthouse

djrtwo · 2022-05-13T12:47:58Z

src/engine/specification.md

+#### Specification
+
+1. Given a `start` and a `count`, the client software **MUST** respond with array of `ExecutionPayloadBodyV1` objects with the corresponding execution block number respecting the order of blocks in the canonical chain, as selected by the latest `forkChoiceUpdated` call.


I think we need to add an error condition for when the EL does not have the requisite data (either a malformed CL request or some sort of failure/resync on the EL side).

Updated response to give nil when blocks are missing

MariusVanDerWijden · 2022-05-27T14:58:58Z

src/engine/specification.md

+1. Given array of block hashes client software **MUST** respond with array of `ExecutionPayloadBodyV1` objects with the corresponding hashes respecting the order of block hashes in the input array.
+
+1. Client software **MUST** skip the payload body in the response array if the data of this body is missing. For instance, if the request is `[A.block_hash, B.block_hash, C.block_hash]` and client software has data of payloads `A` and `C`, but doesn't have data of `B`, the response **MUST** be `[A.body, C.body]`.


I think that's a bit weird: if I request n blocks, I would expect n blocks back, even if some of them are nil.
So I would rather return `[A.body, nil, C.body]`.

This raises a question: should CL:s verify the hash of the data? Including nil here relieves CL:s of this requirement (which they otherwise have to do, to match response to request).

I'd be in favour of including the nils, as it's conceptually cleaner to have one response per request, and I think CL verification should be optional (although we should probably enable it by default, especially initially).

Not only am I in favor of including nil, but I think CL software should be mandated to check the hashes, or do some verification that the data was not corrupted on the wire.

> CL software should be mandated

The only viable way to mandate it is to make it technically infeasible not to check. If we mandate it, we should also reap the benefits of having the hash check there, and use a more compact response that can potentially be reordered. I see this mostly as a trust decision: if we include nil, we're saying that the CL fully trusts the EL, and at that point I don't think it's reasonable to mandate verification, because it makes serving block requests more expensive for little gain (i.e. no trust gain, and a tiny chance of detecting corruption, a check that whoever made the block request on the far end will have to repeat anyway).

nil has been added as a required response for missing blocks. Consumers may or may not verify the payload; this is up to their own quality policy. When they send the EL contents to someone on the internet, that someone will have to verify them regardless, so it's a bit of a waste to redo the check, assuming they trust their EL.
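
The difference between the two response shapes discussed in this thread can be sketched as follows. This is an illustration only: the dictionary `store` and both function names are hypothetical, and a plain `dict` stands in for `ExecutionPayloadBodyV1`, with Python's `None` standing in for nil.

```python
from typing import Dict, List, Optional, Sequence

Body = Dict[str, list]  # stand-in for ExecutionPayloadBodyV1


def bodies_with_placeholders(
    hashes: Sequence[str], store: Dict[str, Body]
) -> List[Optional[Body]]:
    """One entry per requested hash; None marks a missing body."""
    return [store.get(h) for h in hashes]


def bodies_skipping_missing(
    hashes: Sequence[str], store: Dict[str, Body]
) -> List[Body]:
    """Earlier draft behaviour: missing bodies are dropped from the response."""
    return [store[h] for h in hashes if h in store]


store = {"A": {"transactions": ["0x01"]}, "C": {"transactions": []}}
request = ["A", "B", "C"]

aligned = bodies_with_placeholders(request, store)
compact = bodies_skipping_missing(request, store)
```

With placeholders, `zip(request, aligned)` pairs every requested hash with its body by position. With the compact form, the caller has to hash each returned body to work out which request it answers.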

michaelsproul · 2022-09-16T07:05:27Z

src/engine/specification.md

+
+1. Client software **MUST** skip the payload body in the response array if the data of this body is missing. For instance, if the request is `[A.block_hash, B.block_hash, C.block_hash]` and client software has data of payloads `A` and `C`, but doesn't have data of `B`, the response **MUST** be `[A.body, C.body]`.
+
+1. Clients that support `engine_getPayloadBodiesByRangeV1` **MAY NOT** respond to requests for finalized blocks by hash.


This is because they may have pruned them?

Same as CL really - this allows us to drop hash-based indices and retrieve blocks from linear archival storage - the aim with this clause is to ensure that CL:s use the linear request for prefinality blocks and only "top up" with by-hash requests where forks are possible.
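
A consensus client following this guidance might split its work as in the sketch below: finalized blocks go through a single by-range request, and only the non-finalized tail is requested by hash. The function `plan_requests` and its input shape are assumptions made for illustration, and the input is assumed to be sorted by block number and contiguous.

```python
from typing import List, Optional, Sequence, Tuple


def plan_requests(
    wanted: Sequence[Tuple[int, str]], finalized_number: int
) -> Tuple[Optional[Tuple[int, int]], List[str]]:
    """Split wanted (block_number, block_hash) pairs at the finalized block.

    Returns ((start, count) or None, hashes_to_request_by_hash).
    Assumes `wanted` is sorted ascending by number with no gaps.
    """
    finalized = [number for number, _ in wanted if number <= finalized_number]
    by_hash = [block_hash for number, block_hash in wanted if number > finalized_number]
    range_request = (finalized[0], len(finalized)) if finalized else None
    return range_request, by_hash
```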

mkalinin · 2022-12-21T09:11:14Z

There is rough consensus to attempt to include these methods in Shanghai. If the implementation of the proposed changes causes any Shanghai delay, we can reconsider these methods for later inclusion.

@arnetheduck would you mind rebasing this PR onto the most recent changes from main? This implies adding these new methods to shanghai.md and appending withdrawals to the ExecutionPayloadBodyV1 structure. After a conversation with EL client devs, I feel that both methods are easy to implement, so I think both should be required, i.e. I'd remove optionality from the spec of ByRangeV1.

mkalinin · 2023-01-11T10:14:27Z

Rebased, refined and moved to #352

mkalinin and others added 6 commits December 10, 2021 15:47

Engine API: add getPayloadBodies method

b1a4ebd

Engine API: fix spellchecker

6be1088

Engine API: fix spellchecker. Take 2

f500c48

Merge branch 'main' into get-payload-bodies

b490341

Merge remote-tracking branch 'mkalinin/get-payload-bodies' into lots-…

f9ee0ff

…of-bodies

arnetheduck changed the title ~~Lots of bodies~~ Execution API: add getPayloadBodiesByRangeV1 to #146 May 6, 2022

arnetheduck changed the title ~~Execution API: add getPayloadBodiesByRangeV1 to #146~~ Engine API: add getPayloadBodiesByRangeV1 to #146 May 6, 2022

fight spellchecker

1091c2d

arnetheduck mentioned this pull request May 6, 2022

deprecate BeaconBlocksByRange.step ethereum/consensus-specs#2856

Merged

mkalinin reviewed May 7, 2022

michaelsproul mentioned this pull request May 10, 2022

[Merged by Bors] - Separate execution payloads in the DB sigp/lighthouse#3157

Closed

4 tasks

arnetheduck and others added 2 commits May 10, 2022 16:31

Update src/engine/specification.md

e189c5b

Co-authored-by: Mikhail Kalinin <noblesse.knight@gmail.com>

Update src/engine/specification.md

e19aa80

Co-authored-by: Mikhail Kalinin <noblesse.knight@gmail.com>

djrtwo reviewed May 13, 2022

djrtwo mentioned this pull request May 19, 2022

Ethereum Core Devs Meeting 139 Agenda ethereum/pm#528

Closed

MariusVanDerWijden reviewed May 27, 2022

arnetheduck added 2 commits May 31, 2022 12:43

Merge remote-tracking branch 'origin/main' into lots-of-bodies

19c2e8a

ByRoot -> ByHash

e1989ed

lightclient added the A-engine Area: for future consideration label Jun 22, 2022

michaelsproul reviewed Sep 16, 2022

Allow nil in response

de8fd3d

mkalinin mentioned this pull request Nov 8, 2022

Engine API spec improvement proposal #321

Closed

deffrian mentioned this pull request Dec 2, 2022

GetBodiesByRangeV1 implementation NethermindEth/nethermind#4939

Merged

12 tasks

siladu mentioned this pull request Dec 8, 2022

Add getPayloadBodies and getPayloadBodiesByRangeV1 Methods hyperledger/besu#4787

Closed

mkalinin mentioned this pull request Dec 8, 2022

Ethereum Core Devs Meeting 151 Agenda ethereum/pm#675

Closed

mkalinin mentioned this pull request Dec 21, 2022

Engine API: add getPayloadBodies method #146

Closed

mkalinin mentioned this pull request Jan 11, 2023

Engine API: define payload bodies requests #352

Merged

mkalinin closed this Jan 11, 2023

arnetheduck commented May 6, 2022

mkalinin left a comment

mkalinin May 7, 2022

arnetheduck commented May 10, 2022

djrtwo left a comment

djrtwo May 13, 2022

arnetheduck Sep 29, 2022

MariusVanDerWijden May 27, 2022

arnetheduck May 31, 2022

michaelsproul Sep 16, 2022

potuz Sep 16, 2022

arnetheduck Sep 16, 2022

arnetheduck Sep 29, 2022

michaelsproul Sep 16, 2022

arnetheduck Sep 16, 2022

mkalinin commented Dec 21, 2022

mkalinin commented Jan 11, 2023
