Pure consensus upgrade version of the merge #2257

mkalinin · 2021-03-20T13:36:43Z

What's done

executable beacon chain ([WIP] The Merge #2229) is stripped down to the pure consensus upgrade
the process of transition from PoW to PoS is based on total difficulty, as described by Quick merge via fork choice change

Differences from quick merge proposal

ApplicationPayload retained and used instead of application_block bytes

Inspired by

UPD

Application client requirements

Application client (Ethereum Mainnet client) will have to implement a new RPC protocol and the underlying logic to become merge compliant. The protocol has the following methods:

eth2_insertBlock(parent_hash: Bytes32, application_payload: ApplicationPayload) -> boolean
Given the application_payload and parent_hash, assembles an application block and inserts it into the application chain. If any of the pre- or post-verification conditions, including the state_root check, is not satisfied, the method returns false; otherwise it returns true.
If the parent block is the current head of the chain and the import succeeds, the newly inserted block becomes the new head of the chain.
eth2_produceBlock(parent_hash: Bytes32) -> ApplicationPayload
Fetches transactions from the transaction pool, assembles a block on top of the parent block (specified by parent_hash), and returns it in the form of an ApplicationPayload. Returns an error code if the parent block is not found.

Note: The pending block may be used as the source of a new block if the given parent_hash matches the parent_hash of the pending block.
eth2_setHead(hash: Bytes32) -> None
Given the hash of a block, sets the head of the application chain. Returns an error code if the block is not found.
eth2_finalizeBlock(hash: Bytes32) -> boolean
Given the hash of a block, finalises the application block. Returns an error code if the block is not found.

Note: Chain and state clean-ups may be triggered upon this call, meaning that the given block becomes an irreversible point on the chain. The first finalisation event after the transition may have additional logic, such as disabling block gossip on the application layer.
eth2_submitTransitionBlockHash(hash: Bytes32) -> TransitionBlockStatus
Given the hash of a block, returns the status information of the block. If the block is not found, triggers the same processing chain inside the client as if this hash had been received in a NewBlockHashes message: that is, asking the sync protocol to fetch the block and its ancestors and insert them into the application chain.

Note: Required by transition process only.

class TransitionBlockStatus:
    is_processed: boolean
    is_valid: boolean
    total_difficulty: uint256
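
To make the method list above a bit more concrete, here is a rough in-memory sketch of three of the five calls (insert, set head, finalize). Only the method semantics come from the description; the `MockApplicationClient` class, the `BlockNotFound` exception standing in for the "error code", and the dict-shaped payload are all made up for illustration, and the toy skips payload execution and the state_root check entirely.

```python
from typing import Dict, Optional


class BlockNotFound(Exception):
    """Stands in for the 'error code' mentioned in the method descriptions."""


class MockApplicationClient:
    """Hypothetical in-memory stand-in for an application (eth1) client."""

    def __init__(self, genesis_hash: bytes) -> None:
        # block hash -> parent hash; a real client stores full blocks and state
        self.blocks: Dict[bytes, Optional[bytes]] = {genesis_hash: None}
        self.head = genesis_hash
        self.finalized: Optional[bytes] = None

    def insert_block(self, parent_hash: bytes, payload: dict) -> bool:
        # eth2_insertBlock: unknown parent means the import fails.
        # A real client would also execute the payload and verify state_root.
        if parent_hash not in self.blocks:
            return False
        self.blocks[payload["block_hash"]] = parent_hash
        # The new block becomes the head only if its parent was the head.
        if self.head == parent_hash:
            self.head = payload["block_hash"]
        return True

    def set_head(self, block_hash: bytes) -> None:
        # eth2_setHead: error if the block is unknown.
        if block_hash not in self.blocks:
            raise BlockNotFound(block_hash.hex())
        self.head = block_hash

    def finalize_block(self, block_hash: bytes) -> bool:
        # eth2_finalizeBlock: error if the block is unknown.
        if block_hash not in self.blocks:
            raise BlockNotFound(block_hash.hex())
        self.finalized = block_hash
        return True
```

The point of the sketch is the head-update rule in `insert_block`: inserting on top of a non-head block leaves the head untouched until `set_head` is called explicitly.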

Quick merge proposal

Quick merge proposal doesn't require extra eth2_submitTransitionBlockHash RPC method, this logic is implemented as a part of the submitBlock RPC method instead.

Co-authored-by: Paul Hauner <paul@paulhauner.com>

Co-authored-by: Danny Ryan <dannyjryan@gmail.com>

Co-authored-by: terence tsao <terence@prysmaticlabs.com>

djrtwo

Did a review of beacon-chain.md. Mostly formatting suggestions. One substantive question about the transition block payload.

adiasg · 2021-03-22T19:31:04Z

specs/merge/beacon-chain.md

+##### `get_application_state`
+
+*Note*: `ApplicationState` class is an abstract class representing ethereum application state.
+
+Let `get_application_state(application_state_root: Bytes32) -> ApplicationState` be the function that given the root hash returns a copy of ethereum application state. 
+The body of the function is implementation dependent.
+
+##### `application_state_transition`
+
+Let `application_state_transition(application_state: ApplicationState, application_payload: ApplicationPayload) -> None` be the transition function of ethereum application state. 
+The body of the function is implementation dependent.
+
+*Note*: `application_state_transition` must throw `AssertionError` if either the transition itself or one of the post-transition verifications has failed.


Do we need get_application_state() at all? (also see other comment below)

Suggested change

-##### `get_application_state`
-
-*Note*: `ApplicationState` class is an abstract class representing ethereum application state.
-
-Let `get_application_state(application_state_root: Bytes32) -> ApplicationState` be the function that given the root hash returns a copy of ethereum application state.
-The body of the function is implementation dependent.
-
-##### `application_state_transition`
-
-Let `application_state_transition(application_state: ApplicationState, application_payload: ApplicationPayload) -> None` be the transition function of ethereum application state.
-The body of the function is implementation dependent.
-
-*Note*: `application_state_transition` must throw `AssertionError` if either the transition itself or one of the post-transition verifications has failed.
+##### `application_state_transition`
+
+Let `application_state_transition(application_state_root: Bytes32, application_payload: ApplicationPayload) -> Bytes32` be the transition function of ethereum application state. This function takes the pre-state associated with `application_state_root`, applies the `application_payload`, and returns the post-state root.
+
+The body of the function is implementation dependent.
+
+*Note*: `application_state_transition` must throw `AssertionError` if either the transition itself or one of the post-transition verifications has failed.

adiasg · 2021-03-22T19:39:28Z

specs/merge/beacon-chain.md

+ application_state = get_application_state(state.application_state_root)
+ application_state_transition(application_state, body.application_payload)
+
+ state.application_state_root = body.application_payload.state_root


Is there a need for the Eth2 node to access application_state?
A cleaner approach would be to outsource all the work to application_state_transition(), so that it operates using the application_state_root. (see suggestion for the application_state_transition function)
Also, need to check that whatever is returned by application_state_transition() actually matches with body.application_payload.state_root.

Suggested change

- application_state = get_application_state(state.application_state_root)
- application_state_transition(application_state, body.application_payload)
-
- state.application_state_root = body.application_payload.state_root
+ state.application_state_root = application_state_transition(state.application_state_root, body.application_payload)
+ assert state.application_state_root == body.application_payload.state_root

I definitely see the value in simplifying this part. But then it would not read like the beacon state and the application state are tightly coupled.

For the root check we can do the following:

application_state = get_application_state(state.application_state_root)
application_state_transition(application_state, body.application_payload)
assert body.application_payload.state_root == get_application_state_root(application_state)
state.application_state_root = body.application_payload.state_root
state.application_block_hash = body.application_payload.block_hash

But that would require yet another abstract function. So, it might really be better to have the following (according to your suggestion):

application_state_root = application_state_transition(state.application_state_root, body.application_payload)
assert application_state_root == body.application_payload.state_root
state.application_state_root = body.application_payload.state_root
state.application_block_hash = body.application_payload.block_hash

The latter form also fits stateless verification.
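
For what it's worth, the root-returning form can be exercised end to end with a toy application layer. This is only a sketch under loud assumptions: the `STATES` dict, `toy_root`, and the `process_application_payload` wrapper name are invented here, the "state" is just the list of transactions applied so far, and the dataclasses carry only the fields the snippet touches.

```python
from dataclasses import dataclass
import hashlib
from typing import Dict, List


@dataclass
class ApplicationPayload:
    state_root: bytes
    block_hash: bytes
    transactions: List[bytes]


@dataclass
class BeaconState:
    application_state_root: bytes
    application_block_hash: bytes


# Hypothetical store of known application states, keyed by state root.
STATES: Dict[bytes, List[bytes]] = {}


def toy_root(txs: List[bytes]) -> bytes:
    # Toy commitment; a real client would compute a state trie root.
    return hashlib.sha256(b"".join(txs)).digest()


def application_state_transition(pre_root: bytes, payload: ApplicationPayload) -> bytes:
    # Fails with AssertionError if the pre-state is unknown.
    assert pre_root in STATES
    post = STATES[pre_root] + payload.transactions
    post_root = toy_root(post)
    STATES[post_root] = post
    return post_root


def process_application_payload(state: BeaconState, payload: ApplicationPayload) -> None:
    # The beacon side only ever handles roots, never the state object itself.
    application_state_root = application_state_transition(state.application_state_root, payload)
    assert application_state_root == payload.state_root
    state.application_state_root = payload.state_root
    state.application_block_hash = payload.block_hash
```

Because the beacon state is only written after the assert, a payload that claims the wrong `state_root` leaves `state` untouched.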

nitpicking: application_state_root = application_state_transition(...) requires application_state_transition to return Bytes32, but it doesn't align with the naming pattern as beacon state transition function state_transition(...) -> None. So having a get_application_state_root(application_state) makes sense to me.

The original intention behind this extra function call, and behind using an application state object rather than the state root, was to make this call conform to state_transition(...).

We may also define ApplicationState in the following way:

class ApplicationState(Container):
    root: Bytes32

And then add an explicit check of the state root after the application state transition:

application_state = get_application_state(state.application_state_root)
application_state_transition(application_state, body.application_payload)
assert application_state.root == body.application_payload.state_root
state.application_state_root = body.application_payload.state_root
state.application_block_hash = body.application_payload.block_hash

timbeiko · 2021-03-22T20:23:17Z

specs/merge/beacon-chain.md

+The application payload included in a `BeaconBlock`.
+
+```python
+class ApplicationPayload(Container):


Post-London, we'll want to add the BASE FEE here.

timbeiko · 2021-03-22T20:25:49Z

specs/merge/beacon-chain.md

+def process_block(state: BeaconState, block: BeaconBlock) -> None:
+ process_block_header(state, block)
+ process_randao(state, block.body)
+ process_eth1_data(state, block.body)


Suggested change

- process_eth1_data(state, block.body)
+ process_application_data(state, block.body)

I agree that the Eth1Data name will look odd after the merge, but we might want to keep it as is this time. The follow-up cleanups are going to reduce the follow distance of Eth1Data, and that would be a good time to do the renaming too.

It's more of DepositContractData if we are going to rename it.
It is more specific than application layer data

djrtwo · 2021-03-22T22:08:09Z

Is there a need for the Eth2 node to access application_state?

@adiasg I would argue that this is the cleaner approach because "outsourcing" all the work is actually an implementation detail (e.g. a client could bundle pos and application components tightly or could do something like we are doing with the separation of consensus vs application client)

djrtwo

fantastic work! got through the next two docs. happy to discuss here or otherwise

djrtwo · 2021-03-22T19:01:00Z

specs/merge/fork-choice.md

+class PowBlock(Container):
+ is_processed: boolean
+ is_valid: boolean
+ total_difficulty: uint256


Is this granularity required?

uints larger than 64 bits have been entirely avoided in the consensus layer so far

total difficulty is on the order of 10**22 so it doesn't fit in uint64...

We could reduce precision to avoid uint256. Need to consider our options here

Yes, total difficulty is above 2**74 today. The other potential way of handling this is to return an offset with respect to some absolute total difficulty value. Current block's difficulty is around 2**64, so we indeed may divide total difficulty by 2**20 without loss of generality.

Also, it is worth noting that the uint256 type is used by the Transaction data structure to define several fields.

Ah, right. I don't think we'll be able to get around values with 256-bit granularity.

I suppose beacon clients don't actually have to support arithmetic on these TX values because all arithmetic and validations happen in application layer so as long as they can serialize and deserialize, that's enough support
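
A quick sanity check of the numbers in this thread, as a sketch: `10**22` is just the order of magnitude quoted above (not an exact chain value), and `fits_uint64` is a helper name made up here.

```python
UINT64_MAX = 2**64 - 1


def fits_uint64(value: int) -> bool:
    return 0 <= value <= UINT64_MAX


total_difficulty = 10**22         # order of magnitude quoted in the thread
scaled = total_difficulty >> 20   # divide by 2**20, dropping the low 20 bits

print(fits_uint64(total_difficulty))               # False
print(fits_uint64(scaled))                         # True
print(total_difficulty - (scaled << 20) < 2**20)   # True: error is below 2**20
```

So reducing precision by 2**20 does bring the value back under uint64, at the cost of an absolute error of less than 2**20 in total difficulty.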

djrtwo · 2021-03-22T19:18:17Z

specs/merge/fork-choice.md

+
+```python
+class PowBlock(Container):
+ is_processed: boolean


Is is_processed just a superset of is_valid?

Do eth1 clients remember that an invalid block has already been processed, or do they just drop it?

Current behaviour is to return an error if either an invalid or a not-yet-processed block is requested, meaning that these two statuses are indistinguishable. So, if we want this level of granularity, the JSON-RPC implementation will have to be adjusted, but we probably don't want that. It's worth discussing.

Do eth1 clients remember that an invalid block has already been processed, or do they just drop it?

I think with some configuration it stores invalid blocks to be able to serve debug_getBadBlock
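
One way to see why the two flags might be worth keeping apart: a consumer can then wait on a block that simply hasn't been imported yet, instead of rejecting it as if it were invalid. A sketch of that distinction follows; the `classify_transition_block` helper, its return labels, and the `TRANSITION_TOTAL_DIFFICULTY` value are hypothetical and not taken from the spec text in this PR.

```python
from dataclasses import dataclass

# Hypothetical threshold; no concrete value is given in this thread.
TRANSITION_TOTAL_DIFFICULTY = 2**74


@dataclass
class PowBlock:
    is_processed: bool
    is_valid: bool
    total_difficulty: int


def classify_transition_block(block: PowBlock) -> str:
    if not block.is_processed:
        # Not imported yet: retry later rather than reject outright.
        return "wait"
    if not block.is_valid:
        return "reject"
    if block.total_difficulty < TRANSITION_TOTAL_DIFFICULTY:
        # Valid PoW block, but the transition threshold is not reached.
        return "reject"
    return "accept"
```

If the RPC collapsed both cases into a single error, the first branch could not be told apart from the second.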

djrtwo · 2021-03-22T19:30:08Z

specs/merge/fork-choice.md

+
+```python
+def on_block(store: Store, signed_block: SignedBeaconBlock) -> None:
+ block = signed_block.message


We should probably move some of this logic into sub-functions in phase0 so we can have better code reuse. Can do that in a separate PR.

adiasg · 2021-03-23T06:04:27Z

@djrtwo

@adiasg I would argue that this is the cleaner approach because "outsourcing" all the work is actually an implementation detail (e.g. a client could bundle pos and application components tightly or could do something like we are doing with the separation of consensus vs application client)

I partly agree - clients may do application processing in the Eth2 node itself, or outsource the work to a separate application client. That's why I think application_state_transition(application_state_root: Bytes32, application_payload: ApplicationPayload) is better, as it accounts for both cases. If outsourcing the work, ApplicationState is not required in the Eth2 node. If not outsourcing the work, the function can be implemented to handle processing without exposing the ApplicationState anywhere else.

The way this behavior is currently defined (with get_application_state(application_state_root: Bytes32) -> ApplicationState) makes it seem like ApplicationState is essential for Eth2 validation logic, which is not the case.

ralexstokes · 2021-03-24T18:46:01Z

specs/merge/beacon-chain.md

+ gas_used: uint64
+ receipt_root: Bytes32
+ logs_bloom: Vector[Bytes1, BYTES_PER_LOGS_BLOOM]
+ difficulty: uint64 # Temporary field, will be removed later on


why can't we remove this field now?

paulmillr · 2021-03-24T20:15:04Z

Thank you for the hard work!

A quick question from a guy who has not been following the plans very thoroughly.

What hazardous or bad tech debt does this merge proposal create when compared to the "old" merge plan?

lsankar4033 · 2021-03-25T04:30:16Z

specs/merge/beacon-chain.md

+
+ if is_transition_completed(state):
+ application_state = get_application_state(state.application_state_root)
+ application_state_transition(application_state, body.application_payload)


would it make sense to pass in the randao or some other seed for DIFFICULTY/BLOCKHASH here?

obviously, easiest to just stick them on the application_payload so the eth1 engine doesn't even have to know about this new logic.

Yep, I think it would make sense to prepare a consensus bundle that will be passed on to this function with the randao mix and further extended by other bits. I'd add it later, once we've made the decision about difficulty and whether to use randao or not.

hwwhww · 2021-03-25T09:42:17Z

specs/merge/beacon-chain.md

+ application_state = get_application_state(state.application_state_root)
+ application_state_transition(application_state, body.application_payload)
+
+ state.application_state_root = body.application_payload.state_root


nitpicking: application_state_root = application_state_transition(...) requires application_state_transition to return Bytes32, but it doesn't align with the naming pattern as beacon state transition function state_transition(...) -> None. So having a get_application_state_root(application_state) makes sense to me.

vbuterin · 2021-03-25T21:10:50Z

It does seem like this goes pretty far in changing formatting at the same time as the merge, particularly the transactions list. Remember that by this point there will be plenty of transaction types: old-style with no replay protection, old-style with replay protection, EIP 2930 access list-carrying txs, EIP 1559 basefee-setting txs. These would all have to be changed into an SSZ format.

To keep merge complexity down, might it not be simpler to just keep transactions as a list of blobs, and then have some post-merge fork replace them with a new SSZ-based transaction type? (first make it voluntary, and then make it mandatory)

class ApplicationPayload(Container):
    block_hash: Bytes32  # Hash of application block
    coinbase: Bytes20
    state_root: Bytes32
    gas_limit: uint64
    gas_used: uint64
    receipt_root: Bytes32
    logs_bloom: ByteVector[BYTES_PER_LOGS_BLOOM]
    transactions: List[Transaction, MAX_APPLICATION_TRANSACTIONS]

I noticed that this does not include a bunch of fields (uncles_hash, difficulty, number, timestamp, mixhash, nonce). This makes the conversion between old-style and new-style application blocks not a clean reversible function. Perhaps this does not matter, because those fields are not relevant anymore, but it does create some extra special cases. Particularly, it requires an extra special case for the transition application block itself: if the transition application block is the last PoW-bearing block, then you would need some special logic for including that block, because the last PoW-bearing block does have an uncles_hash, nonce, etc.

Or is the idea to change that bit in the spec, so that instead of the transition beacon block including the last PoW block, it merely includes a block whose parent is the last PoW block? If so, then I could see how this can work.

protolambda · 2021-03-25T21:17:40Z

@vbuterin fair point. I have #2270 open to track Union support to switch between (the many) transaction types, and already proposed the option of a list of opaque transactions.

Another option is to do transactions: List[Union[TxBlob], MAX_APPLICATION_TRANSACTIONS] and expand the Union[TxBlob] later with more SSZ transaction types. That way we avoid the immediate complexity of many new SSZ transaction types, have something to fall back on with all legacy transactions, and it's much more compatible to extend with SSZ transactions in future forks.

Edit: added missing issue link

mkalinin · 2021-03-26T09:14:08Z

@vbuterin @protolambda IMO, the opaque transactions option is the best one to begin with. IIUC, some of the old-style transaction types will become unacceptable starting from some point in time, and it would make more sense to replace transaction blobs with SSZ objects once all the formats have settled down, to avoid them being heavily dependent on the beacon chain spec.
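
A rough sketch of what "opaque" buys us: the consensus side only checks list-level constraints, while anything about the transaction format stays on the application side. The function names, the non-empty-blob rule, and the `MAX_APPLICATION_TRANSACTIONS` value below are placeholders for illustration; the first-byte rule in `envelope_kind` follows the EIP-2718 envelope convention (legacy RLP lists start at 0xc0 or above, typed transactions use a type byte of 0x7f or below).

```python
from typing import List

MAX_APPLICATION_TRANSACTIONS = 2**14  # placeholder limit, not taken from this thread


def validate_opaque_transactions(transactions: List[bytes]) -> None:
    """Consensus-side check: list length and non-empty blobs only (assumed rules)."""
    assert len(transactions) <= MAX_APPLICATION_TRANSACTIONS
    assert all(len(tx) > 0 for tx in transactions)


def envelope_kind(tx: bytes) -> str:
    """Application-side peek at the first byte; no full decoding."""
    first = tx[0]
    if first >= 0xC0:
        return "legacy"                  # RLP list prefix
    if first <= 0x7F:
        return f"typed(0x{first:02x})"   # e.g. 0x01 for EIP-2930
    return "unknown"
```

Nothing in `validate_opaque_transactions` needs to change when a new transaction type is introduced, which is the property being argued for here.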

mkalinin · 2021-03-26T09:21:44Z

Particularly, it requires an extra special case for the transition application block itself

@vbuterin It will be a special case. As ApplicationPayload does not explicitly contain parent_hash, the hash of the last PoW block will be brought up on chain before producing the first PoS block. It's gonna be done by adding ApplicationPayload(block_hash=transition_block_hash) to the beacon block body to denote a transition block. It also requires a special case in the state_transition function to process the transition block, which just sets state.application_block_hash = application_payload.block_hash without interacting with the application layer.
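
Roughly, the special case could look like the sketch below. The names `is_transition_completed` and `is_transition_block` appear elsewhere in this PR, but their bodies here are guesses, and the `execute` callback is a hypothetical stand-in for the application state transition; the dataclasses carry only the two fields the sketch needs.

```python
from dataclasses import dataclass
from typing import Callable

ZERO_HASH = b"\x00" * 32


@dataclass
class ApplicationPayload:
    block_hash: bytes = ZERO_HASH
    state_root: bytes = ZERO_HASH


@dataclass
class BeaconState:
    application_block_hash: bytes = ZERO_HASH
    application_state_root: bytes = ZERO_HASH


def is_transition_completed(state: BeaconState) -> bool:
    # Assumed rule: complete once a PoW block hash has been put on chain.
    return state.application_block_hash != ZERO_HASH


def is_transition_block(state: BeaconState, payload: ApplicationPayload) -> bool:
    # Assumed rule: first non-empty payload seen before completion.
    return not is_transition_completed(state) and payload != ApplicationPayload()


def process_application_payload(
    state: BeaconState,
    payload: ApplicationPayload,
    execute: Callable[[BeaconState, ApplicationPayload], None],
) -> None:
    if is_transition_block(state, payload):
        # Special case: record the last PoW block hash, no application call.
        state.application_block_hash = payload.block_hash
    elif is_transition_completed(state):
        execute(state, payload)
    else:
        # Before the transition the payload must stay zeroed.
        assert payload == ApplicationPayload()
```

The application layer is first consulted for the block after the transition block, whose parent is then the recorded PoW block hash.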

mkalinin · 2021-03-26T10:00:04Z

Thank you for the hard work!

A quick question from a guy who has not been following the plans very thoroughly.

What hazardous or bad tech debt does this merge proposal create when compared to the "old" merge plan?

The list of changes that this proposal sets for the hard fork(s) after the merge is as follows:

Eth1Data follow distance reduction
New EVM opcodes, RANDOM, BEACONBLOCKROOT, etc.
Validator withdrawals

Note: Withdrawals are dependent on the opcodes

djrtwo · 2021-03-26T14:33:21Z

It's gonna be done by adding ApplicationPayload(block_hash=transition_block_hash) to the beacon block body to denote a transition block

Note that another option to signal the transition is to just include a full duplicate of the last PoW block. I'm not sure if this would result in more or less exceptional logic, but it's probably worth considering.

Co-authored-by: terence tsao <terence@prysmaticlabs.com>

djrtwo

Fantastic work @mkalinin and to the many reviewers!

To better manage the complexity, we're going to merge the current PR as is. I've created an issue (#2280) to track the open discussion points and todos that were brought up but not resolved in PR.

From here, we'll move toward smaller and more iterative PRs to address these and any other points that come up along the way

mkalinin and others added 11 commits March 17, 2021 15:59

Add initial merge spec

0dec828

Polish beacon chain spec and validator guide

ee16163

Index from GENESIS_SLOT in compute_time_at_slot

f6f3687

Co-authored-by: Paul Hauner <paul@paulhauner.com>

Use Vector struct for recent_block_roots field

3fb5f2e

Co-authored-by: Paul Hauner <paul@paulhauner.com>

Add a line break in get_recent_beacon_block_roots

5435324

Co-authored-by: Danny Ryan <dannyjryan@gmail.com>

Fix spelling

3c9cd85

Co-authored-by: Danny Ryan <dannyjryan@gmail.com>

Lable Added/Remove notes with Merge explicitly

a368f5d

Remove min(..., ...) in get_evm_beacon_block_roots

b8e16c1

Add rebase-to-Altair warning

bf15164

Strip down the merge to the pure consensus upgrade

46fc8a1

Verify transition block to be assembled correctly

3420e51

terencechain reviewed Mar 21, 2021

seishun reviewed Mar 22, 2021

mkalinin and others added 5 commits March 22, 2021 20:54

Fix block_body variable in is_transition_block

24dc8a2

Co-authored-by: terence tsao <terence@prysmaticlabs.com>

Verify that ApplicationPayload is zeroed before the transition

38a455c

Simplify merge.BeaconState definition

83453d2

Boolean -> boolean

7e6ac4e

Distinguish invalid and not processed transition block

96de910

djrtwo reviewed Mar 22, 2021

adiasg reviewed Mar 22, 2021

timbeiko reviewed Mar 22, 2021

timbeiko reviewed Mar 22, 2021

timbeiko reviewed Mar 22, 2021

djrtwo reviewed Mar 23, 2021

seishun reviewed Mar 23, 2021

hwwhww added the Bellatrix CL+EL Merge label Mar 23, 2021

djrtwo mentioned this pull request Mar 23, 2021

[WIP] The Merge #2229

Closed

4 tasks

ralexstokes reviewed Mar 24, 2021

lsankar4033 reviewed Mar 25, 2021

hwwhww reviewed Mar 25, 2021

mkalinin added 4 commits March 25, 2021 17:49

Address a new portion of comments and fixes

ee5ecf8

Bytes1 to byte in ApplicationPayload.logs_bloom

a23bde3

Polish merge/fork-choice.md

260a0a5

Use ByteList[N] and ByteVector[N] types

81a2c2c

terencechain reviewed Mar 26, 2021

terencechain reviewed Mar 26, 2021

minor edits from code review

41a087a

Co-authored-by: terence tsao <terence@prysmaticlabs.com>

mkalinin marked this pull request as ready for review March 26, 2021 19:19

djrtwo force-pushed the consensus-upgrade branch 5 times, most recently from a96e318 to f263b95 Compare March 26, 2021 19:48

byte-list for opaque transaction payload

223aba3

djrtwo force-pushed the consensus-upgrade branch from f263b95 to 223aba3 Compare March 26, 2021 19:50

djrtwo mentioned this pull request Mar 26, 2021

Merge discussion and todos from #2257 #2280

Closed

9 tasks

djrtwo approved these changes Mar 26, 2021

djrtwo merged commit 9f8e627 into ethereum:the-merge Mar 26, 2021

mkalinin mentioned this pull request Mar 29, 2021

Merge Implementers' Call 1 ethereum/pm#290

Closed

Myu-Unix mentioned this pull request Apr 4, 2021

Intro draft InsideTheSim/ethmerge.com-content#26

Merged
