Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common: Support custom hash & state root on genesis #1833

Closed
wants to merge 2 commits into from

Conversation

cbrzn
Copy link
Contributor

@cbrzn cbrzn commented Apr 4, 2022

This PR aims to support custom hash and state root in Common initialization. Currently, we are updating the genesis object with the desired hash and state root, but the state given (which is the json attribute) is not being used at all, the reason for this is that the geth state parser is in Client and not in Common package.

I think that we should be able to support also the genesis state object (Since we are passing a state root hash, we need a state that can represent that hash - If I understand this correctly). After thinking a bit though this, I think that the easiest way to achieve this is to just make this json param to be GenesisState type

I am seeing this from the point of view of initiating for example the polygon chain, we can set the state with proper hashes, being able to fully execute a block

edit: I marked this as draft because it's not ready yet but would be awesome to get a review just to make sure I am going in the correct path

@codecov
Copy link

codecov bot commented Apr 4, 2022

Codecov Report

Merging #1833 (c3b9018) into master (7a3549a) will increase coverage by 0.01%.
The diff coverage is 88.88%.

Impacted file tree graph

Flag Coverage Δ
block 85.57% <ø> (ø)
blockchain 83.82% <ø> (ø)
client 75.89% <ø> (ø)
common 94.19% <88.88%> (ø)
devp2p 82.61% <ø> (+0.13%) ⬆️
ethash 90.76% <ø> (ø)
trie 80.02% <ø> (ø)
tx 88.20% <ø> (ø)
util 92.62% <ø> (ø)
vm 81.58% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

if (this._customChains?.length && Array.isArray(this._customChains[0])) {
plainCustomChains = (this._customChains as [IChain, GenesisState | GethGenesisState][]).map(
([chain, state]) => {
if ('hash' && 'stateRoot' in state) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i believe this needs to be

Suggested change
if ('hash' && 'stateRoot' in state) {
if ('hash' in state && 'stateRoot' in state) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a trick i use when the list starts getting long (from parse.ts):

if (['config', 'difficulty', 'gasLimit', 'alloc'].some((field) => !(field in json))) {
  throw new Error('Invalid format, expected geth genesis fields missing')
}

Comment on lines +251 to +252
stateRoot: 'cool-state-root',
hash: 'cool-hash',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use valid values? maybe we should strictly validate them

@ryanio
Copy link
Contributor

ryanio commented Apr 4, 2022

wouldn't it be safer to derive the block hash and state root ourselves to ensure they equal the submitted params, rather than using them as an "override"? that would ensure everything in our stack is working to derive them correctly

it doesn't make sense to me that common should have any "geth" logic, rather i like that the client (or somewhere, could be a common util) sets everything up for a proper common instance. the geth config file should just be formatted and parsed to be used in the rest of our stack, not as a custom option. think if we tried to also support a nethermind config, we wouldn't want to add custom logic and storage for parsing of the nethermind config, we just want a resulting common that matches the config exactly.

@cbrzn
Copy link
Contributor Author

cbrzn commented Apr 4, 2022

wouldn't it be safer to derive the block hash and state root ourselves to ensure they equal the submitted params, rather than using them as an "override"? that would ensure everything in our stack is working to derive them correctly

I agree with you, but the problem is that, if we want to generate these hashes we would need the Trie library, which we don't want to add to Common package (discussion here - The important point is Holger's answer)

it doesn't make sense to me that common should have any "geth" logic, rather i like that the client (or somewhere, could be a common util) sets everything up for a proper common instance

That's a super important point and thanks for bringing that up I think is aligned with my thought, which is making the json param a GenesisState object. What do you think about removing the GethGenesisState type, and extending the GenesisState to support the genesis hash and state root

edit: also, if you feel this should not be done just let me know - I am trying to approach a way to make L2 support easier, but probably I am missing things for my lack of knowledge

@ryanio
Copy link
Contributor

ryanio commented Apr 4, 2022

sounds good, thanks for explaining. i guess we can use this as a "trusted source" then. it might be a good idea to check in the client that these hash and stateRoot match on bootup and error and abort if not, i imagine it would help to know in advance rather than getting different values in different places.

GenesisState should probably stay the alloc portion, can we add an optional properties object to the tuple that contains these block fields? maybe call it GenesisBlockData. then we can also add to it in the future.

@holgerd77
Copy link
Member

@cbrzn thanks for a start on this, great! 🙂

@ryanio the main use case for this is that people can easily (easier) use a Geth config file within our lower level libraries (so everything below the client, e.g. - guess main use case - use this for running a tx in the VM or generally creating a tx - if one only has a Geth config file at hand).

The other alternative would be to remove - along the breaking work - stateRoot and hash from our own Common config setup and therefore get "en match" with what Geth is providing in a chain/genesis context. This might even be the cleaner solution, this would have a lot of refactoring implications though if you start going along the usage chain.

So hash() might be the lesser problem (and thinking about it: we might actually want to treat this differently and see one-by-one what would be the removal-implications). stateRoot is used for the first time in the Block library for the initWithGenesisHeader option which is encapsuled by the Block.genesis() constructor which is used for the first time in the Blockchain library.

So a chain of refactoring would look roughly as follows:

  • Remove stateRoot and hash from Common
  • Remove genesis functionality from Block, move to Blockchain (which might generally a more fitting place)
  • Remove genesis functionality from StateManager, move to Blockchain as well
  • Remove VM.runBlockchain(), remove Blockchain dependency

Have no 100% proof that this would be it, but I am thinking about this topic for several weeks now and have followed this whole interdependency chain a couple of times and I think this would roughly be it which might lead us to generally leave some non-logical twists behind, also regarding e.g. some overload/misplacement on StateManager and inconsistent library placements/dependencies regarding the Client/Blockchain/VM correlation.

This is pretty heavy stuff though. 😬 😋

We could tackle nevertheless.

The solution here would be the - very - local one just extending a bit (but not too dramatically) on our own sub-optimalities.

I have no totally strong bias in the one or the other direction. We should just know what we are doing. 🙂

@ryanio
Copy link
Contributor

ryanio commented Apr 5, 2022

thanks for the in-depth walk through, super interesting! and a great time to consider some broader changes while working on develop. i will let this sink in before i give an opinion on a route, but i do like your bullet points, they seem like improvements in the right direction to me.

@holgerd77
Copy link
Member

holgerd77 commented Apr 7, 2022

As a second thing, this kind of deeper-reaching refactoring would also allow to solve on #1673 - so the heavy bundle sizes due to the bundling of the genesis state within Common - by moving genesis states to Blockchain, where it more naturally belongs. There it would be more feasible respectively ok to have it in the bundle, while the heavy burden on all libraries below (VM, Block, Tx - so everything which is most likely to be bundled) would be lifted.

At the same time (if I am not mistaken) genesis state could still be provided on an API level for us internally on all places where it is necessary. Common would then solely take genesis in as a constructor option.

@jochem-brouwer
Copy link
Member

Very good points @holgerd77!

Remove VM.runBlockchain(), remove Blockchain dependency

Right, so we move runBlockchain into the blockchain package, and instantiate blockchain with VM? Then in Client we would also need to change each instance of VM by instantiating a blockchain, right?

@holgerd77
Copy link
Member

Right, so we move runBlockchain into the blockchain package, and instantiate blockchain with VM? Then in Client we would also need to change each instance of VM by instantiating a blockchain, right?

That would at least be a possibility, we can explore this further.

Note that there is no imminent need though to do anything in this direction at all. VM.runBlockchain() is just not used at all (in the monorepo), in the client we have integrated (copied over) and then extended the whole runBlockchain() code due to its current limitations.

So if we do not want to overload on this we can also simply skip this point for now.

@ryanio
Copy link
Contributor

ryanio commented May 17, 2022

Still have my sights on this, will try to pick up and continue this week. I'll open a new PR targeting develop with the changes outlined in #1833 (comment)

@holgerd77
Copy link
Member

Still have my sights on this, will try to pick up and continue this week. I'll open a new PR targeting develop with the changes outlined in #1833 (comment)

Oh, interesting (and great), totally forgot about this TBH.

@holgerd77
Copy link
Member

holgerd77 commented May 18, 2022

@ryanio you can also consider #1833 (comment) (another comment, even if linked with the same text) along if you want.

@holgerd77 holgerd77 mentioned this pull request May 18, 2022
51 tasks
@holgerd77
Copy link
Member

Also have added an additional "In Discussion/TODO" note on the v6 Release Notes so that we do not forget about this task.

@ryanio
Copy link
Contributor

ryanio commented May 25, 2022

I've gotten started on this but have run into a few things so far:

  1. common uses the genesis hash in _calcForkHash(), if we remove it how should we replace it there
  2. we use runBlockchain() in the BlockchainTestsRunner, if we remove it how should we run those tests
    1. we can't remove the blockchain dependency from vm since the vm always has a blockchain
  3. what do you mean by removing genesis functionality from statemanager (now vmState) and moving to blockchain? vmState needs to initialize accounts in the trie, and the blockchain doesn't have access to the putAccount methods (the blockchain is a dependency of the vm). i can move generateCanonicalGenesis() one level up from vmState to the vm in order to access the new blockchain.genesisState() if we move it from common.genesisState()

@holgerd77
Copy link
Member

holgerd77 commented May 25, 2022

@ryanio great that you had a first look! 🙂

All of these are first-shots, so we might want to deepen discussion on the call or something:

  1. common uses the genesis hash in _calcForkHash(), if we remove it how should we replace it there

We could make the forkHash entry in the HF JSOn files mandatory and then omit calculating the fork hash in the C-ommon constructor if not present. for the forkHash() function itself we would demand the genesis hash as an input. All of this should be viable on first thought.

  1. we use runBlockchain() in the BlockchainTestsRunner, if we remove it how should we run those tests
  • we can't remove the blockchain dependency from vm since the vm always has a blockchain

Yeah, maybe we wait on this for the VM/EVM refactor? I guess the Blockchain dependency remains somewhat better placed there if there is this single extracted VMExecutionContext package, there it does make somewhat more sense again I guess?

what do you mean by removing genesis functionality from statemanager (now vmState) and moving to blockchain? vmState needs to initialize accounts in the trie, and the blockchain doesn't have access to the putAccount methods (the blockchain is a dependency of the vm). i can move generateCanonicalGenesis() one level up from vmState to the vm in order to access the new blockchain.genesisState() if we move it from common.genesisState()

If the genesis functionality is now in vmState then it might also be an option to then keep it there and then bundle all the genesis files (so: the large mainnet account files) with this outer package (so VMExecutionContext) instead of Blockchain? This is already a lot higher in the dependency hierachy and would also provide the benefit of creating a lot less hazzle for the lower level libraries (Tx, Block,...) and it generally might also be a fitting place to be.

@ryanio
Copy link
Contributor

ryanio commented May 27, 2022

Have continued in #1916 so I will close here

@ryanio ryanio closed this May 27, 2022
@holgerd77 holgerd77 deleted the common/geth-file-support branch March 2, 2023 09:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants