Client: add VM execution (old) #1017

Closed
wants to merge 22 commits

Conversation

holgerd77 (Member)

Ok, I did some back-and-forth experimentation on where/how to integrate VM execution in the client and I think this should finally do it.

This PR first removes the HeaderFetcher -> BlockFetcher inheritance dependency. This frees the Fetcher classes to now have a clean object relationship to their respective synchronizers (so BlockFetcher <-> FullSynchronizer, HeaderFetcher <-> LightSynchronizer) and subsequently allows for a direct integration of the VM execution into the BlockFetcher; the VM has (another time 😄 ) also been moved further into this class.

In the BlockFetcher.store() function the functionality from Chain.putBlocks() is then drawn in and decomposed (this should be no problem since this class is going away anyhow with the wrapper class removal @jochem-brouwer is planning).
This allows for atomic alternating block execution and blockchain storage (I've also drawn in the three-line Blockchain.putBlocks() functionality) and ensures that only blocks which have successfully been executed upon are stored in the chain.
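To illustrate the pattern, here is a minimal sketch of such an interleaved store/execute loop. This is illustrative only, not the exact diff (the actual ordering — putBlock followed by vm.runBlockchain() — is visible in the review excerpts further down); the class wrapper and the any-typed chain/vm members are assumptions:

```ts
import { Block } from '@ethereumjs/block'

// Illustrative sketch only (not the exact diff): store a block and let the VM
// execute right after, so a failing execution stops the chunk early.
class BlockFetcherSketch {
  constructor(private chain: any, private vm: any) {}

  async store(blocks: Block[]): Promise<void> {
    await this.chain.blockchain.initPromise
    for (const block of blocks) {
      await this.chain.blockchain.putBlock(block) // persist the block
      await this.vm.runBlockchain() // execute up to the new chain head
    }
  }
}
```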

PR is not running yet. Unit tests are passing, but integration tests need some modification since the mock setup now causes some validation checks to fail during the vm.runBlock() run; not sure how to fix this yet.

The client run is also triggering the following error at the moment once a block with transactions is hit: ERROR [12-16|00:31:58] Error: sender doesn't have enough funds to send tx. The upfront cost is: 996205183388591000 and the sender's account only has: 0. (Another error, ERROR [12-16|00:30:46] TypeError: Cannot read property 'map' of undefined at BlockFetcher.request (/EthereumJS/ethereumjs-vm/packages/client/dist/lib/sync/fetcher/blockfetcher.js:41:62), seems unrelated to this PR and might rather have been introduced along the type improvements from @ryanio in some previous PRs, at least that's my current unproven suspicion.)

So this is rather open to be picked up and further continued.

codecov bot commented Dec 15, 2020

Codecov Report

Merging #1017 (4811439) into master (582b4ef) will increase coverage by 0.43%.
The diff coverage is 93.13%.

Impacted file tree graph

Flag        Coverage Δ
block       77.92% <ø> (ø)
blockchain  77.92% <ø> (ø)
client      88.06% <91.78%> (+0.92%) ⬆️
common      91.87% <ø> (ø)
devp2p      82.78% <100.00%> (+0.33%) ⬆️
ethash      82.08% <ø> (ø)
tx          86.25% <ø> (+0.21%) ⬆️
vm          83.19% <95.00%> (+0.14%) ⬆️

Flags with carried forward coverage won't be shown.

@holgerd77 (Member Author) commented Dec 15, 2020

Ok, but that's at least at the center of what we want to test (currently testing at block 76511 of mainnet): 😄

ERROR [12-16|00:53:59] Error: sender doesn't have enough funds to send tx. The upfront cost is: 996205183388591000 and the sender's account only has: 0
    at VM._runTx (/EthereumJS/ethereumjs-vm/packages/vm/dist/runTx.js:67:19)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:94:5)
    at async VM.runTx (/EthereumJS/ethereumjs-vm/packages/vm/dist/runTx.js:29:24)
    at async VM.applyTransactions (/EthereumJS/ethereumjs-vm/packages/vm/dist/runBlock.js:161:23)
    at async VM.applyBlock (/EthereumJS/ethereumjs-vm/packages/vm/dist/runBlock.js:131:23)
    at async VM.runBlock (/EthereumJS/ethereumjs-vm/packages/vm/dist/runBlock.js:70:18)
    at async BlockFetcher.store (/EthereumJS/ethereumjs-vm/packages/client/dist/lib/sync/fetcher/blockfetcher.js:69:13)
    at async Writable._write (/EthereumJS/ethereumjs-vm/packages/client/dist/lib/sync/fetcher/fetcher.js:176:17)

@holgerd77 force-pushed the client-add-vm-execution branch from 0c7adf1 to 14aa171 on December 16, 2020 00:00
@holgerd77 (Member Author)

(I could skip both the balance and the nonce checks in the VM with skipBalance and skipNonce set to `true`, but would this make sense? 🤔 )
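For reference, a minimal sketch of what those skips would look like on the runBlock() call (vm and block are assumed to be in scope; as discussed below, a fully validating client should not use these flags):

```ts
// Sketch only: bypass the balance/nonce checks during block execution.
// Not what a fully validating client should do (see the discussion below).
await vm.runBlock({
  block,
  skipBlockValidation: true, // already set in the current PR state
  skipBalance: true, // skip the "sender doesn't have enough funds" check
  skipNonce: true, // skip the tx nonce vs. account nonce check
})
```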

@holgerd77 (Member Author)

The initial error after deleting the database is:

ERROR [12-16|01:16:23] Error: invalid block stateRoot
    at VM.runBlock (/EthereumJS/ethereumjs-vm/packages/vm/dist/runBlock.js:97:19)
    at async BlockFetcher.store (/EthereumJS/ethereumjs-vm/packages/client/dist/lib/sync/fetcher/blockfetcher.js:69:13)
    at async Writable._write (/EthereumJS/ethereumjs-vm/packages/client/dist/lib/sync/fetcher/fetcher.js:176:17)

@jochem-brouwer (Member)

Okay, this is great! 😄 We should not use any of the skip flags; the client should fully validate each block. Note that block 46147 is the first block in which we have transactions. These probably pass because skipBlockValidation is set to true, which means things like the state root are not checked. We should fully validate each block so we can immediately detect if something is wrong.

@jochem-brouwer (Member)

OK, experimenting a bit here. I have removed skipBlockValidation here.

BlockFetcher either throws this Cannot read property 'map' of undefined error (peer dumps in no data at all?) or it throws on a wrong state root when running block 1.

@jochem-brouwer (Member)

OK, the root cause of the problem is that stateManager points to an empty trie by default instead of the genesis state trie. This should be fixed once we start using runBlockchain, although we might want to alter that functionality a little bit too.
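For reference, a hedged sketch of what seeding the state with the genesis allocations could look like (common and blockchain are assumed to be in scope; the wiring details are assumptions, and the cleaner path is indeed to let runBlockchain handle this):

```ts
import VM from '@ethereumjs/vm'
import { DefaultStateManager } from '@ethereumjs/vm/dist/state'

// Sketch (wiring assumed): seed the state trie with the genesis allocations
// before executing block 1, so the parent state root of block 1 matches.
const stateManager = new DefaultStateManager({ common })
await stateManager.generateCanonicalGenesis()
const vm = new VM({ common, blockchain, stateManager })
```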

@holgerd77 (Member Author)

Rebased this.

@holgerd77 (Member Author)

( @jochem-brouwer if you want to do work here you can of course directly push to this branch )

@holgerd77 force-pushed the client-add-vm-execution branch from 731b00d to 38be1be on December 21, 2020 12:14
@holgerd77 (Member Author)

Ok, this seems to work and execute the VM correctly; it also now persists the state (I actually wasn't aware that we weren't doing this yet 😛 ). 🎉

This needs some further clean-up and I will leave this open 1-2 days more, but otherwise I would cautiously say this is ready.

The following command gives a run with a clean slate:

rm -Rf [HOME_DIR]/Ethereum/ethereumjs/chaindata && rm -Rf ./statedir && tsc -p tsconfig.prod.json && node dist/bin/cli.js

@jochem-brouwer (Member) left a comment

Some comments and questions, great progress so far 😄

packages/client/lib/sync/fetcher/blockfetcher.ts (outdated, resolved)
packages/client/lib/sync/fetcher/blockfetcher.ts (outdated, resolved)
count: BN
}
import VM from '@ethereumjs/vm'
import { DefaultStateManager } from '@ethereumjs/vm/dist/state'
Member

I didn't know this was possible, nice that it is 😄

Member Author

Yeah, I have done this for the first time as well 😄. I think it's good that we are finally getting consumers of the StateManager; this will give us a better feeling here. The import is e.g. not very optimal going through dist; on the next major release we should likely expose it directly through the main index.ts file. I have added this as a breaking task within a new release v6 (+ friends 😄 ) planning issue #1024.
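What that could look like on the VM side (just a sketch; the exact export surface is to be decided in #1024):

```ts
// packages/vm/src/index.ts (sketch): re-export the state manager so consumers
// no longer need the deep '@ethereumjs/vm/dist/state' import
export { DefaultStateManager } from './state'

// consumer side, after such a release (sketch):
// import VM, { DefaultStateManager } from '@ethereumjs/vm'
```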

this.count = options.count
}
if (!this.config.vm) {
const db = level('./statedir')
Member

In theory you can just use a single directory to store state, but I find it cleaner to store state into the respective directory (mainnet/ropsten). Can we store it in the chaindata directory?

Member Author

Definitely, this was one of my clean-up tasks. I just added this since I needed to finish and wanted to have proof that this works at all.

Slightly modified suggestion: I would store this in a parallel statedata directory with the chain (e.g. ropsten) as the base directory; this would need some modifications (see the rough path sketch below):

  • datadir in Config should reflect the base directory (so without the chaindata or statedata part)
  • Config.getSyncDataDirectory() (used in bin/cli.ts) should be renamed to getSystemDefaultDataDirectory() (or something similar) and also return the (then somewhat system-specific) base directory
  • The other parts (chaindata or lightchaindata, statedata) should be added on the blockchain and trie instantiations

Does this make sense? If so feel free to just pick up.
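A rough path sketch of the proposal (everything here is part of the suggestion, not existing code: the datadir handling, config.common.chainName() as the network lookup, and the statedata directory name are assumptions):

```ts
import level from 'level'
import { SecureTrie as Trie } from 'merkle-patricia-tree'

// Proposed layout (sketch): <datadir>/<network>/{chaindata | lightchaindata, statedata}
const networkDir = `${config.datadir}/${config.common.chainName()}`

const chainDB = level(`${networkDir}/chaindata`) // or 'lightchaindata' for light sync
const stateDB = level(`${networkDir}/statedata`)
const trie = new Trie(stateDB)
```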

Member

So under datadir we would have a chaindata and statedata directory? It is currently configured that this datadir can only point to a single chain, so if you want to run both mainnet and ropsten then you'd need two datadirs.

Member Author

Ah, likely datadir should also be without the chain, right? 🤔 So just the uppermost-level base dir without any configuration selection.

const block: Block = blocks[i]

await this.chain.blockchain.putBlock(block)
await this.vm.runBlockchain()
Member

This is a nice trick to only run one block (at most). However, if we want to dump in a chain reorg (of multiple blocks), then the following will happen.

Let's say we dump 3 blocks of reorg in the chain: A, B and C. If we put in block A, then this will (in most cases) not change the canonical chain. Therefore, blockchain still points to the old chain head. Same happens with B. When we put C, blockchain sees we have a reorg, and therefore rolls back the head pointers to the parent block hash of block A. Then, when we call runBlockchain, it will only run block A, while we expect that it runs up to block C. Can we keep running the blockchain until the head block of the VM pointer in the blockchain does not change after we've called runBlockchain? (Use getHead of blockchain)
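A sketch of that loop inside the store step (the getHead() call for the VM iterator head and its exact return shape are assumptions about the blockchain API surface):

```ts
// Sketch: keep executing until the VM head no longer advances, so a
// multi-block reorg (A, B, C) gets fully run and not just block A.
while (true) {
  const before = (await this.chain.blockchain.getHead()).hash() // VM iterator head
  await this.vm.runBlockchain()
  const after = (await this.chain.blockchain.getHead()).hash()
  if (after.equals(before)) break // no further progress: caught up
}
```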

Block.fromValuesArray(b.raw(), { common: this.config.common })
)
await this.chain.blockchain.initPromise
for (let i = 0; i < blocks.length; i++) {
Member

This works if we only download blocks from a single peer (which seems to be the case). I don't know if it is compatible when we switch to using multiple peers. It is OK for now, but we should make this compatible with multiple peers ASAP. What we probably want is to call a VM update method: this keeps running blocks until no new blocks exist. If this method is already running, return early; otherwise execute it.
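A sketch of such an update method with a simple re-entrancy guard (the class and member names are made up for illustration):

```ts
import VM from '@ethereumjs/vm'

// Sketch: run all blocks the VM hasn't executed yet; if an update is already
// in progress, return early instead of starting an overlapping run.
class VmUpdater {
  private running = false
  constructor(private vm: VM) {}

  async update(): Promise<void> {
    if (this.running) return // an update is already executing blocks
    this.running = true
    try {
      await this.vm.runBlockchain() // runs until no unexecuted blocks remain
    } finally {
      this.running = false
    }
  }
}
```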

Member Author

I am not sure if parallel execution should be prioritized that highly; from the runs I've done it seems to me that VM execution (and not block download) is our bottleneck by a large margin.

Member

Yeah, giving this some thought, you are right, parallel peers should not have such a high priority.

protected pool: PeerPool

protected first: BN
protected count: BN
Member

This makes the Fetcher only compatible with Block/BlockHeader downloads. We probably want to extend Fetcher in the future to also support downloading state types, which is not directly supported if we encapsulate Fetcher like this. We will then have to refactor this out later; I am OK with it for now (I will probably refactor it out in #1023, where I already moved some methods around).

this.timeout = options.timeout ?? 8000
this.interval = options.interval ?? 1000
this.banTime = options.banTime ?? 60000
this.maxQueue = options.maxQueue ?? 16
this.maxPerRequest = options.maxPerRequest ?? 128
this.maxPerRequest = options.maxPerRequest ?? 25
Member

Any reason why we change this to 25?

Member Author

Yes, since VM execution makes everything substantially slower, I reduced this to get chunks which terminate in a reasonable timeframe (something under 1 minute).

@@ -11,7 +11,7 @@ export interface HeaderFetcherOptions extends BlockFetcherOptions {
* Implements an les/1 based header fetcher
* @memberof module:sync/fetcher
*/
export class HeaderFetcher extends BlockFetcher {
export class HeaderFetcher extends Fetcher {
Member

Yep, this makes a lot of sense. I did something similar in #1023; it doesn't make a lot of sense that HeaderFetcher builds upon BlockFetcher. (In fact, it overrides all methods except tasks.)

trie,
})

this.vm = new VM({
Member

I don't think I like that if we use BlockFetcher, we also require that we start running blocks. This would probably lead to problems when we implement Beam Sync. I think it makes more sense in FullSync.

What we could do to ensure we only run one block at a time is to add an option in VM.runBlockchain which describes how many blocks we want to run (default: all), which sets maxBlocks of blockchain.iterator. When we putBlocks, in the Chain of Client, we can attach to the updated event and then start running these new blocks one-by-one.
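Roughly like this (a sketch only: the exact runBlockchain signature with a block limit and the getHead() return shape are assumptions; the updated event on the client Chain is the one mentioned above):

```ts
// Sketch: decouple execution from the BlockFetcher and run newly stored
// blocks one-by-one whenever the client Chain signals an update.
chain.on('updated', async () => {
  let headMoved = true
  while (headMoved) {
    const before = (await chain.blockchain.getHead()).hash()
    await vm.runBlockchain(chain.blockchain, 1) // limit to one block per call
    const after = (await chain.blockchain.getHead()).hash()
    headMoved = !after.equals(before)
  }
})
```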

Member Author

The reason why I did this is that I wanted to have this as close as possible to the blockchain.putBlock() call (I even thought that the VM execution might actually be best placed in the Blockchain class itself (yes - I know - that atm this would cause a circular dependency)). I think having this connected by an event (updated) is not enough; this needs to be connected in both directions. So if a VM execution fails, this should also stop block processing, otherwise it is a large waste of resources if blocks are continually put in the blockchain after some prior block failed in the VM.

I haven't thought about the beam sync implications though.

Would it be an option to leave this like this for now and keep it open for refactoring later? I am just happy that this is working right now and my fear is that we bring this back into a non-working state or at least introduce new problems if we move things here (you can prove me wrong though with a PR here).

Member Author

(so please always make sure via mainnet test runs that both initial sync from 0 and continued sync from blocks already stored are still working, including VM execution 😁 )

Member

Well, one thing we'd need is a lot of extra tests. Since we currently don't ban a peer if they feed us wrong blocks, the current (bad) peer is still the "best peer" and will thus be used as the (only) peer to sync these (bad) blocks. So I think it would keep re-running these bad blocks. It is a good point though, I have not thought about the mechanism for what to do if peers feed us bad blocks. You are right that there needs to be a two-way direction here: we want the VM to report bad peers, but then we'd indeed need to know which peer delivered that block.

@jochem-brouwer (Member)

I will continue here a bit (but will create a new branch) to propose some changes.

blocks = blocks.map((b: Block) =>
Block.fromValuesArray(b.raw(), { common: this.config.common })
)
await this.chain.blockchain.initPromise
Member

Note: internally, in blockchain, initPromise is always awaited on methods where this is necessary.

vm: add maxBlocks option to runBlockchain
@jochem-brouwer (Member)

Just checked this out, this is extremely cool 😄

@holgerd77 (Member Author)

The speed increase is due to the removed block checkpointing in the VM for blocks with zero transactions (see other messages) and has nothing to do with the maxPerRequest reduction. 😀

@holgerd77 (Member Author)

(might also be a hint that there might be something very wrong with the current checkpointing implementation in the MPT. I mean, creating a checkpoint on such a still tiny trie along these first blocks just can't take so much time. Likely something for a closer look early next year or so 🙂)

@holgerd77 (Member Author)

@jochem-brouwer ah, and just another updated note: it seems we were discussing two things at once here, or rather, you were referring to the larger maxPerRequest. When this was higher, there was no VM speed decrease or anything; there were just various additional (mostly peer-connectivity related) error messages and a sync was more or less not happening (I think I got 1 successful run out of 10 or so). So - yes - at some point this might also be worth investigating further to see why this is happening. The things you stated (pending promises) might play a role here I guess, but I think you have a much better feeling for real-world behavior on stuff like that.

Note that maxPerRequest refers to the number of blocks to download though and has nothing to do with the peer pool size (these kinds of constant parameters should be named more clearly; here e.g. maxBlocksPerRequest would already be a large improvement, something we should likely always do when we stumble upon ambiguous naming).

@holgerd77 force-pushed the client-add-vm-execution branch from 6bb5030 to 4000529 on January 1, 2021 16:38
@holgerd77 changed the title from "Client: add VM execution" to "Client: add VM execution (old)" on Jan 2, 2021
@holgerd77 (Member Author)

Continued in #1028

@holgerd77 closed this on Jan 2, 2021
@holgerd77 deleted the client-add-vm-execution branch on January 2, 2021 16:25