Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd: refactor evm tool #30633

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

lightclient
Copy link
Member

@lightclient lightclient commented Oct 19, 2024

Had the idea for a while to refactor cmd/evm. It feels like a lot of it has grown organically to meet demands -> test running, test fill, eof validation, etc.

original motivation Unfortunately this had led to several subcommands with little relation beyond a general relationship to the EVM. This makes using the `evm` command confusing. Both `evm run` and `evm t8n` specify their own set of tracing flags ([1][runner-trace], [2][t8n-trace]). Flags like `--input` are reused for different things between the test runners and disassemblers. `eofparse` adds `--hex` which is basically `--input` but for it's own purposes.

There isn't a coherent story for the set of tools. I think we should either spend a fair bit of work ensuring the flags are meaningful across all subcommands and avoid duplicate behavior, or some of the subcommands should live in their own package. My preference is the latter, because it more accurately represents today's maintenance: t8n is the highest prio and has many external contributions from the testing team. Enforcing and ensuring their changes make sense across evm will slow down everyone involved. The interface for t8n is the most robust and it makes sense for it to be it's own package. eofparse is much newer, but similarly seems like an unrelated project that can live on it's own. Which leaves us with the runner commands and the compilation tools. I opted to delete the compilation tools since they are used much any more and it was suggested we delete them last year.

Updated this PR to focus mostly on harmonizing the staterunner / blockrunner while also tidying up in cmd/evm/main.go where I could.

In summary:

  • move t8n, t9n, and b11r to their own command cmd/t8ntool
  • delete evm compile and evm disasm
  • move evm eofparse to cmd/eofparse
  • Added some nice stuff in c594c87:
  • unify staterunner and blockrunner CLI flags, especially around tracing
  • added support for struct logger or json logging (although having issue insertChain panics when struct tracer is set #30658)
  • new --cross-check flag to validate the stateless witness collection / execution matches stateful
  • adds support for tracing the stateless execution when a tracer is set (to more easily debug differences)
  • --human for more readable test summary
  • directory or file input, so if you pass tests/spec-tests/fixtures/blockchain_tests it will execute all blockchain tests

--

example

$ go run ./cmd/evm --verbosity=0 blocktest --human --run="zero_inputs" ./tests/spec-tests/fixtures/blockchain_tests/cancun
[PASS] eip5656_mcopy test_valid_mcopy_operations, param=zero_inputs
--
1 tests passed, 0 tests failed.

--

@holiman
Copy link
Contributor

holiman commented Oct 20, 2024

  • move t8n, t9n, and b11r to their own command cmd/t8ntool
  • move evm eofparse to cmd/eofparse

Both t8n and eofparse were their own cmds. But @fjl preferred to have it inside evm. As for me, I don't really have an opinion which is preferrable.

@holiman
Copy link
Contributor

holiman commented Oct 20, 2024

One upside in having it inside evm is that it's always present whenever evm is. We don't need to fiddle with build scripts and makefiles to add or remove targets.

Copy link
Member

@MariusVanDerWijden MariusVanDerWijden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes LGTM

I personally prefer to have these as separate commands and I don't think we should be concerned about the build scripts etc. Eofdump has as very small audience (Martin and I), same with t8ntool (Mario and Matt) so the only really user-facing function is evm state-test and evm blocktest that are used by hive, testing and fuzzing. Everything else is functionality that we use internally that is not really intended for a wider audience imo

@lightclient
Copy link
Member Author

@fjl wdyt? I don't see much advantage of only needing one binary everywhere. There aren't many consumers and those consumers are somewhat independent (as Marius says). The advantage is that the tools can be simpler, less thought needed in organization since each tool only does a few related things, and different people can own different packages.

@lightclient
Copy link
Member Author

lightclient commented Oct 23, 2024

Added some nice stuff in c594c87:

  • unify staterunner and blockrunner CLI flags, especially around tracing
  • added support for struct logger or json logging (although having issue insertChain panics when struct tracer is set #30658)
  • new --cross-check flag to validate the stateless witness collection / execution matches stateful
  • adds support for tracing the stateless execution when a tracer is set (to more easily debug differences)
  • --human for more readable test summary
  • directory or file input, so if you pass tests/spec-tests/fixtures/blockchain_tests it will execute all blockchain tests

--

example

$ go run ./cmd/evm --verbosity=0 blocktest --human --run="zero_inputs" ./tests/spec-tests/fixtures/blockchain_tests/cancun
[PASS] eip5656_mcopy test_valid_mcopy_operations, param=zero_inputs
--
1 tests passed, 0 tests failed.

@MariusVanDerWijden
Copy link
Member

directory or file input, so if you pass tests/spec-tests/fixtures/blockchain_tests it will execute all blockchain tests

ahh great!

@lightclient
Copy link
Member Author

From triage: seems like there is desire to not split the evm command. I'll combine them again with the other refactors I did.

@@ -995,7 +996,7 @@ func (api *ConsensusAPI) executeStatelessPayload(params engine.ExecutableData, v
api.lastNewPayloadLock.Unlock()

log.Trace("Executing block statelessly", "number", block.Number(), "hash", params.BlockHash)
stateRoot, receiptRoot, err := core.ExecuteStateless(api.eth.BlockChain().Config(), block, witness)
stateRoot, receiptRoot, err := core.ExecuteStateless(api.eth.BlockChain().Config(), vm.Config{}, block, witness)
Copy link
Contributor

@mdehoog mdehoog Nov 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason not to pass *api.eth.BlockChain().GetVMConfig() here? or at least the live logger?

@lightclient
Copy link
Member Author

I'll combine them again with the other refactors I did.

still haven't done this, will get to it soon

@lightclient lightclient force-pushed the cmd-evm2 branch 4 times, most recently from 0cb4c48 to 270925f Compare November 21, 2024 07:15
@lightclient
Copy link
Member Author

Okay this should be good to go. Have reverted the refactoring out the subcommands, but retained the other improvements. PTAL!

if state.StateDB != nil {
root = state.StateDB.IntermediateRoot(false)
result.Root = &root
// Dump any state to aid debugging.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you just drop the fmt.Fprintf(os.Stderr - output of the stateRoot?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state root is recorded in the result and output via report(..).

cmd/evm/main.go Outdated Show resolved Hide resolved
@holiman
Copy link
Contributor

holiman commented Nov 21, 2024

Some diffs when I run the reference tests in goevmlab. github.com/holiman/goevmlab/evms/testdata, using the following changed params:

diff --git a/evms/testdata/run.sh b/evms/testdata/run.sh
index 4c38810..eee91c2 100755
--- a/evms/testdata/run.sh
+++ b/evms/testdata/run.sh
@@ -15,7 +15,8 @@ if [[ -n "$evm" ]]; then
     cd ./cases
     # The traces
     for i in *.json; do
-        $evm --json --nomemory --noreturndata statetest $i \
+#        $evm --json --nomemory --noreturndata statetest $i \
+        $evm statetest --trace --trace.format=json --trace.nomemory --trace.noreturndata $i \
          2>../traces/$i.geth.stderr.txt \
          1>../traces/$i.geth.stdout.txt
     done

Screenshot from 2024-11-21 10-27-37

@lightclient
Copy link
Member Author

I'm not sure why output step might be missing. That should come from the tracer and the tracer is configured the same before and after this PR?

I can add the Fprintln back in if it's important, but I was thinking it would be good to move all this data into a full json object so we can more easily interpret and access it.

You can see it is written to stdout after the execution completes:

[
  {
    "name": "tests/prague/eip7702_set_code_tx/test_set_code_txs.py::test_set_code_call_set_code[fork_Prague-call_opcode_CALL-evm_code_type_LEGACY-state_test-value_1]",
    "pass": true,
    "stateRoot": "0xa489cbd2d4e37e8b6b27fa5a66242d3732fbe346aea1aa011d3aa4545765ec13",
    "fork": "Prague"
  }
]

When there is a full dump, it is also there:

[
  {
    "name": "tests/prague/eip7702_set_code_tx/test_set_code_txs.py::test_set_code_call_set_code[fork_Prague-call_opcode_CALL-evm_code_type_LEGACY-state_test-value_0]",
    "pass": true,
    "stateRoot": "0x21331f3c18a3cf737d68e396e32bfef23a8f7cec32971f0265850453b82afe9f",
    "fork": "Prague",
    "state": {
      "root": "21331f3c18a3cf737d68e396e32bfef23a8f7cec32971f0265850453b82afe9f",
      "accounts": {
        "0x0000000000000000000000000000000000001000": {
          "balance": "0",
          "nonce": 1,
          "root": "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421",
          "codeHash": "0xe90b835ed0c9ae43182bcfe017e4e9776a804d20e0bf7061ae73758bcc5e9cde",
          "code": "0x60006000600060006000738a0a19589531694250d570040a0c4b74576919b85af1600055600160015500",
          "address": "0x0000000000000000000000000000000000001000",
          "key": "0x1d7dcb6a0ce5227c5379fc5b0e004561d7833b063355f69bfea3178f08fbaab4"
        }
      }
    }
  }
]

@holiman
Copy link
Contributor

holiman commented Nov 21, 2024

You can see it is written to stdout after the execution completes:

Ah, but I'm reading jsonl items from stderr, so that's where I need it

@lightclient
Copy link
Member Author

Added back the fprint for the state root 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants