Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document for general test format #39

Merged
merged 2 commits into from
Oct 17, 2018
Merged

Document for general test format #39

merged 2 commits into from
Oct 17, 2018

Conversation

djrtwo
Copy link
Contributor

@djrtwo djrtwo commented Oct 4, 2018

Added test-format.md to describe the high level specification that all YAML test documents need to conform to. Once this is approved, I will add our current proposed test suites (ssz, chain tests, and shuffle) as separate documents under specs/test-suites

Any and all feedback welcome.

CC: @paulhauner, @mratsim

@djrtwo
Copy link
Contributor Author

djrtwo commented Oct 4, 2018

Note: I stole this sample test suite and test vectors from @paulhauner's https://notes.ethereum.org/n7fyPi4cR-Gg9Ypq7ylTrQ?view due to it being the simplest test suite (in terms of number of fields) that we've proposed so far

@hwwhww hwwhww added the general:RFC Request for Comments label Oct 4, 2018
@paulhauner
Copy link
Contributor

paulhauner commented Oct 4, 2018

This looks good to me. The only thing I can think of is giving some guidance on how to deal with forks and versioning. (Note: I don't know how this works in present Ethereum, so happy to ditch this if there's an already established process).

Firstly, I'm assuming that whenever there's a fork all the specs need to change their fork to match the name of the new fork. This means that a single branch of the eth2.0-specs repo only deals with one single fork.

I propose we use the Semantic Versioning 2.0 MAJOR.MINOR.PATCH format like so:

  • PATCH: fixed some small bug in the tests. E.g. typo in a description.
  • MINOR: add a new test of the same structure. E.g. some edge case wasn't previously tested and it was added without changing the test_cases structure.
  • MAJOR: a new test requires the test_cases structure to change. E.g. input variable wasn't considered and the test structure needs to change.

Whenever there's a fork, the versioning resets but still communicates the stability of the test vectors. Consider the case where there's a fork that doesn't actually affect the test vectors, we can use the following guide:

  • Alpha 3.2.5 -> Beta 1.0.0
  • Alpha 0.3.2 -> Beta 0.1.0
  • Alpha 0.0.3 -> Beta 0.0.1

If the fork destroys the test cases and we need to start at zero again, it's fine to go from 1.0.0 to 0.0.1.

Thoughts?

@mratsim
Copy link
Contributor

mratsim commented Oct 4, 2018

Assuming you are talking about the hard forks once Shasper is released, we shouldn't nuke the specs and tests once a new one is released.

Either we organize in a functional manner like https://github.com/ethereum/tests:

.
├── ABITests
│   └── basic_abi_tests.json
├── BasicTests
│   ├── README.md
│   ├── blockgenesistest.json
│   ├── crypto.json
│   ├── difficulty.json
│   ├── difficultyByzantium.json
│   ├── difficultyCustomHomestead.json
│   ├── difficultyCustomMainNetwork.json
│   ├── difficultyFrontier.json
│   ├── difficultyHomestead.json
│   ├── difficultyMainNetwork.json
│   ├── difficultyMorden.json
│   ├── difficultyOlimpic.json
│   ├── difficultyRopsten.json
│   ├── genesishashestest.json
│   ├── hexencodetest.json
│   ├── keyaddrtest.json
│   └── txtest.json
├── BlockchainTests
│   ├── GeneralStateTests
│   ├── TransitionTests
│   ├── bcBlockGasLimitTest
│   ├── bcExploitTest
│   ├── bcForgedTest
│   ├── bcForkStressTest
│   ├── bcGasPricerTest
│   ├── bcInvalidHeaderTest
│   ├── bcMultiChainTest
│   ├── bcRandomBlockhashTest
│   ├── bcStateTests
│   ├── bcTotalDifficultyTest
│   ├── bcUncleHeaderValidity
│   ├── bcUncleTest
│   ├── bcValidBlockTest
│   └── bcWalletTest
├── GeneralStateTests
│   ├── stArgsZeroOneBalance
│   ├── stAttackTest
│   ├── stBadOpcode
│   ├── stBugs
│   ├── stCallCodes
│   ├── stCallCreateCallCodeTest
│   ├── stCallDelegateCodesCallCodeHomestead
│   ├── stCallDelegateCodesHomestead
│   ├── stChangedEIP150
│   ├── stCodeCopyTest
│   ├── stCodeSizeLimit
│   ├── stCreate2
│   ├── stCreateTest
│   ├── stDelegatecallTestHomestead
│   ├── stEIP150Specific
│   ├── stEIP150singleCodeGasPrices
│   ├── stEIP158Specific
│   ├── stEWASMTests
│   ├── stExample
│   ├── stHomesteadSpecific
│   ├── stInitCodeTest
│   ├── stLogTests
│   ├── stMemExpandingEIP150Calls
...

Or we use a per-fork structure (names courtesy of the Ubuntu Name Generator)

.
├── specs
│   ├── 20190601_IceAge
│   ├── 20200101_Frontier
│   ├── 20200601_ErgonomicEmu
│   └── 20210101_OrthodoxOriole
└── tests
    ├── 20190601_IceAge
    ├── 20200101_Frontier
    ├── 20200601_ErgonomicEmu
    └── 20210101_OrthodoxOriole

Now versioning is also important because it's much easier to check generically in for a SSZ deserializer for example.

One thing regarding specs, when implementing Ethereum 1.0 I find it very hard to exhume the old gas costs of the earlier forks because I couldn't find an old Yellow Paper and EIPs are tracking history of changes but there was no way to get a snapshot of the state of the specs at a certain point of time. So client implementers that are catching up have to check Geth and Parity codebase.

@djrtwo
Copy link
Contributor Author

djrtwo commented Oct 4, 2018

All client codebases have to handle all forks and all the tests remain! yay legacy code.

To sync the chain after a fork, your codebase must remember how to process blocks before the fork and how to process blocks after the fork.

py-evm handles this pretty cleanly here
And then they instantiate different chains (like main net or test nets) like this defining fork block constants.

@paulhauner
Copy link
Contributor

paulhauner commented Oct 4, 2018 via email

@mkalinin
Copy link
Contributor

mkalinin commented Oct 5, 2018

Either we organize in a functional manner ... Or we use a per-fork structure

Recalling experience gained from eth1.0 test suite.
Previously, there was a per-fork structure which has been changed to the structure we have now (functinal manner):

.
├── tests 
    ├── GeneralStateTests
        ├── stCreate2
        ├── stShift
        ├── stZeroKnowledge
            ├── ecmul_1-2_2_28000_128.json
            ├── ecmul_1-2_2_28000_96.json
            ├── ecmul_1-2_5617_28000_128.json
                ├── IN
                ├── OUT_Homestead
                ├── OUT_Byzantium
                ├── OUT_Constantinople

With good implementation design, this structure makes it trivial to enable new forks on client side. It's just adding a new line to the list of forks. While per-fork manner would require to handle new folders each time the fork is added.

As I understand this structure is suitable for test writers as well. @winsvega might want to comment on that as he works on tests from the very beginning of eth1.0.

A bit specific thing, but @holiman has a proposal on improving #ethereum/tests repository: https://gist.github.com/holiman/fdec3547f2b104803abbd2c6e751a8e7#proposed-solution.
It's a solution for getting rid of PRs like that ethereum/tests#511. It might be early to think about test generators but it worth keeping this problem in mind for the future.

@winsvega
Copy link

winsvega commented Oct 6, 2018

The problem now is that if you want to generate new fork you have to refresh a file with all previous fork tests. So a solution would be to use 1 source file -> generates many test files.

Although it wont really work if we keep filler hash checking.

@mratsim
Copy link
Contributor

mratsim commented Oct 9, 2018

Seems like we are rediscovering the Expression Problem which this blog post summarizes quite well for sum types vs interfaces:

With sum types, it’s easy to add uses but hard to add cases.
With interfaces, it’s easy to add cases but hard to add uses.

In our case:

  • With functional organization, it's easy to add new tests but it's add to hard forks in the test suite (need to update all tests with the new case)
  • With per-fork, it's easy to add forks but hard to add tests (need to add one in all forks)

@mkalinin
Copy link
Contributor

With per-fork, it's easy to add forks but hard to add tests (need to add one in all forks)

My hope is that in this case hardness of adding a test could be mitigated somehow. I am not sure what filler hash checking does exactly mean but solution with one source file that produces several tests could do the trick.

IMHO, it's worth betting on some difficulties during test creation if this is a prize of making implementation updates trivial. Cause, there gonna be only one test repository that affects several implementations.

@winsvega
Copy link

filler hash checks that test been generated with the latest filler version

@paulhauner
Copy link
Contributor

Sorry @winsvega, I'm not sure what a "filler" is.

@winsvega
Copy link

filler aka testFiller is a test source file.
each test source file generate a complete test. right now all tests for all forks are generated from testFiller into one file. and that file has a hash of testFiller that generated it. so in case changes are done and tests are not updated there will be an error.

if we put generated tests into separate files (in order not to touch it with each test updated on each new fork) the fillerHash field will become meaningless. I guess I will have to change it into expectSection hash or remove at all.

@djrtwo
Copy link
Contributor Author

djrtwo commented Oct 16, 2018

As discussed in the call, we are merging this document in as tentative so that we can use it for simple tests across clients -- SSZ, shuffle, etc.

We will iterate from there with the hope that we lock down some more specifications on overall structure and strategy of testing around devcon.

@djrtwo djrtwo merged commit 0e86ab8 into master Oct 17, 2018
@djrtwo djrtwo deleted the test-format branch January 1, 2019 22:08
hwwhww pushed a commit that referenced this pull request Aug 17, 2020
Also update binary output due to metadata change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:RFC Request for Comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants