♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel #901

raxhvl · 2024-10-18T13:06:36Z

🗒️ Description

This PR ensures data from gentest to generated tests reaches in a type-safe manner.

The Gentest module generates a test script in three steps:

The CLI entrypoint function captures the user input required for generating the test.
This input is used to generate additional context required for creating tests.
Finally, using this context, a formatted pytest source code is generated and written to disk.

The process of generating context and rendering templates will share the same logic across different kinds of tests. These are created as modules: (cli, test_context_provider, and source_code_generator).

graph TD;
    A[CLI Entry Point] -->|Captures User Input| B[Test Context Provider]
    B -->|Generates Context| C[Source Code Generator]
    C -->|Generates Formatted Pytest Code| D[Disk]

I am using the same data type when passing data from the provider to the generated source code. This keeps the behavior predictable; Pydantic will raise errors if the RPC call returns unexpected values even before we run the test. I'm leveraging __repr__ values to print the data into the source code.

Finally, the code is formatted using the same formatter (Black) used elsewhere. The existing formatting configurations are applied, so any change in configuration should automatically reflect in the files generated as well. Formatting would become essential when reading contract bytecode, especially with the plans to convert bytecode to python objects (see #862).

This effort makes writing new kinds of gentests easier, requiring only the creation of new test providers and templates.

🔗 Related Issues

closes #859

✅ Checklist

All: Set appropriate labels for the changes.
All: Considered squashing commits to improve commit history.
All: Added an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
Tests: All converted JSON/YML tests from ethereum/tests have been added to converted-ethereum-tests.txt.
Tests: A PR with removal of converted JSON/YML tests from ethereum/tests have been opened.
Tests: Included the type and version of evm t8n tool used to locally execute test cases: e.g., ref with commit hash or geth 1.13.1-stable-3f40e65.
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.

danceratopz

I like it a lot, really clean 🌟 Thanks so much for all the thought you're putting into this.

One comment about the name of the jinja2 filter (cool feature, btw). Feel free to either change it and merge, or simply merge as-is. Thanks!

Also like the idea of applying black formatting on the output.

src/cli/gentest/source_code_generator.py

src/cli/gentest/templates/blockchain_test/transaction.py.j2

src/cli/gentest/cli.py

raxhvl · 2024-10-23T06:15:47Z

I think the output format of the generated test could be made more human readable. Which is why the PR is still in draft.

For example, we should ensure addresses are represented as hex and large numbers like v,r,s to be perhaps literal hex numbers. So that it is consistent with manually written tests. How about:

Field	Representation in Tests
Address	hex string
Hash	hex string
Number	hex number literal
Code	Opcode class?

I think the best way to achieve this is to override __repr__ in individual base types. Perhaps like this:

class HexNumber(Number):
    """
    Class that helps represent an hexadecimal numbers in tests.
    """

+   def __repr__(self):
+       return f"0x{123}"

    def __str__(self) -> str:
        """
        Returns the string representation of the number.
        """
        return self.hex()

I would like you to weigh in on the effect of this across the codebase.

danceratopz · 2024-10-23T08:41:30Z

I think the output format of the generated test could be made more human readable. Which is why the PR is still in draft.

For example, we should ensure addresses are represented as hex and large numbers like v,r,s to be perhaps literal hex numbers. So that it is consistent with manually written tests. How about:
Field Representation in Tests
Address hex string
Hash hex string
Number hex number literal
Code Opcode class?

I think the best way to achieve this is to override __repr__ in individual base types. Perhaps like this:
class HexNumber(Number):
    """
    Class that helps represent an hexadecimal numbers in tests.
    """

+   def __repr__(self):
+       return f"0x{123}"

    def __str__(self) -> str:
        """
        Returns the string representation of the number.
        """
        return self.hex()
I would like you to weigh in on the effect of this across the codebase.

I'll set off with a broader topic, so we have the full picture. Bear with me!

All right, so getting nice representations would also really help us with the Test Case Reference docs, see the bytes objects in the test case parameters tables, for example here: https://ethereum.github.io/execution-spec-tests/main/tests/prague/eip6110_deposits/test_deposits/test_deposit.html

For the objects defined and only used in the test cases (which aren't part of EEST's types libraries), adding a __repr__ would help and seems like the way to go.

For the objects that we do define in our test libraries, we get nice representations (of most objects, at least) when we write them as string to the fixture JSON files. And, instead of just adding __repr__ to all these classes, perhaps we can re-use the pydantic magic. I will try to get back to this later today, but if you'd like to play around in the meantime, try this:

Get some fixtures:

uv run fill tests/istanbul/

Then play around in ipython (https://ethereum.github.io/execution-spec-tests/main/dev/interactive_usage/)

uvx  --with-editable . ipython

for example:

import rich

from ethereum_test_fixtures.file import Fixtures
from ethereum_test_base_types import to_json

fixtures = Fixtures.from_file("fixtures/state_tests/istanbul/eip1344_chainid/chainid/chainid.json", fixture_format=None)
fixtures.keys()
fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Istanbul-state_test]']
fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Istanbul-state_test]'].env

rich.print(fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Shanghai-state_test]'].env)

to_json(fixtures['tests/istanbul/eip1344_chainid/test_chainid.py::test_chainid[fork_Shanghai-state_test]'].env.fee_recipient)

to_json uses pydantic's model_dump():

execution-spec-tests/src/ethereum_test_base_types/json.py

Lines 10 to 21 in 7758f37

    
           def to_json( 
        
               input: BaseModel | RootModel | AnyStr | List[BaseModel | RootModel | AnyStr], 
        
           ) -> Any: 
        
               """ 
        
               Converts a model to its json data representation. 
        
               """ 
        
               if isinstance(input, list): 
        
                   return [to_json(item) for item in input] 
        
               elif isinstance(input, (BaseModel, RootModel)): 
        
                   return input.model_dump(mode="json", by_alias=True, exclude_none=True) 
        
               else: 
        
                   return str(input)

Just be aware that fill serializes from the test spec format (e.g. StateTest src/ethereum_test_specs/state.py to (state) Fixture src/ethereum_test_fixtures/state.py) and when we load the fixture from file above we're not creating a StateTest, we're creating a state test Fixture object. Quite a lot of magic happens when we serialize, for example, removing objects that are None; removing objects don't belong in state test fixture, but do in a blockchain test fixture.

I can take a deeper look myself and happy to discuss to brainstorm, but if you have time and you're interested, have play around!

raxhvl · 2024-10-23T13:15:45Z

I get your point. The to_json method respects Pydantic attributes (like alias) and filters out None values, while your overrides of __str__ produce nicely formatted JSON.

However, to_json outputs JSON, not valid Python code like __repr__. We need a way to combine the best of both without duplicating effort.

Let me know if my understanding is correct. Meanwhile, let me think this through.

danceratopz · 2024-10-23T15:15:32Z

I get your point. The to_json method respects Pydantic attributes (like alias) and filters out None values, while your overrides of __str__ produce nicely formatted JSON.

However, to_json outputs JSON, not valid Python code like __repr__. We need a way to combine the best of both without duplicating effort.

Let me know if my understanding is correct. Meanwhile, let me think this through.

Yup, that's right.

Yes, unfortunately pydantic doesn't deserialize to Python. Calling to_json on each individual field (the only way I can think of doing it at the mo) makes little sense. So, as we'll only need __repr__() for these base classes (I think), it'll have minimal impact to add them.

Perhaps @marioevz can chime in, if he has a better idea? 🙏

Sorry for the detour and that I couldn't give you a better answer earlier @raxhvl.

raxhvl · 2024-10-29T06:09:32Z

I have been marinating on this. I believe your intuition is correct: We should implement a unified serialization pipeline. This approach not only eliminates duplicate (tedious) efforts but also ensures that the framework produces consistent data outputs, whether for fixtures or gentests. Fundamentally, gentest should be as close to manual tests as possible.

Our goal is to serialize a Pydantic model to python code. After trying out few things, I have a solution.

We can override __repr__args__ in the root class, which feeds into Pydantic's __repr__. This will ensure that JSON serialization and __repr__ are consistent.

Here is a full example:

from pydantic import BaseModel, Field
from typing import List


class RecipeBaseModel(BaseModel):
   
    def __repr_args__(self):
       # Skip tags
        attrs_names = self.model_dump(mode="json", exclude={"tags"}).keys()
        attrs = ((s, getattr(self, s)) for s in attrs_names)
        return [(a, v) for a, v in attrs if v is not None]


# Example child model
class Ingredient(RecipeBaseModel):
    name: str
    quantity: str
    tags: List[str]


# Example nested child model
class Recipe(RecipeBaseModel):
    title: str
    ingredients: List[Ingredient]


# Example usage
ingredient1 = Ingredient(name="Tomato", quantity="2 cups", tags=["vegetable", "fresh"])
ingredient2 = Ingredient(name="Basil", quantity="1 bunch", tags=["herb", "fresh"])
recipe = Recipe(title="Tomato Basil Salad", ingredients=[ingredient1, ingredient2])

print(repr(ingredient1))  # Ingredient(name='Tomato', quantity='2 cups')
print(
    repr(recipe)
)  # Recipe(title='Tomato Basil Salad', ingredients=[Ingredient(name='Tomato', quantity='2 cups'), Ingredient(name='Basil', quantity='1 bunch')])

I know this will take some effort to implement correctly. But I think incentive to unify serialization will go down as the code diverges because of the refactoring involved.

Let me know if we should pursue this or override __repr__ everywhere for now.

danceratopz · 2024-10-30T09:34:35Z

I have been marinating on this. I believe your intuition is correct: We should implement a unified serialization pipeline. This approach not only eliminates duplicate (tedious) efforts but also ensures that the framework produces consistent data outputs, whether for fixtures or gentests. Fundamentally, gentest should be as close to manual tests as possible.

Our goal is to serialize a Pydantic model to python code. After trying out few things, I have a solution.

We can override __repr__args__ in the root class, which feeds into Pydantic's __repr__. This will ensure that JSON serialization and __repr__ are consistent.

Here is a full example:
from pydantic import BaseModel, Field
from typing import List


class RecipeBaseModel(BaseModel):
   
    def __repr_args__(self):
       # Skip tags
        attrs_names = self.model_dump(mode="json", exclude={"tags"}).keys()
        attrs = ((s, getattr(self, s)) for s in attrs_names)
        return [(a, v) for a, v in attrs if v is not None]


# Example child model
class Ingredient(RecipeBaseModel):
    name: str
    quantity: str
    tags: List[str]


# Example nested child model
class Recipe(RecipeBaseModel):
    title: str
    ingredients: List[Ingredient]


# Example usage
ingredient1 = Ingredient(name="Tomato", quantity="2 cups", tags=["vegetable", "fresh"])
ingredient2 = Ingredient(name="Basil", quantity="1 bunch", tags=["herb", "fresh"])
recipe = Recipe(title="Tomato Basil Salad", ingredients=[ingredient1, ingredient2])

print(repr(ingredient1))  # Ingredient(name='Tomato', quantity='2 cups')
print(
    repr(recipe)
)  # Recipe(title='Tomato Basil Salad', ingredients=[Ingredient(name='Tomato', quantity='2 cups'), Ingredient(name='Basil', quantity='1 bunch')])
I know this will take some effort to implement correctly. But I think incentive to unify serialization will go down as the code diverges because of the refactoring involved.

Let me know if we should pursue this or override __repr__ everywhere for now.

Thanks @raxhvl for taking the time to look so deeply into this! This looks like a great approach to unify our serialization and think we can start adding __repr__args__ to our types (or to a new class that the types inherit from).

If I understand correctly, the code paths for serializing the JSON fixture files would remain unchanged? But now we use json.dumps() within the __repr_args__ method to achieve compatibility? In this case, this doesn't seem like a huge change to the codebase (and certainly not a breaking change). But perhaps I've misunderstood how intrusive this is. If I'm right, you could try adding these methods upfront to achieve your goal for gentest now which would allow us to better evaluate the change.

danceratopz · 2024-10-30T18:17:21Z

As we discussed, whenever you have time, feel free to just go ahead and add either:

__repr_args__ to a new base class in https://github.com/ethereum/execution-spec-tests/blob/9a109298161db2cf5986bde0781a4d28d3ccf44f/src/ethereum_test_base_types/pydantic.py
Or to the classes you explicitly need for gentest.

As we discussed, doing 1. would be cleaner and probably easier, but 2.'s fine if you'd like to just get a proof of concept. :)

Thanks again for looking into this. Another advantage of unifying pydantic with this serialization to Python is that all the fields that are None will be correctly skipped in the generated test module. I'm referring to the None fields in your screenshot here:
https://github.com/ethereum/execution-spec-tests/blob/9a109298161db2cf5986bde0781a4d28d3ccf44f/src/ethereum_test_base_types/pydantic.py

raxhvl · 2024-11-02T06:37:00Z

The new serialization works well

I have added a new base class, here is how it fits in:

Warning

This is outdated. See updated implementation.

classDiagram
   BaseModel <|-- EthereumTestBaseModel
   EthereumTestBaseModel <|-- CopyValidateModel
   CopyValidateModel <|-- CamelModel

   class EthereumTestBaseModel{
    * serialize()
    * __repr_args__()
   }

The serialize method is a thin wrapper over model_dump which __repr_args__ consumes. The __repr__ logic is working fine. It ignores None fields, supports custom representation if required, and falls back to __str__ representation.

I tried to fill the newly generated test and it succeeds.

 uv run gentest 0xa41f343be7a150b740e5c939fa4d89f3a2850dbe21715df96b612fc20d1906be tests/paris/test_0xa41f.py

 uv run fill tests/paris/test_0xa41f.py                                                                      

=== 6 passed, 6 warnings in 5.21s ===

Should we update `to_json`?

I need your feedback on a change I would like to see but is not directly related to gentest. Currently model_dump is used both within the new serialize method and within to_json.

execution-spec-tests/src/ethereum_test_base_types/json.py

Line 19 in cfbfea9

return input.model_dump(mode="json", by_alias=True, exclude_none=True)

We could replace this with input.serialize() so that serialization is at one place but I'm not sure if all classes does infact inherrit EthereumTestBaseModel. So the code is now duplicated at two places.

We should NOT modify pre-state in the tests

This seems quite unusual to me. Ideally, the pre-state tracer should provide the state for all accounts prior to the transaction's execution.

execution-spec-tests/src/cli/gentest/test_providers.py

Lines 50 to 52 in cfbfea9

    
           if address == self.transaction.sender: 
        
               state_str += f"{pad}nonce={self.transaction.nonce},\n" 
        
           else:

While the sender's nonce does get updated, that's just one of several side effects resulting from the transaction. Changing the state within the test interferes with the EVM - a big red flag 🟥 . We shouldn't be updating the state ourselves; that's the EVM's responsibility. The test should focus solely on comparing the state transition against the specification.

danceratopz

Thanks @raxhvl, this is looking very clean!

Can you rebase on main, this will get some important CI fixes, that will help getting this merged more quickly :)

Few comments:

Yes, I think we should replace model_dump in to_json. You can verify the fixtures locally to ensure that this doesn't break anything by using the hasher cli. Fill the tests twice: Once with main and once with raxhvl:feat/gentest to separate, clean directories using --output=fixtures-raxhvl, for example. Use -n auto to enable parallelism when filling. Then:

uv run hasher fixtures-main/ > hasher-main.txt
uv run hasher fixtures-raxhv/l > hasher-raxhvl.txt
diff fixtures-main.txt hasher-raxhvl.txt

Completely agree that we shouldn't modify any pre-state. I'm not sure why this was originally added.
Side note: Formatting the generated code using black is a great idea, just be aware that black will get switched out to ruff soonish, see feat(all): add ruff as default linter and additional clean up #922. So don't invest too much more time with black - switching to ruff as it stands in this PR should be very easy.

I'd love to get @marioevz's feedback on this. If you'd like to do point 1. up front and check the fixtures, go for it and request a review from him please. If you'd like to get feedback first, we can request one now 🙂

raxhvl · 2024-11-04T12:41:09Z

Rebased on top of main

re: black to ruff migration.

The code uses cli to format code as long as ruff has a cli interface to format. The migration is trivial (<2 lines of code)

It would be nice if Mario reviews this once we are done.

marioevz

Amazing work so far, thanks!

Just a single comment (the other comment about the fork name is not really something that we need to do right now).

src/cli/gentest/templates/blockchain_test/transaction.py.j2

src/ethereum_test_base_types/pydantic.py

raxhvl · 2024-11-05T13:22:33Z

All models are now using the same serialization logic. We have two types of custom models: one for BaseModel and one for RootModel. The customization for both of these models is handled through a shared mixin.

classDiagram
   BaseModel <|-- EthereumTestBaseModel
   RootModel <|-- EthereumTestRootModel
   ModelCustomizationsMixin <|-- EthereumTestBaseModel
   ModelCustomizationsMixin <|-- EthereumTestRootModel

   class ModelCustomizationsMixin{
    serialize()
    __repr_args__()
   }

I have tested this thoroughly, but please review it with care as this change impacts EVERY Pydantic model in the codebase. I'm especially unsure of EthereumTestRootModel override.

danceratopz

Really impressive, great addition! Thanks again @raxhvl for taking the time to find such an elegant solution for this!

danceratopz · 2024-11-06T13:44:03Z

I have tested this thoroughly, but please review it with care as this change impacts EVERY Pydantic model in the codebase. I'm especially unsure of EthereumTestRootModel override.

Hey @raxhvl, in regards to this, it looks good to me, I've asked @marioevz if he'd also like to take a look. I diff'd the fixtures locally and got an exact match.

marioevz

Many thanks for putting so much thought and care into this!

danceratopz approved these changes Oct 22, 2024

View reviewed changes

danceratopz mentioned this pull request Oct 30, 2024

✨(testinit): add a simple (interactive?) cli that creates a new test module skeleton #926

Open

raxhvl marked this pull request as ready for review November 2, 2024 15:58

raxhvl requested a review from danceratopz November 2, 2024 15:58

danceratopz reviewed Nov 4, 2024

View reviewed changes

raxhvl added 5 commits November 4, 2024 13:23

✨ feat(gentest): Improved architecture

69e04ca

🥢 nit: Rename provider to context provider

300831d

🥢 nit: Simplfied docstring

47a8026

✨ feat: Custom repr

49abf54

🐞 fix: Quirk: prestate account nonce

ff2e56b

raxhvl force-pushed the feat/gentest branch from 7bc79b7 to ff2e56b Compare November 4, 2024 12:23

raxhvl force-pushed the feat/gentest branch from 82afb78 to ff2e56b Compare November 4, 2024 12:56

marioevz reviewed Nov 4, 2024

View reviewed changes

src/cli/gentest/templates/blockchain_test/transaction.py.j2 Show resolved Hide resolved

src/ethereum_test_base_types/pydantic.py Outdated Show resolved Hide resolved

🧹 chore: Unifiy json and python serialization

1fabe17

raxhvl force-pushed the feat/gentest branch from a7224ca to 1fabe17 Compare November 5, 2024 13:40

raxhvl requested review from danceratopz and marioevz November 5, 2024 13:54

raxhvl mentioned this pull request Nov 6, 2024

(gentest): MUST NOT modify pre-state in tests #941

Open

danceratopz approved these changes Nov 6, 2024

View reviewed changes

marioevz approved these changes Nov 6, 2024

View reviewed changes

marioevz changed the title ~~♻️ refactor(gentest): Improved architecture~~ ♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel Nov 6, 2024

marioevz merged commit 026af18 into ethereum:main Nov 6, 2024
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel #901

♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel #901

raxhvl commented Oct 18, 2024 •

edited

Loading

danceratopz left a comment

raxhvl commented Oct 23, 2024 •

edited

Loading

danceratopz commented Oct 23, 2024 •

edited

Loading

raxhvl commented Oct 23, 2024

danceratopz commented Oct 23, 2024

raxhvl commented Oct 29, 2024

danceratopz commented Oct 30, 2024

danceratopz commented Oct 30, 2024

raxhvl commented Nov 2, 2024 •

edited

Loading

danceratopz left a comment

raxhvl commented Nov 4, 2024 •

edited

Loading

marioevz left a comment

raxhvl commented Nov 5, 2024 •

edited

Loading

danceratopz left a comment

danceratopz commented Nov 6, 2024

marioevz left a comment

♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel #901

♻️ refactor(gentest,base_types): Improved architecture, implement EthereumTestBaseModel, EthereumTestRootModel #901

Conversation

raxhvl commented Oct 18, 2024 • edited Loading

🗒️ Description

🔗 Related Issues

✅ Checklist

danceratopz left a comment

Choose a reason for hiding this comment

raxhvl commented Oct 23, 2024 • edited Loading

danceratopz commented Oct 23, 2024 • edited Loading

raxhvl commented Oct 23, 2024

danceratopz commented Oct 23, 2024

raxhvl commented Oct 29, 2024

danceratopz commented Oct 30, 2024

danceratopz commented Oct 30, 2024

raxhvl commented Nov 2, 2024 • edited Loading

The new serialization works well

Should we update to_json?

We should NOT modify pre-state in the tests

danceratopz left a comment

Choose a reason for hiding this comment

raxhvl commented Nov 4, 2024 • edited Loading

marioevz left a comment

Choose a reason for hiding this comment

raxhvl commented Nov 5, 2024 • edited Loading

danceratopz left a comment

Choose a reason for hiding this comment

danceratopz commented Nov 6, 2024

marioevz left a comment

Choose a reason for hiding this comment

raxhvl commented Oct 18, 2024 •

edited

Loading

raxhvl commented Oct 23, 2024 •

edited

Loading

danceratopz commented Oct 23, 2024 •

edited

Loading

raxhvl commented Nov 2, 2024 •

edited

Loading

Should we update `to_json`?

raxhvl commented Nov 4, 2024 •

edited

Loading

raxhvl commented Nov 5, 2024 •

edited

Loading