Skip to content
This repository has been archived by the owner on Oct 19, 2024. It is now read-only.

refactor: typed solc ast #1567

Closed
wants to merge 23 commits into from
Closed

refactor: typed solc ast #1567

wants to merge 23 commits into from

Conversation

onbjerg
Copy link
Collaborator

@onbjerg onbjerg commented Aug 4, 2022

Motivation

Working with the Solc AST right now is not super ergonomic and it's making it hard to provide support for library coverage in Foundry, since for that feature we would need to walk out to each call expression node and parse the type name. It's doable currently, but a lot of the nice information in the AST is rolled into other, so it's not super easy or maintanable.

Solution

Replace the AST abstraction with more strict types. Each node gets its own type loosely based on https://github.com/OpenZeppelin/solidity-ast which claims to have a wide Solidity version support. I'll also add a Visitor trait with some walk_* helper functions for easier traversal.

I've opened up this PR because I am currently at an impasse: I've started implementing macros to generate some of the common fields in the different nodes, but we could also use struct composition instead.

The downside of struct composition is that you have to "reach" and I am having some struggle coming up with nice names. For example, the two most common fields are id and src... we could call these "meta", but then what about the expression meta-fields? If we called them "node", then it's a bit awkward since you'd probably have a lot of code doing things like node.node.id.

If we called these fields "meta", and the common expression fields "expression", then you'd instead have a lot of code doing expr.expression.

The downside of macros is that it might be slow to compile, too opaque or too inflexible.

Thoughts @mattsse @gakonst?

(Docs + doc comments and so on coming later, this is very much a WIP)

PR Checklist

  • Added Tests
  • Added Documentation
  • Updated the changelog

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot,
this also makes very easy to add new nodes that share a lot of things instead of writing serde(flatten) containers

and customizing them is still possible

@gakonst
Copy link
Owner

gakonst commented Aug 5, 2022

Yep concept ack

@sambacha
Copy link

sambacha commented Aug 6, 2022

It may be worthwhile to explore https://github.com/ConsenSys/solc-typed-ast to compare OZs implementation against

@onbjerg
Copy link
Collaborator Author

onbjerg commented Aug 6, 2022

I'm already comparing against a typed implementation as described in the PR

@sambacha
Copy link

sambacha commented Aug 6, 2022

I'm already comparing against a typed implementation as described in the PR

The repo i linked provides an ast with "xpath" support, its used in scribble to help provide instrumentation coverage, should have mentioned that too sorry.

@onbjerg
Copy link
Collaborator Author

onbjerg commented Aug 6, 2022

I think xpath support is out of scope for this PR since it is a bigger feature and the most immediate need here is supporting some coverage features

@onbjerg
Copy link
Collaborator Author

onbjerg commented Aug 6, 2022

Just added some test fixtures from Solmate, will try to slim it down lmfao

@onbjerg
Copy link
Collaborator Author

onbjerg commented Aug 8, 2022

@mattsse any preference on how we safeguard against breaking stuff in solc? rn if the AST won't deserialize everything breaks, but it is used by ethers-solc in at least one place, so either we find a way to replace that, or we try to deserialize to a smaller ast on failure?

@onbjerg
Copy link
Collaborator Author

onbjerg commented Aug 8, 2022

As a note here, I gave up on supporting the legacy AST layout. At first I thought it was just some properties that were lifted into a nested attributes object, but it turns out that all related nodes are also shifted into a children array. For comparison, the current layout of a binary expression in the AST is something like

{
  type: 'BinaryExpression',
  leftExpression: Expression,
  rightExpression: Expression,
}

where left and right expression in the legacy AST would be in children[0] and children[1]. It gets even more complicated with other AST nodes, and some properties have undergone refactors and renames beyond that as well. As far as I can tell, a lot of 0.4.x Solidity versions support the "modern"/current AST, so we should be good for somewhere from 0.4.x to now, which I think is OK?

@gakonst
Copy link
Owner

gakonst commented Aug 9, 2022

As far as I can tell, a lot of 0.4.x Solidity versions support the "modern"/current AST, so we should be good for somewhere from 0.4.x to now, which I think is OK?

Yeah I think for coverage-related stuff we can afford to say "this is for after solidity version XXX"

what's the next step after this PR? integrate in coverage?

@onbjerg
Copy link
Collaborator Author

onbjerg commented Aug 9, 2022

Yes, integrate into coverage to get library coverage

@mattsse
Copy link
Collaborator

mattsse commented Aug 10, 2022

@mattsse any preference on how we safeguard against breaking stuff in solc? rn if the AST won't deserialize everything breaks, but it is used by ethers-solc in at least one place, so either we find a way to replace that, or we try to deserialize to a smaller ast on failure?

perhaps upcoming generics will break some things here, how we handled semver incompatible CompilerInput is by cleaning it up depending on the version. This is not possible here.
I think we just have to maintain compatibility via custom deserde moving forward, unfortunately...

@gakonst
Copy link
Owner

gakonst commented Aug 22, 2022

bump @onbjerg

@gakonst
Copy link
Owner

gakonst commented Nov 9, 2022

@onbjerg @mattsse what should we do here? do we want this to get over the line and we are just not allocated on it (in which case I'll find someone else to cover) or do we not think this is going to be a material improvement? this also ended up going deep in the AST so maybe helpful too? https://github.com/0xKitsune/solstat

@mattsse
Copy link
Collaborator

mattsse commented Nov 9, 2022

Imo, it makes sense to have this, if someone is going to use this, they can improve this where necessary

@iFrostizz
Copy link
Contributor

The ast was very well done with the macros it makes totally sense. I have fixed some bugs because of needing it, I can add them to the branch

@gakonst
Copy link
Owner

gakonst commented Nov 9, 2022

@iFrostizz yes that would be great. keen to get this over the line, so if you want to submit a new branch (or something that forks off this PR) that works! thank you

@iFrostizz
Copy link
Contributor

@iFrostizz yes that would be great. keen to get this over the line, so if you want to submit a new branch (or something that forks off this PR) that works! thank you

I've opened the PR.
Haven't got any serde json serialization issues since, but there could be other occurrences eventually

@gakonst
Copy link
Owner

gakonst commented Nov 10, 2022

@mattsse @onbjerg let's merge Frostiz's PR on Bjerg's PR and get this in? does it get us tangible improvements on foundry that we should prioritize?

@mattsse
Copy link
Collaborator

mattsse commented Nov 10, 2022

sgtm! lets do that.

unsure about foundry impact, most likely coverage.

@gakonst
Copy link
Owner

gakonst commented Jan 3, 2023

superseded by #1943

@gakonst gakonst closed this Jan 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants