meta(compatibility): handle EVM Semantic differences across chains #748

Open
@mds1

Component

Forge

Describe the feature you would like

As L2s grow in popularity, developers will run into issues because not all EVM-compatible networks have the same semantics as L1. For example, if you want to develop and test against a forked Arbitrum:

  • On Arbitrum, block.number returns the most recent L1 block number, but forge doesn't know this.
  • If I fork Arbitrum mainnet for my tests and rely on a contract with a lastBlockNumber storage variable, that variable will hold L1 block numbers.
  • If my contracts rely on the block delta between the current block and the stored lastBlockNumber, forge's block.number will return the provider's block number, which is about 6M, while the last L1 block is about 14M.
  • Therefore, by default my tests will fail due to underflow when computing the block delta, because currentBlock - lastBlock evaluates to 6M - 14M.
  • One other issue with Arbitrum is that it has different gas accounting than the L1 EVM implementation.
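
To make the underflow concrete, here is a minimal sketch of the failure mode. BlockTracker and its functions are hypothetical names invented for illustration, and the 14M and 6M values are just the rough figures quoted above. It assumes Solidity 0.8+ (checked arithmetic) and forge-std, and it uses vm.roll only to simulate the mismatch locally rather than forking Arbitrum:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

import "forge-std/Test.sol";

// Hypothetical contract, for illustration only: it records block.number
// and later computes the delta against the current block.number.
contract BlockTracker {
    uint256 public lastBlockNumber;

    function checkpoint() external {
        lastBlockNumber = block.number;
    }

    function blocksSinceCheckpoint() external view returns (uint256) {
        // Checked arithmetic reverts if block.number < lastBlockNumber.
        return block.number - lastBlockNumber;
    }
}

contract BlockDeltaTest is Test {
    BlockTracker tracker;

    function setUp() public {
        tracker = new BlockTracker();
    }

    function testDeltaUnderflows() public {
        // Simulate state written on Arbitrum, where block.number was an L1 number (~14M).
        vm.roll(14_000_000);
        tracker.checkpoint();

        // Simulate forge reporting the provider's block number (~6M) instead.
        vm.roll(6_000_000);

        // 6M - 14M underflows, so the call reverts with an arithmetic panic.
        vm.expectRevert(stdError.arithmeticError);
        tracker.blocksSinceCheckpoint();
    }
}
```

On a real fork the first vm.roll would not be needed, since lastBlockNumber would already hold an L1 block number read from forked storage.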

In forge, we can use the vm.roll cheatcode to simply move the block number forward or back as required and work around this. And maybe that's ok, but:

  1. It still feels a bit dirty, clunky, and potentially risky to have to hack around differing semantics in this way
  2. I can no longer easily run the same test suite on both Arbitrum and mainnet, because I need to execute vm.roll conditionally based on the network, and I don't think there's currently an easy way to do that
  3. Other networks might not have such a straightforward workaround
  4. There's still the gas accounting issue
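
For reference, the conditional vm.roll from point 2 might look roughly like the sketch below. It assumes the forked network's chain ID is visible to the test through block.chainid, which may not hold in every setup. ForkAwareTest, alignBlockNumber, and ASSUMED_L1_BLOCK are hypothetical names, and the L1 block number is a hard-coded placeholder rather than a value read from an L1 RPC. 42161 is the Arbitrum One chain ID.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.13;

import "forge-std/Test.sol";

// Hypothetical base contract: rolls block.number only when running on Arbitrum One.
abstract contract ForkAwareTest is Test {
    uint256 internal constant ARBITRUM_ONE_CHAIN_ID = 42161;

    // Placeholder: a real suite would have to supply the current L1 block
    // number itself, since forge does not provide it here.
    uint256 internal constant ASSUMED_L1_BLOCK = 14_000_000;

    function alignBlockNumber() internal {
        if (block.chainid == ARBITRUM_ONE_CHAIN_ID) {
            vm.roll(ASSUMED_L1_BLOCK);
        }
        // On L1 networks block.number is left untouched.
    }
}

contract MyForkTest is ForkAwareTest {
    function testBlockNumberIsAligned() public {
        alignBlockNumber();
        if (block.chainid == ARBITRUM_ONE_CHAIN_ID) {
            assertEq(block.number, ASSUMED_L1_BLOCK);
        }
    }
}
```

Even if this works, every test suite has to carry its own per-network branches and its own source for the L1 block number, which is the clunkiness described above, and it does nothing for gas accounting.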

So the scope of this issue is: how should forge handle this? Some possibilities:

  1. Don't handle it: if running tests and the chain ID doesn't correspond to L1 mainnet or testnet, show a warning as the very last thing before tests execute, such as "WARNING: This network may have different EVM semantics than mainnet, and therefore behavior of your contracts and tests may not represent what you'd see in production"
  2. Handle each network's differences: For example, update the VM logic to handle pulling timestamps from a different RPC than the fork RPC. Then, if forge detects that the RPC is an Arbitrum network, it can ensure there's also a mainnet RPC and use that to mirror production behavior.
  3. Push the behavior of 2 into some plugin layer as part of Plugin system #706

The three ideas above all have their pros and cons, and there may be other ideas.

Personally I lean away from option 1, and would prefer options 2 or 3. As a developer, those would give me more confidence that my system does behave as intended, whereas with option 1 it's much harder to get that confidence.

Another question is how many networks have differing semantics, and how many differences there are. If the only difference is Arbitrum's block.number, option 2 is much more feasible. If there are multiple networks with multiple other differences, it gets much more complex.
