Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add eth_batchCall #312

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from
Draft

Add eth_batchCall #312

wants to merge 9 commits into from

Conversation

s1na
Copy link
Contributor

@s1na s1na commented Sep 26, 2022

Summary

This PR adds a new method to the eth namespace which performs a sequence of call messages, each building on the previous call's state. It will use a given block tag as base state similar to eth_call. However this state can be overridden by providing a map of addresses to accounts.

Motivation

#267 already describes the motivation a bit. Further, this is a demanded feature in geth (see ethereum/go-ethereum#24089).

Rationale

  1. This method takes in an args object as opposed to a list of params which is the convention. I did this to allow more forwards-compatibility. E.g. the debug_traceCall method in geth now supports state and block overrides which couldn't be added to eth_call as they'd be a breaking change (assuming there would be consensus to add these fields). See next 2 points for possible future additions.
  2. Allowing state to be based on (block, tx_idx): I didn't add this for simplicity. IMO it's not strictly necessary as users can specify the block and pass in the txes in that block as is as part of the batch until their target tx_idx to get the same effect.
  3. Allowing state overrides for every call: Similarly this would be a convenience feature IMO. As state override for a call can be simulated through a call message.

Implementation

ethereum/go-ethereum#25743

Notes

The storage schema doesn't validate the fact that keys can be only 256bit since JSON supports only string keys for objects.

Todos

  • block overrides
  • tests

Alternative to #272

Copy link
Contributor

@MicahZoltu MicahZoltu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My implementation of this (slightly different, but not totally) from 1.5 years ago:
https://github.com/Zoltu/nethermind-plugin-multicall/blob/main/source/MulticallModule.cs#L29

Comment on lines +11 to +13
stateOverrides:
title: State overrides
$ref: '#/components/schemas/StateOverrides'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels out of place here. What is the reasoning for having this as a parameter? It also greatly complicates implementation in some clients I believe compared to not including it so I would like to see a compelling reason for its inclusion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a greatly used feature of eth_call. I agree it is less necessary in multicall since you can build up the state in previous calls. However not all state mutations are possible (e.g. simply replacing a contract code).

I agree this is extra work for client devs. However I would like to hear from them if this is indeed a complication. It seems to me all have a way to build a temporary state for serving eth_call. This goes a step further to modify parts of that state.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the state override feature is probably useful enough to warrant the complexity. I can imagine a security researcher breaking a problem down into two parts. If this value ever can become X, there is a vulnerability to report. They could research that aspect first (as it might be simpler) then research "can this value ever become X".

Could also fake signers of a multi-sig that use store approvals (as many do).

src/schemas/execute.yaml Outdated Show resolved Hide resolved
src/schemas/execute.yaml Outdated Show resolved Hide resolved
Comment on lines +119 to +121
error:
title: Execution error
type: string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does string here mean? I believe revert reason is just a byte array. While Solidity has some helpers like require(..., message) that make it easy to encode a string in those bytes, there are some contracts out there that return error codes as numbers, and it could even contain data arrays.

I think this should be bytes, and it is left up to the recipient to decode that as appropriate for the contract in question.

Copy link
Contributor Author

@s1na s1na Oct 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify this is not directly evm revert error. It encompasses other errors such as OOG. If call reverts the error will be "execution reverted" and the return data field will contain the bytes that you mentioned.

Happy to change the description of the field if it is confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to indicate that in the case of an EVM revert, return will contain the revert return bytes.

Is there a way in this schema to express that error is optional (may not be defined in the resulting object)?

src/eth/execute.yaml Outdated Show resolved Hide resolved
Comment on lines 83 to 104
properties:
number:
title: Number
$ref: '#/components/schemas/uint256'
difficulty:
title: Difficulty
$ref: '#/components/schemas/uint256'
time:
title: Time
$ref: '#/components/schemas/uint256'
gasLimit:
title: Gas limit
$ref: '#/components/schemas/uint64'
coinbase:
title: Coinbase
$ref: '#/components/schemas/address'
random:
title: Random
$ref: '#/components/schemas/hash32'
baseFee:
title: Base fee
$ref: '#/components/schemas/uint256'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a mechanism for indicating optionality? I feel like all of these should be optional and have reasonable defaults provided by the execution client.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also feel like these should be constrained to values that otherwise pass validation so clients don't have to write code that skips over any more validation than necessary (skipping block signature for example is necessary). For example, block number must be monotonically increasing, time must be strictly greater than previous block, gasLimit cannot change more than 1/1024 from previous block, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a mechanism for indicating optionality? I feel like all of these should be optional and have reasonable defaults provided by the execution client.

The default value will be taken from the block the calls will be executed on top of.

I also feel like these should be constrained to values that otherwise pass validation so clients don't have to...

Is this still a concern now that there's one set of block overrides for the whole batch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be clearly indicated in this specification, perhaps in a description field? Some of them cannot be defaulted to the parent block (like number and timestamp), so I think we should have a description for each indicating where it gets its default from. Also, I would really like to see it indicated that these are optional somehow, ideally in the schema itself, but alternatively in a description field or comment of some kind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still a concern now that there's one set of block overrides for the whole batch?

I think these are still relevant. If you are building off of block n, then you cannot have a number that is less than or equal to parent.number. Similarly, you cannot have a timestamp that is less than or equal to the parent.timestamp. Similar restriction on gas limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: On Gas limit there may be value in expressly allowing changes that are against consensus rules so that people can do series of calls that are more than the current gas limit.

src/schemas/execute.yaml Outdated Show resolved Hide resolved
Comment on lines +110 to +121
CallResult:
title: Result of a call
type: object
required:
- return
properties:
return:
title: Return data
$ref: '#/components/schemas/bytes'
error:
title: Execution error
type: string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result for each transaction should include logs and gas spent as well. I suspect that all clients are generating this information anyway, and throwing it away is a waste when we could just return it all. In many cases, this will also make it so the caller doesn't need to do multiple round trips (e.g., once for gas estimate and again for speculative execution and again to read a deterministic event like contract creation details).

If the call is a contract creation (to: null), it should also return the address of the created contract for the same reason.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

include logs and gas spent as well

@s1na mentioned an argument against including gas spent in the return interface (here). The intention I mentioned was to use the gas spent as is for gas limits, which I agree would fail when tx. However, including gas spent in the response could still be useful if utilized with care (passing a factor of more gas for e.g. to account for 63/64 gas passed to child msg calls). And in the documentation, a note can be mentioned that gas spent is not accurate to be used as gas limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think it is worth trying to optimize the edge case where someone does if (GAS) ... because that is very rare and when it is used, the dapp almost always explicitly sets the GAS required anyway so the returned gas used won't matter.

I think the value add of returning gas used far outweighs the downsides of it being wrong in a handful of esoteric scenarios. I do recommend that we standardize on the amount of gas provided to each call in eth_multicall though. Perhaps block.gas_limit - gas_used_in_previous_transactions_in_multicall? Essentially, assume that the whole set of transactions is intended to fit inside a block (this also provides a sanity check on total gas used) and they "burn through" the block's gas as they are executed.

This would prevent people from using this to do mega gas operations, but I think we can address that by simply having eth_multicall accept a gas_limit parameter for the block and each transaction so this isn't a hard limiter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think gasUsed should not be used as an estimate. But I can sympathise with adding this info to the call. It comes at no overhead. Right now all this can be fetched in one call through geth's tracing API (via the callTracer), but I agree it should be part of eth_. I'm happy to add these.

Re gas_limit there is a need for a block gas limit set in the client itself. Because providers like Infura will want to cap the runtime of API methods. Geth has both a gas limit and a deadline in seconds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting an upper bound on the gas able to be used by an eth_call or eth_multicall feels like it is out of scope of this schema. On my local node, I should be able to do calls that use trillions of gas, and the fact that a centralized service (which we should not be catering our designs toward) needs rate limiting should not negatively impact a self-hosted operator from doing whatever they want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: what I meant was there is a CLI flag anyway that will set a limit for all eth_calls and eth_multicalls. On your local machine you're free to set it to trillions. Question is whether a per-invocation limit is also useful.

This would prevent people from using this to do mega gas operations,

Adding a gasLimit parameter won't solve this for public facing nodes. They will simply set a high gas limit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: what I meant was there is a CLI flag anyway that will set a limit for all eth_calls and eth_multicalls. On your local machine you're free to set it to trillions. Question is whether a per-invocation limit is also useful.

Such a setting feels like it is out of scope of this standard? I might be able to be convinced that it is relevant, but rate limiting features and DoS protection generally aren't standardized.

What we need to standardize is "how much gas is given when the parameter is missing" and picking something "reasonable" (e.g., not 2^256, and not 30,000) as a default feels like the right way to go to me. If a particular client has limits that are lower than that (such as from configuration) then they should return an appropriate rate limiting error and require the user to explicitly set the gas limits in their requests I think if the user exceeds those limits.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the if (GAS) case is not really worth considering but the 63/64 and the refunds are.

However, there's really two uses for gas that should be handled:
1.) what you need to add to transaction for it to succeed (eth_estimateGas/eth_batchEstimateGas)
2.) What you actually pay for, some way to determine what my EOA is actually going to be charged for the transaction. It feels most natural for that to occur in the eth_batchCall . If there is a back-run, it is very important to the sender how much gas is going to actually be charged as mev profit can often be extremely small and the difference between the actual gas used and the gas estimation is the difference between profit and loss.

@s1na
Copy link
Contributor Author

s1na commented Oct 27, 2022

Sorry for the letting the ball drop on this for a while. I'm gonna begin to address the reviews. To start:

  • I renamed the method to eth_multicall
  • Changed it so you provide one set of block overrides, instead of per call

Will address the rest of reviews on their respective comments.

title: Gas limit
$ref: '#/components/schemas/uint64'
coinbase:
title: Coinbase
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
title: Coinbase
title: FeeRecipient

I believe fee recipient is the nomenclature we are using now (as of The Merge) for the person who received the transaction fees in the block. This is separate from proposer, which I don't think the execute layer has access to.

@MicahZoltu
Copy link
Contributor

I think there would be value in making it so eth_multicall can provide multiple "blocks" of transactions that each have their own set of overrides. This would allow someone to have a multicall that sets some state, then waits some amount of time, then executes something which is incredibly valuable for testing things like timelocks and whatnot.

My initial thinking is to have something like:

{
	parent_block,
	blocks: [
		{ block_overrides, transactions },
		{ block_overrides, transactions },
		...
	]
}

Everything else would stay the same, and the first block in the array would be built on top of the looked up parent_block, then the second block in the array would be built on top of the first block, etc. Each block could then have its gas limit constrained by normal validation rules but this would allow unlimited total gas spent across all execution and allow testing things that are a function of time.

@s1na
Copy link
Contributor Author

s1na commented Oct 28, 2022

Also it's worth discussing a deprecation policy for API methods. Multicall is more general than call and if adopted will fully supersede eth_call.

@s1na
Copy link
Contributor Author

s1na commented Oct 28, 2022

I think multiblock/chain execution is a useful feature. However after reading comments such as ethereum/go-ethereum#25743 (comment) and thinking more about this, I've made up my mind and would like to champion only the single-block-multi-call variant at this time.

How I envision the multiBlockCall method right now is taking a list of something akin to a full block object (similar to how txes are passed to eth_call). That is a detriment in UX for users who want to do a simple multicall IMO.

@MicahZoltu
Copy link
Contributor

MicahZoltu commented Oct 29, 2022

I believe the referenced comment by @holiman is indicating that having multiple transactions with different block parameters could have sharp edges. A multi-block multicall however wouldn't have such sharp edges.

That is a detriment in UX for users who want to do a simple multicall IMO.

Wouldn't the difference between single block and multiblock (for a multicall that only actually spans one block) just be a couple extra sets of braces and a property name more or less? If we use a tuple you don't even need the property name (though I don't recommend that).

{ block, stateOverrides, blockOverrides, calls: [transaction1, transaction2, ..., transactionN }

vs

{ block, blocks: [ { stateOverrides, blockOverrides, calls: [transaction1, transaction2, ..., transactionN } ] }

It would be unfortunate to need a separate JSON-RPC method for multiblock call when the difference between them is so minimal, and the work required to implement them both is likely very similar, especially since single block multicall is purely a subset of multi block multicall.

@s1na
Copy link
Contributor Author

s1na commented Nov 3, 2022

A multi-block multicall however wouldn't have such sharp edges.

I'm pretty sure his later comment applies to multi-block. There is a difference between simply executing multiple txes and executing txes from different blocks (doesn't matter if they are ordered blocks). Namely, it is the coinbase reward at the moment. Soon it will also include the withdrawal logic. Also once you have multiple blocks you need to get BLOCKHASH opcode right.

just be a couple extra sets of braces and a property name more or less?

No because once you have multiple blocks the override fields become mandatory. Unless you special-case it for one block, but that's ugly IMO.

@MicahZoltu
Copy link
Contributor

No because once you have multiple blocks the override fields become mandatory. Unless you special-case it for one block, but that's ugly IMO.

Why would you require block overrides for multiple blocks but not for single block? It seems to me that you either need to require overrides in both cases, or they are optional in both cases. Can you elaborate on why you think these situations are different?

I'm pretty sure his later comment applies to multi-block. There is a difference between simply executing multiple txes and executing txes from different blocks (doesn't matter if they are ordered blocks). Namely, it is the coinbase reward at the moment. Soon it will also include the withdrawal logic. Also once you have multiple blocks you need to get BLOCKHASH opcode right.

It is certainly possible that there are some implementation detail in some client that make building multiple blocks in a row outside of current chain particularly difficult, but that feels unlikely to me (would be great to get a concrete answer from client devs on this). In Nethermind at least, I believe it would not be notably harder to build a two-block chain in memory vs a one-block chain in memory.

@sambacha
Copy link

sambacha commented Nov 3, 2022

Can this be named something other than multicall as that is confusing , maybe eth_macroCall ?

@s1na
Copy link
Contributor Author

s1na commented Nov 4, 2022

@MicahZoltu are you willing to champion the multi-block variant as a separate proposal? we can raise them both during ACD as alternatives. I'd be happy if either of them gets adopted and can take care of the implementation in Geth.

I'm all for the extra flexibility of the multi-block variant, but it seems to me that the use-case is more niche and I don't want the extra complexity to delay adoption.

Why would you require block overrides for multiple blocks but not for single block? It seems to me that you either need to require overrides in both cases, or they are optional in both cases. Can you elaborate on why you think these situations are different?

For a single block you can simply take the parameters of block (which we take the state from) as default. Once you have multiple blocks either we can use the same default for all blocks (possibly incrementing NUMBER) or make the overrides mandatory.

Can this be named something other than multicall as that is confusing , maybe eth_macroCall ?

@sambacha I would go with eth_swissArmyCall for the multi-block variant hehe.

@MicahZoltu
Copy link
Contributor

Can this be named something other than multicall as that is confusing , maybe eth_macroCall ?

What do you find confusing about multicall?

@MicahZoltu
Copy link
Contributor

I'm all for the extra flexibility of the multi-block variant, but it seems to me that the use-case is more niche and I don't want the extra complexity to delay adoption.

Given how difficult it is to get new JSON-RPC methods standardized, I think that we should take our time and spec this out in a way that we think is best long term for the ecosystem, rather than trying to do the thing that requires the least amount of work. I say this as someone who is currently working on a project that has a hard dependency on the single-block version of this too, and I don't benefit directly at all from the multi-block version!

For a single block you can simply take the parameters of block (which we take the state from) as default. Once you have multiple blocks either we can use the same default for all blocks (possibly incrementing NUMBER) or make the overrides mandatory.

I agree with this, but it doesn't seem to answer my question of why multi-block requires overrides? Am I misunderstanding something, because it sounds like you are saying that multiblock does not require overrides?

Note: Each successive block should increment the time by 12 seconds and the block number by 1 by default. This includes the first block built on the real chain, and the subsequent blocks. They should keep the gas limit unchanged from the parent, and probably just include a a deterministic "random" number for randao (which IIRC is now in the block header) like hash the parent's randao or something. This should be written into the standard (regardless of whether we do multi-block or single-block).

are you willing to champion the multi-block variant as a separate proposal? we can raise them both during ACD as alternatives. I'd be happy if either of them gets adopted and can take care of the implementation in Geth.

I would much rather go to ACD with a single proposal than two competing proposals. We have seen in the past how two competing proposals both being championed can indefinitely delay inclusion (e.g., BLS), so generally speaking it is best to get agreement among champions first.

That being said, perhaps we could go to ACD with the multiblock proposal but mention that if any of the clients believe multiblock will be particularly difficult to implement then that is something that can be pulled out of the proposal? As I mentioned above, my suspicion is that they are nearly equally difficult to implement but if one or more client teams indicate that their architecture makes multiblock hard that is the most likely thing to convince me to back off.

@ofarukcaki
Copy link

I want to test this extension on my geth but, how do I call the JSON-RPC I couldn't figure. Can you give an example please?

@epheph
Copy link

epheph commented Jan 25, 2023

I would like to see a standardized batch call method implemented into geth upstream and of the available proposals, this one seems like the best and even has an implementation with an open PR (ethereum/go-ethereum#25743) that is only blocked by the spec agreement.

There have also been attempts to put eth_callBundle (Flashbots-specific non-standard API I helped create) but i would like to see something more flexible and useful across multiple blocks with features for security researchers and end users (wallets) as well. If eth_batchCall (or similar) is upstreamed into geth, i think Flashbots is likely to switch to it and recommend its use in their libraries and documentation.

I woud like to see this API used to prevent Ethereum users from being scammed by tokens that only allow purchase (and revert on sell). An API like this could verify the ability to re-sell tokens and automatically protect users.

@s1na i see there was some correspondence here but that after a few comments, momentum was lost, are you open to continuing this discussion and helping get this across the line? I would be happy to jump on a call to hash things out or add more resources as necessary, i think this is important work and not too far from being completed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants