Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

[RFC] srml-contracts: Remove ext_return #3038

Closed
jimpo opened this issue Jul 5, 2019 · 7 comments
Closed

[RFC] srml-contracts: Remove ext_return #3038

jimpo opened this issue Jul 5, 2019 · 7 comments
Labels
I7-refactor Code needs refactoring.

Comments

@jimpo
Copy link
Contributor

jimpo commented Jul 5, 2019

Right now there's an issue with gas metering and ext_return. In the following block of code:

block
get_local $x
call 4
i32.add
end

Currently, we will charge gas at the top of the block for all three instructions in the block. However, call 4 might invoke a function which [invokes a function which]* calls ext_return. In this case the user is charged for the i32.add instruction which was never executed.

This is because ext_return is special in how it affects control flow. Handling it properly could lead to a lot more gas metering calls being injected after any control block or call to a function that might invoke ext_return.

So... I think we should discuss removing ext_return entirely. The main point of ext_return seems to be returning the result of the contract's execution to the caller. Instead, the call function on a contract could return an integer status code (0 = success, 1 = some time of error, 2 = another type of error, etc), and whatever output data remains in the scratch buffer when the function returns is its output data. This means that contracts would have to explicitly clear the scratch buffer before returning if they do not want to return anything, as there may be garbage left in it. If the function traps, the final scratch buffer state is ignored.

This makes sense to me because it maps pretty easily to a Rust function that returns Result<Vec<u8>, ContractError>, where the status code represents the variant type of the Err(ContractError(..)) or 0 for Ok(..). Then it is the job of ink! to properly handle the final state of the scratch buffer.

In short, removing ext_return simplifies the contract runtime by pushing some complexity into the contract DSL, ink!

I'd be interested to any thoughts. @Robbepop @pepyakin

This is related to #2852.

@pepyakin
Copy link
Contributor

pepyakin commented Jul 5, 2019

A couple of thoughts:

  1. As we noticed, this is not only property of ext_return but also of the wasm instruction unreachable and traps in general (e.g. OOB access). However, if we were to fix that we would have to basically create a new gas metering block after each instruction that can trap which is not feasible.
  2. Another reason for this problem to happen is that we refund the gas. If we weren't returning the gas after the trap we wouldn't have this problem. While it might seem that it is OK to burn the gas to incentivise non-trapping execution, it seems useful to refund gas in some cases when the trap is deliberate (reasons similar to seal: dedicated function for deliberate revert #2852). There is also incentive mismatch that the people who pay for the execution of the code (i.e. transaction) are most often different from the people who write the code. And even if they are the same, in case of callbacks used in contracts the gap increases.
  3. This will allow us to get rid of the concept of a "special trap", i.e. a special signal that trap that came from sandbox execution should not be handled as a fatal error but as some special condition.
  4. While writing contracts for srml-contracts right now is feasible only with Ink! I'd like to invalidate your argument about the result type. I think we should not over optimize the low-level platform for some idiomatic pattern in one of the languages, especially considering that the support is not required.

@jimpo
Copy link
Contributor Author

jimpo commented Jul 5, 2019

Good points. I agree that it is not worth guaranteeing that gas won't be overcharged in case of a trap, since it would just be impractical. But I do definitely think it is a good property to guarantee that gas won't be overcharged if the contract call exits in an expected way, which we cannot guarantee right now due to this problem.

This is also why I think that expected errors should be indicated by a clean return of a non-zero status code (like in C or something) vs being signaled by a trap. This can still result in a reversion of any state changes, but one could return an error message for diagnostics (#2082) or indicating a precondition failure vs some unexpected exception/panic.

On that note, I don't think it would be a crazy idea to consume all remaining gas on a trap, provided there's a way to revert the transaction without trapping, ie. by returning a non-zero exit status. This would match the behavior of the EVM (see rationale for EIP 140).

With regards to point 4., while it does map to Rust's error handling, I think integer exit code + data buffer is compatible with lots of different programming language error handling patterns. Certainly C, exceptions, etc.

@pepyakin
Copy link
Contributor

pepyakin commented Jul 5, 2019

This is also why I think that expected errors should be indicated by a clean return of a non-zero status code (like in C or something) vs being signaled by a trap.

Indicating errors this way adds additional overhead (in terms of code size and consumed gas) though and requires special cooperation from the language/framework side. I.e you will have to express to return error this way, whereas traps allow implementation of both approaches.

There is also a downside that error handling code tends to be least tested so I think from the security perspective it might be a better idea to try to minimize the amount of code on such path.

With regards to point 4., while it does map to Rust's error handling, I think integer exit code + data buffer is compatible with lots of different programming language error handling patterns. Certainly C, exceptions, etc.

Yes, sure it is compatible. My point was just let's not focus on Rust or any particular language. As with my previous point, trapping allows both immediate termination and implementation that looks at the return code thus it is more universal which is a desirable property for a low-level platform.

On that note, I don't think it would be a crazy idea to consume all remaining gas on a trap, provided there's a way to revert the transaction without trapping, ie. by returning a non-zero exit status. This would match the behavior of the EVM (see rationale for EIP 140).

I guess rationale for EIP140 was to actually allow non-consuming behavior whereas our situation is that we always used non-consuming and so doesn't apply to us, right?

@jimpo
Copy link
Contributor Author

jimpo commented Jul 8, 2019

Indicating errors this way adds additional overhead (in terms of code size and consumed gas) though and requires special cooperation from the language/framework side. I.e you will have to express to return error this way, whereas traps allow implementation of both approaches.

So do you think it is OK to overcharge gas in the case of an "expected" error? Like a precondition failure? Or as a more concrete example, some DEX contract where someone tries to fill an already-filled order due to a race? I think it would be best to guarantee that the amount of gas consumed here is correct. And if the error handing is implemented with some sort of special trap, then we need to handle diverging calls anyway.

There is also a downside that error handling code tends to be least tested so I think from the security perspective it might be a better idea to try to minimize the amount of code on such path.

I don't follow this argument. Like error handling code in a contract or error handling code in the contract language compiler? If it's in the contract itself, then the amount of Wasm generated doesn't seem like it should add additional surface area -- it's just a bug in the contract.

I guess rationale for EIP140 was to actually allow non-consuming behavior whereas our situation is that we always used non-consuming and so doesn't apply to us, right?

Yeah, I just meant that as a reference that the EVM consumes all gas by default.

@Robbepop
Copy link
Contributor

Robbepop commented Jul 8, 2019

I kind of like the idea stated here and was also thinking about ways for ink! to work more in that direction.
Also as @jimpo I think this would also work very well with other languages apart from Rust where we could leverage the Result type in ink!.

However, I do see some complications with losing semantics upon using the scratch buffer for that since an analyzer or optimizer would probably have a harder time to reason about the code using it for returning data instead of the explicit ext_return statement.

Still I think that returning error codes is also suitable for our purposes in the way of differentiating between different error types. For example in ink! we would want to build in contracts (requires, assume, etc.) also called pre-conditions as in Solidity. With the model of returning certain error codes we could very easily manifest the difference between those "errors" from invalid user input and logical errors from an invalidly written smart contract.

This not only goes with the spirit of Rust but many other languages, also such as C.

TL;DR
I am a big fan of this and I think we should really seriously think about changing the current error and trapping semantics in this direction and this regard. For ink! I see absolutely no implementation problems.


Also concerning the gas metering problems I think this could be solved when we use Lightbeam IR instead of Wasm directly since Lightbeam IR is organized a bit differently and for example wouldn't even allow for a call to ext_return that is not at the end of a block.

@jimpo
Copy link
Contributor Author

jimpo commented Jul 31, 2019

@pepyakin and I discussed the following more concrete proposal offline:

The function signature of call and deploy in the Wasm contract changes from [] -> [] to [] -> [i32]. The return value is the callee status code. The least-significant 8 bits of the callee status code are called the caller status code and returned to the caller. If the caller status code (the bottom 8 bits) is 0, then the call succeeds and the changes are accepted. If the caller status code is >0, then the call failed and the changes are reverted. The functions ext_call and ext_instantiate already return a u32, which is currently a boolean with 0 indicating success and 1 indicating failure, so it is compatible with this proposal.

Any data remaining in the scratch buffer when call or deploy exits is called the return data. After ext_call returns the scratch buffer is set to the return data from the callee. After ext_instantiate the caller will somehow get access to the return data subject to a decision in #2830. Note that this means the contract needs to explicitly clear the scratch buffer before exiting if it does not want to return data. Gas is charged for return data similar to ext_return now, and if there is insufficient gas remaining, then the call fails with a regular OutOfGas trap.

If the call traps for any reason (out of gas or otherwise), then the caller status code is set to 0x0100 (the smallest u32 greater than 8 bits), changes are reverted, and the return data is empty.

The most-significant 24 bits of the callee status code are reserved for interpretation by the contracts runtime module and are masked off in order to obtain the caller status code.

For example, one of these reserved bits might be a self-destruct flag, as a solution to #3254. The idea is that a contract that wants to self-destruct can transfer its remaining balance where ever it wants with ext_call, check that the transfer succeeded, then exit setting the self-destruct bit on the callee status code. The contracts runtime then marks this contract as to-be-destroyed as a deferred action at the end of the extrinsic processing.


Regarding deployment, this is not incompatible with ext_return, and so we can implement this and deprecate ext_return then remove it completely once ink! is updated. The only breaking change is that the scratch buffer needs to be cleared before call or deploy exits regularly.

This proposal also requires a new call to write to the scratch buffer. We can rename ext_scratch_copy to ext_scratch_read and create a new imported function ext_scratch_write(src_ptr: u32, len: u32) overwrites the entire scratch buffer with the contents of memory starting at src_ptr.

@athei
Copy link
Member

athei commented Jul 23, 2020

We went in the opposite direction: We doubled down on ext_return. Contracts no longer have a return value but instead need to pass the return_flags (how he status code is called now) to ext_return.

Rationale for that is that is is difficult for contracts to bubble up status codes because of wasms structured blocks. This also saves the gas necessary to execute all the end instructions necessary for that.

@athei athei closed this as completed Jul 23, 2020
@athei athei moved this to Done in Smart Contracts Jul 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
I7-refactor Code needs refactoring.
Projects
Status: No status
Development

No branches or pull requests

4 participants