Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move away from INVALID and use custom error types instead. #9824

Closed
chriseth opened this issue Sep 16, 2020 · 25 comments
Closed

Move away from INVALID and use custom error types instead. #9824

chriseth opened this issue Sep 16, 2020 · 25 comments
Labels
language design :rage4: Any changes to the language, e.g. new features

Comments

@chriseth
Copy link
Contributor

Since checked arithmetic might be used for input validation, the need to move away from the invalid opcode (which consumes all gas) is stronger once we make checked arithmetic the default.

The middle ground between just using revert() for all internal errors and using the invalid opcode is introducing different error types (#7877) for revert() and internal errors.

The following situations currently or potentially in the future generate internal errors which we regard as "should never happen":

  • overflow/underflow
  • division / mod by zero
  • array out of bounds access
  • enum type conversion error
  • calling an uninitialized internal function pointer
  • assert with false condition

The following error situations are considered as "validating input" and thus might be OK to happen:

  • abi decoding error
  • calls returning "error"
  • require with false condition
  • calling a contract that contains no code
  • a non-payable function receiving ether

The bottom ones either use empty returndata or the signature of Error(string) if a reason was provided.

We could create individual errors for the "should never happen" errors (and also for the others), but I would guess that people will not ant to react differently to each of them anyway (user-defined errors are a different matter).

So maybe we should just create a single error type that is used in that case? Should it contain a message?

Suggestions for the name:

  • Panic()
  • Failure()
  • Fault()
  • Critical()
  • CriticalError()
@ekpyron
Copy link
Member

ekpyron commented Sep 16, 2020

I'm really not sure if that's a good idea :-) - but we could for each of them introduce an integer-encoded error code that explains what happens...
So for example if we consider all the arithmetic errors and arrays out of bounds accesses as Panic, we could have
Panic(uint256) and define some bit schema indicating hierarchical error types...
Like 0x01xxxxxxxx is arithmetic errors... 0x0100000001 is overflow on addition, 0x0100000002 is underflow on (signed) addition, 0x0100001001 is overflow on multiplication, etc....
and 0x02xxxxxxxx is access errors... 0x0200000001 is access past the end of an array, 0x0200000002 is too large array index, etc...

But yeah - while it would be a cheap and nice way to be both coarse and fine-grained in error reporting... maintaining a list of this that then will never break would probably drive us mad :-).

@chriseth
Copy link
Contributor Author

While this is a good idea, still the main question is if such a distinction is useful. What you want to know is in which context the error happened and not what kind of error happened. Note that since we have error forwarding, it can also have happened inside a called contract.

@chriseth
Copy link
Contributor Author

I think it is kind of set that we want to be able to distinguish between Panic and Error. What if we use Panic(uint) to allow for potential error codes, but always use 0 as "unknown panic" for now and potentially introduce such a hierarchy in the future?

@chriseth
Copy link
Contributor Author

The reason behind using Panic(uint256) with 0 for everything instead of Panic() for now is that people only need one catch statement for both cases - for the same reason we did not want to introduce different error kinds for ArithmeticError, FunctionDoesNotExistError and so on.

@cameel
Copy link
Member

cameel commented Sep 23, 2020

I think that it's dangerous to allow handling Panics without providing specific error codes. If relying on built-in safe math for validation becomes a pattern then people will want to be able to catch these errors in some situations. But without error codes it's not possible to do so without silencing failed assertions at the same time.

For that reason maybe it would make more sense to just reclassify overflow/underflow as something allowed to happen? They stand out to me as the only case where checks are tricky enough to make users not want to validate the input explicitly.

@chriseth
Copy link
Contributor Author

You can switch from checked to wrapping arithmetic using the unchecked keyword. Even if you catch an error, it still means that the call reverted, you cannot "un-revert" it. it might also be that the error originated in a call further down the line. Just because there was an arithmetic error does not mean it was the one arithmetic error you thought about.

@chriseth
Copy link
Contributor Author

Because of that, I think catching an error and especially reacting on it depending on the reason has a very limited use-case.

@cameel
Copy link
Member

cameel commented Sep 23, 2020

But if I switch to unchecked then I won't be able to catch safe math errors. I'm thinking about the opposite situation because I think these errors are the ones people are most likely to consider non-fatal.

The use case for catching a Panic at all is very limited but we're still allowing it. In my opinion, if you want to catch it, you're likely to care which one it is and silencing the others is probably an unwanted side-effect.

@chriseth
Copy link
Contributor Author

I still don't properly understand the use-case, but I do want fine-grained error codes, just as a second step to make quicker progress on this.

@cameel
Copy link
Member

cameel commented Sep 23, 2020

We could add Panic as the first step and just not provide any mechanism to handle it until we have error codes. That way it's still cheap (not using INVALID) but people are not encouraged to silence all panics indiscriminately at Solidity level. They can only do it outside of the transaction.

But I don't have a specific use case so it might well be that this is not relevant in practical use. It's just my general opinion that silencing stuff in a broad way is a bad practice and this approach might encourage it in some situations without giving the user an alternative. It's also that it's not clear to me how long the gap between the first and the second step will be and in the worst case we might be stuck with just the first one for quite some time.

@cameel cameel added feature language design :rage4: Any changes to the language, e.g. new features labels Sep 25, 2020
@banescusebi
Copy link

Got and email from Franziska Heintel about this topic. Replying here, as she requested.

So maybe we should just create a single error type that is used in that case? Should it contain a message?

Yes and yes. Awesome that this is being implemented.

I think it is kind of set that we want to be able to distinguish between Panic and Error. What if we use Panic(uint) to allow for potential error codes, but always use 0 as "unknown panic" for now and potentially introduce such a hierarchy in the future?

This sounds good to me. As someone else mentioned on the email thread: it would be more convenient to also have a string description of what the error code means. Like you said 0 means "unknown panic". However, I'm assuming these codes will be clearly documented on https://solidity.readthedocs.io/ so if one doesn't know the code, they can quickly look it up.

@chriseth
Copy link
Contributor Author

chriseth commented Oct 8, 2020

The following situations currently create an invalid opcode or a Panic in the future:

  • failing assert()
  • arithmetic overflow in addition, multiplication, exponentation, negation, ...
  • arithmetic underflow in addition, multiplication, exponentation, negation, ...
  • division / modulo by zero
  • array too large (either storage or memory)
  • memory allocation error (too much memory required for something)
  • empty array pop
  • array out of bounds access
  • enum conversion error
  • calling invalid internal function

Note that the old code-generator still contains code that uses "invalid opcode".

Suggestions for how to assign error codes:

  • 0x00: generic / unspecified error. Used by assert().
  • 0x11: arithmetic underflow or overflow
  • 0x12: division or modulo by zero
  • 0x21: enum conversion error
  • 0x31: empty array pop
  • 0x32: array out of bounds access
  • 0x41: resource error (too large allocation or too large array)
  • 0x51: calling invalid internal function

@ekpyron
Copy link
Member

ekpyron commented Oct 8, 2020

I wonder whether we should be more generous and at least use 16 bits for it and reserve 256 slots in each category or even more... it's not like we don't have enough space in the uint256. Just, s.t. if we ever need a larger category, it's can still be the same part that designates the category.

@chriseth
Copy link
Contributor Author

chriseth commented Oct 8, 2020

My motivation for staying within one byte is to save deployment gas costs.

@MrChico
Copy link
Member

MrChico commented Oct 13, 2020

I would also be good to be able to identify abi decoding errors. Especially using abiencoder v2, there is usually quite a few paths that end at the decoding stage already if one doesn't specialize the calldata to a specific function signature.

@chriseth
Copy link
Contributor Author

@MrChico abi decoding errors will either return no data or data with a signature of Error(string). The codes we are currently talking about will all use Panic(uint256). We can of course talk about making abi decoding errors more specific, but that would be a different discussion.

@wuestholz
Copy link
Contributor

@chriseth The proposal looks good to me. The only change I would consider is not using 0x00 for assert violations, but specifying a designated (non-generic) error code for such cases (e.g., 0x61). This should ensure that program analysis tools that only care about assertions can reliably detect them even if for some reason future Solidity versions emit 0x00 for other generic/unspecified errors.

@MicahZoltu
Copy link
Contributor

My motivation for staying within one byte is to save deployment gas costs.

In all of my time as a dapp developer I have never seen anyone care about contract deployment gas costs. Runtime gas costs are all that anyone I have talked to cares about. For deployment, either you are deploying a contract once (ever) or you are deploying multiple copies of a contract that has a very high per-contract value (where gas costs are negligible compared to what you are doing) or you are using a proxy.

I have heard this argument of "to reduce deployment gas costs" come up multiple times from the Solidity dev team, and I would like to better understand where it comes from? Who are these people who care about deployment gas costs? What situation are they in where that is a thing that matters?

@chriseth
Copy link
Contributor Author

@wuestholz this is a very good suggestion! It allows analysis tools to distinguish between explicit assert(x) statements and "compiler-internal" assertions.

@MicahZoltu people are not worried about deployment gas costs, but they are worried about staying within the code size limit, so i think it is still reasonable to only use a single byte for the error code.

@wuestholz
Copy link
Contributor

@chriseth Great! Yeah, this would make things much easier for bytecode-only analysis tools. Thanks!

@chriseth
Copy link
Contributor Author

For now, I allocated 0x01 for assert(false) since it is still a "generic" error because you do not specify the reason.

@axic
Copy link
Member

axic commented Oct 28, 2020

We briefly discussed today whether
a) we should introduce a panic(code) helper for inline assembly which does the encoding,
b) and whether we would encourage users to use it instead of invalid().

The answer to b) was yes. The answer to a) is that some prefer to do this via #9282 and also considering the "standard library".

Please note one can manually encode the panics in inline assembly:

// Assuming `code` is <=255
function panic(code) {
  // This is using the scratch space, but could use anything given we revert.
  mstore(0, shl(0x4e487b71, 248))
  mstore(32, 0)
  mstore8(36, code)
  revert(0, 36)
}

Also see a similar discussion about assert/require in #7317.

@chriseth
Copy link
Contributor Author

Note that "could use anything given we revert" is not true anymore if we want to properly support the stack limit evader.

@axic
Copy link
Member

axic commented Oct 28, 2020

Note that "could use anything given we revert" is not true anymore if we want to properly support the stack limit evader.

Well, as long as the memoryguard is used properly (such as in a custom allocator), then it is not a problem. Isn't it?

@chriseth
Copy link
Contributor Author

chriseth commented Nov 2, 2020

Implemented in #10013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language design :rage4: Any changes to the language, e.g. new features
Projects
None yet
Development

No branches or pull requests

8 participants