Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade the downcasting of RuntimeError in Wasmer #2316

Closed
MaksymZavershynskyi opened this issue Mar 25, 2020 · 8 comments
Closed

Upgrade the downcasting of RuntimeError in Wasmer #2316

MaksymZavershynskyi opened this issue Mar 25, 2020 · 8 comments
Assignees
Labels
A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) P-high Priority: High

Comments

@MaksymZavershynskyi
Copy link
Contributor

Currently we are printing an error when we cannot downcast RuntimeError into the error type that we return through the host functions: https://github.com/nearprotocol/nearcore/blob/a08944db6672185618e2454fce5e7be0e327b47a/runtime/near-vm-runner/src/errors.rs#L78

The printed error also says that the output is non-deterministic.

As of 0.14.0 Wasmer returns ExceptionCode there that we can try to downcast to. We should switch to this downcasting and also do not print that the output is non-deterministic anymore.

@MaksymZavershynskyi MaksymZavershynskyi added A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) P-high Priority: High labels Mar 25, 2020
@lexfrl
Copy link
Contributor

lexfrl commented Mar 27, 2020

If we put Runtime errors into 2 categories:

  1. Deterministic: the errors which are expected errors and we consider that the Node is healthy.
  2. Non-deterministic: such as a node failure.

This is where we make a decision on the category:
https://github.com/nearprotocol/nearcore/blob/44bf10d5f618b50195e5ad122c0247aa30dee909/runtime/runtime/src/actions.rs#L190-L206

https://github.com/nearprotocol/nearcore/blob/a08944db6672185618e2454fce5e7be0e327b47a/runtime/near-vm-runner/src/errors.rs#L59-L90

Deterministic errors:

  • VMError::FunctionCallError (all of them)
  • VMError::InconsistentStateError

Non-deterministic:

  • VMLogicError::ExternalError

All deterministic errors are considered as a valid FunctionCall outputs.

Is everything correct so far?

@lexfrl lexfrl changed the title Upgrade the downcasting of RuntimeError in Wasmer fix: Upgrade Wasmer to 0.16.2, update Wasmer error handling Mar 30, 2020
@lexfrl lexfrl changed the title fix: Upgrade Wasmer to 0.16.2, update Wasmer error handling Upgrade the downcasting of RuntimeError in Wasmer Mar 30, 2020
@MaksymZavershynskyi
Copy link
Contributor Author

FYI @olonho We have a lot of complexity around error handling in Wasm that you should be aware of.

@MaksymZavershynskyi
Copy link
Contributor Author

Added more context here: wasmerio/wasmer#1338.

For now let's keep unknown error, since we don't have sharding and slashing it will not cause slashing. I asked Wasmer to prioritize it. Closing this issue since we #2055

@bowenwang1996
Copy link
Collaborator

But even with a single shard, why couldn't this cause state to diverge?

@lexfrl
Copy link
Contributor

lexfrl commented Apr 8, 2020

Great question! I guess, the logic is the following: imagine, there is no guarantee that (due to internal errors) node producer is not following the protocol (it could be an internal error). It means that it will produce non-valid block (since the healthy validators output will be deterministic). The assumption that the non-valid blocks will be ignored by the subsequent block producers (validators).

But I don't know the specific details of how block producers behave in this case, so this is an open question for me too.

@bowenwang1996
Copy link
Collaborator

bowenwang1996 commented Apr 8, 2020

@fckt what you said is true, but it may cause the network to diverge when more than 1/3 block producers have the same post state root while the rest has another. It can cause the network to stall for quite a while if that happens.

Actually if an attacker finds a way to leverage this non-determinism it is possible that they can stall the network indefinitely.

@lexfrl
Copy link
Contributor

lexfrl commented Apr 8, 2020

Right now we can't do anything. Wasm spec defined as deterministic (with exceptions we eliminated). The problem is that we can't separate Internal Wasmer (and internal host) errors from Wasm traps and Host logic errors today..

This issue is only about updating Wasmer. We hoped it will solve the problem. But it's not, due to bug in Wasmer - it doesn't return ExceprionCode's in the relevant cases (#2316 (comment)).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) P-high Priority: High
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants