Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible race condition due to async IO in VM constructor #650

Closed
davidmurdoch opened this issue Jan 14, 2020 · 9 comments
Closed

Possible race condition due to async IO in VM constructor #650

davidmurdoch opened this issue Jan 14, 2020 · 9 comments

Comments

@davidmurdoch
Copy link
Contributor

https://github.com/ethereumjs/ethereumjs-vm/blob/feef85f7d51c4e09da3105ebe473e86dcff45eb1/lib/index.ts#L121

This line begins writing data to the trie, which is an async operation that may perform file IO depending on the Trie's configuration. There doesn't seem to be a way of detecting when these writes operations are complete.

I only found this issue while debugging an issue in ganache that was causing a TypeError: Cannot read property 'pop' of undefined related to not waiting for this initialization to complete. Ganache currently uses a leveldb implementation for its trie storage which has a bug that permits db reads on values that were in the process of being written. If callers could wait for the VM to be fully initialized it would prevent this race condition from occuring*.

I'll pull the precompile initialization logic into ganache-core for the time being, so a fix isn't urgent.

*aside: i'm actually very happy this race condition exists in the VM... as the ganache read/write bug would have been even harder to track down without this reliable way of triggering it! 😃.

@s1na
Copy link
Contributor

s1na commented Jan 15, 2020

I generally have a queasy feeling about the precompile initialization...thanks for reporting this.

I see three options:

  • Async constructor
  • Move this and potentially other async initialization logic into an async init() method and require users to invoke it
  • Same as above, but lazily call init() in runTx, runBlockchain, etc. since they're already async we won't be requiring any change from users. And I don't think there's any VM sync method which requires this precompile initialization logic. This has the downside though that if someone initializes the VM and before calling any of the methods, queries the trie for precompile contracts they won't be there, but I'd say that's highly unlikely.

Maybe there are other alternatives I'm missing, but between these 3 I'd go with the last one

@davidmurdoch
Copy link
Contributor Author

Some other possible solutions:

  1. instead of init, it could be named activatePrecompiles. In either case, logging a deprecation notice whenever opts.activatePrecompiles === true in constructor, something like "opts.activatePrecompiles is deprecated and will be removed in future versions. To remove this message, migrate to await vm.init();". And then in the next breaking version log an obsolete notice like "opts.activatePrecompiles is obsolete and has been removed. You must migrate to await vm.init();." (this may just need to be an Error that is thrown instead of just a console warning). Deprecation/obsoletion messages like this would help tremendously when upgrading, especially for subtle changes like this.
  2. The vm could emit an event when it has finished initializing:
const vm = new VM();
if (!vm.initialized) {
  vm.on("initialized", vmReady);
} else {
  vmReady();
}

Actually, now that I've typed that out... I'm not a huge fan of this second approach, haha.

As far as the async constructor idea goes... how would this be done?

And for the lazy init call, like you, I feel like this could cause problems for applications that use the trie outside of the VM itself. This actually could cause problems for Ganache (in the future), depending on how it is implemented. Ganache has a mode where we persist the trie data to disk, and while we currently "reinitialize" precompiles every time we initialize a VM I've been planning on fixing this so we only init the precompiles on initial database creation. It could complicate the internal VM logic if runTx/runBlockchain/etc had to first look up if they've been initialized or not.

@s1na
Copy link
Contributor

s1na commented Jan 17, 2020

Hm, so far it's a choice between bad and worse for me... @alcuadrado what do you think? do you see any alternatives, or which of these would you say makes more sense?

@alcuadrado
Copy link
Member

Good catch @davidmurdoch! I think we could/should activate tslint's no-floating-promise to detect this kind of bugs.

None of the solutions is ideal, but I think I like David's (1) the most. Precompiles activations is already one of the weirdest and most confusing things of the vm. Moving it to it's own clearly-named method would be a good thing, regardless of this bug. I also think the deprecation & removal plan also makes sense.

@holgerd77
Copy link
Member

Did someone has some evolved thoughts on this?

@rumkin
Copy link
Contributor

rumkin commented Feb 18, 2020

Well, let me add my thoughts on that.

I think all the constructors' async operations should be moved into init() method as @s1na proposed, (it's a very common practice to use Class.create().then() as a synonym to new Class().init().then()). This method should emit init event when initialization is completed and an error if it's not, according to @davidmurdoch. And @alcuadrado is right about floating promises deprecation, this rule is lifesaving. What's about precompiles, I think there shouldn't be such option as activatePrecompiles. Blockchains have long lives which involve hardforks. Hardforks bring new precompiled functions and new opcodes prices. I think it would be proper to implement scenarios to make possible to initialize new precompile and change opcode prices after the certain block is mined. It would allow to repeat complete Ethereum mainnet history without stopping the VM. Also it will make VM configuration and work more predictable, controllable and stable.

@rumkin
Copy link
Contributor

rumkin commented Feb 22, 2020

Well, after a reading the project issues I found #652 (Hardforks backporting) which describes questions about new opcodes/prices and other problems and raises even more important questions which couldn't be done with scenarios in the way I described them. And such solution is a huge piece of work and researches. So I'll move discussion to that issue.

What's about precompiles initialization. As I said It's unnecessary and should be removed. Because currently it has one goal is to prevent from spending extra gas on address creation. But, as I see in Go Ethereum implementation the VM intercepts all calls to precompiled contracts, and if precompiled contract isn't exists yet because Hardfork is not activated, then new account should be created.

Go Ethereum CALL opcode source:
https://github.com/ethereum/go-ethereum/blob/master/core/vm/instructions.go#L792

Go EVM CallCode method source:
https://github.com/ethereum/go-ethereum/blob/6ffee2afd66892da1145f47e210210e6bf55c74a/core/vm/evm.go#L207

But in current JS implementation non-existent address would be created anyway:
https://github.com/ethereumjs/ethereumjs-vm/blob/feef85f7d51c4e09da3105ebe473e86dcff45eb1/lib/evm/opFns.ts#L703

I think this is where this behavior should be changed. And activatePrecompiles could be replaced with precompiledContracts which is a dictionary where a key is an address and a value is a precompiled executable function. Also it allows to create custom precompiles for different purposes (test, research, EIP reproduction, etc.) and fixes race condition issue.

@rumkin
Copy link
Contributor

rumkin commented Feb 22, 2020

Oops. Looks like I made wrong conclusion about Go implementation. Sorry for that. The good news I've fixed race condition in PR #660 and added customizable precompiles as VM option in the most backward compatible way as it was possible. So now developers can customize this set on their demand. VM option activatePrecompiles is returned back.

@holgerd77
Copy link
Member

Solved by #665, to be released in the v5 release (see #681), will close.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants