FVM Actor API Upgrades & Compatibility Requirements #830

Stebalien · 2023-09-14T00:44:53Z

Stebalien
Sep 14, 2023
Collaborator

Once we allow arbitrary "native" Wasm smart contracts (see #779), we'll need to figure out how to handle changes to the actor<->FVM API (i.e., the "syscalls"). What follows is a discussion of actor upgrades and how it relates to code CIDs.

One beautiful feature of Wasm is that it's highly structured. This means that, given two Wasm modules:

old_actor:
- imports fvm1::x
- exports invoke
shim:
- imports fvm2::x
- exports x (calling fvm2::x internally)

It's possible to "link" shim to old_actor as fvm1, creating a new wasm module new_actor that imports fvm2 instead of fvm1. We can even allow these "shims" to run initialization code (e.g., before the invoke is called on the actor), as long as we're very careful.
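The linking step described above can be sketched with a toy model that tracks only import and export names. This is a rough illustration, not how the FVM does it: the `Module` type, `link` function, and `example` helper are all hypothetical, and a real implementation would rewrite the Wasm binary rather than sets of names.

```rust
use std::collections::BTreeSet;

/// Toy description of a Wasm module: just its import and export names.
/// Hypothetical model for illustration; a real linker rewrites the binary.
#[derive(Debug, Clone, PartialEq)]
pub struct Module {
    /// (namespace, function) pairs the module imports.
    pub imports: BTreeSet<(String, String)>,
    /// Names of the functions the module exports.
    pub exports: BTreeSet<String>,
}

impl Module {
    pub fn new(imports: &[(&str, &str)], exports: &[&str]) -> Self {
        Module {
            imports: imports
                .iter()
                .map(|(ns, name)| (ns.to_string(), name.to_string()))
                .collect(),
            exports: exports.iter().map(|name| name.to_string()).collect(),
        }
    }
}

/// Link `shim` into `actor` under the namespace `as_ns`.
///
/// Every import the actor makes from `as_ns` must be satisfied by a shim
/// export. The result keeps the actor's exports and imports whatever the
/// shim imports, plus any actor imports from other namespaces.
pub fn link(actor: &Module, shim: &Module, as_ns: &str) -> Result<Module, String> {
    let mut imports = shim.imports.clone();
    for (ns, name) in &actor.imports {
        if ns == as_ns {
            if !shim.exports.contains(name) {
                return Err(format!("shim does not export {}::{}", as_ns, name));
            }
        } else {
            imports.insert((ns.clone(), name.clone()));
        }
    }
    Ok(Module {
        imports,
        exports: actor.exports.clone(),
    })
}

/// The old_actor/shim pair from the post, linked into new_actor.
pub fn example() -> Module {
    let old_actor = Module::new(&[("fvm1", "x")], &["invoke"]);
    let shim = Module::new(&[("fvm2", "x")], &["x"]);
    link(&old_actor, &shim, "fvm1").expect("shim covers every fvm1 import")
}
```

In this model, `example()` yields a module that still exports `invoke` but imports only `fvm2::x`. Linking fails if the shim cannot cover one of the old imports, which corresponds to the case where no translation can be implemented.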

This provides a very clean upgrade path where we can not only migrate data as we already do today, but also migrate user-deployed actors by "rewriting" them to import a "newer" FVM API (as long as a translation can be implemented).

However, this raises a few questions:

First, after upgrading from, e.g., some fvm1 API to some fvm2 API, should we still allow users to deploy new actors importing fvm1 (upgrading them to fvm2 on-chain), or should we drop support for fvm1 entirely (after upgrading all already-deployed actors)?

The first option provides a smoother user experience, but may encourage users to do stupid things like deploy actors importing fvm1 when we're on fvm12 (these shims will add gas overhead).
The second option means that, after the upgrade, we can "forget" previous versions.

What does this mean for code CIDs?

At a minimum, we'll need to keep some mapping of "deployed code CID" to "actual code CID". But that raises the question: should "deployed code CID" just be a "code ID"? Having "code IDs" instead of "code CIDs" would also reduce the size of actor objects significantly. It may also remove the need for our current "builtin actor manifest" (where we assign IDs to all builtin actors). Finally, it will definitely reduce the cost/time of upgrades (as we wouldn't need to rewrite all actors).
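To make the "code ID" idea concrete, here is a minimal sketch of such a mapping. Everything in it is an assumption for illustration: `CodeRegistry`, `CodeEntry`, and `CodeId` are hypothetical names, CIDs are stood in for by plain strings, and nothing here reflects how the FVM state tree is actually laid out.

```rust
use std::collections::HashMap;

/// Hypothetical compact identifier an actor object would store instead of a CID.
pub type CodeId = u64;

/// What the system remembers about one piece of deployed code.
/// CIDs are plain strings here purely for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct CodeEntry {
    /// CID of the code as the user deployed it; never changes.
    pub deployed_cid: String,
    /// CID of the code that actually executes; may change on upgrade.
    pub actual_cid: String,
}

/// Hypothetical registry mapping code IDs to their entries.
#[derive(Debug, Default)]
pub struct CodeRegistry {
    entries: HashMap<CodeId, CodeEntry>,
    next_id: CodeId,
}

impl CodeRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register newly deployed code; initially it runs exactly as deployed.
    pub fn register(&mut self, deployed_cid: &str) -> CodeId {
        let id = self.next_id;
        self.next_id += 1;
        self.entries.insert(
            id,
            CodeEntry {
                deployed_cid: deployed_cid.to_string(),
                actual_cid: deployed_cid.to_string(),
            },
        );
        id
    }

    /// Point an existing code ID at rewritten code.
    /// Returns false if the ID is unknown.
    pub fn upgrade(&mut self, id: CodeId, new_actual_cid: &str) -> bool {
        match self.entries.get_mut(&id) {
            Some(entry) => {
                entry.actual_cid = new_actual_cid.to_string();
                true
            }
            None => false,
        }
    }

    pub fn get(&self, id: CodeId) -> Option<&CodeEntry> {
        self.entries.get(&id)
    }
}
```

Under this sketch, an upgrade touches one registry entry per piece of code rather than every actor object that uses it, and the deployed CID a user might rely on stays stable while the executing code changes underneath.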

On the other hand, code CIDs are assumed everywhere so this might just be more pain than it's worth. But, we have two conflicting issues:

On network upgrades, we don't want the system to be changing the code CIDs of user-deployed actors as exposed via, e.g., the actor::get_actor_code_cid syscall. That is, we can change some internal code CID, but we don't want to mess with something the user might be relying on.
It would be nice if code CIDs matched the executing code without too much indirection.

anorth · 2023-10-06T01:10:37Z

anorth
Oct 6, 2023
Maintainer

My first thoughts here are that the complexity is not worth it and we shouldn't do it. We shouldn't make any backwards-incompatible changes to syscalls made available to native actors – they are locked in forever. Thus no re-linking or code CID mappings. We'll need to be quite conservative in exporting syscalls. Our path forward will be constrained by prior decisions, but there will still be paths forward. Just like other blockchain VMs.

Yes, it would be kind of nice to be able to upgrade the VM's API. But I don't think we should.

(Related, I think approximately the same thing about APIs to the built-in actors. We need to export more, especially around the miner actor. This will lock in some things, including internal details (e.g. wpost partitions) that we might prefer not to. But we need to move forward. When we want to make big breaking changes, the path forward will be to add a whole new miner2 actor type to embody them while continuing support for miner v1. It will be costly, but worth it to be able to move forward now).

2 replies

Stebalien Oct 20, 2023
Collaborator Author

In my experience, there's no such thing as "never going to change". I agree we should try to avoid it, but I'm also trying to avoid having send2, send3, etc.... if at all possible (e.g., see Linux with open, openat, openat2, etc.; Windows is even worse).

Basically, I don't think it's fair to say "we don't need an upgrade path". We need an upgrade path, the question is: what should that upgrade path be and should we continue to support "old" method signatures for newly deployed actors?

anorth Oct 23, 2023
Maintainer

I hear you, but I think send2 is not a bad solution. The Linux kernel is probably quite a good guide here in terms of strict compatibility requirements, security-sensitive code, the implied value of reduced complexity, having been wildly successful with or despite the decisions they made, etc. The additional complexity of send2 is much more localised than that of API shims, different code CIDs etc.
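As a rough sketch of why the send2 approach stays localised: the old entry point can remain as a thin wrapper over the new one, so only one function changes. The signatures below are hypothetical and are not the real FVM syscalls; the `gas_limit` parameter and the `Receipt` type are invented purely to show the pattern.

```rust
/// Hypothetical result of a send, invented for this sketch.
#[derive(Debug, PartialEq)]
pub struct Receipt {
    pub to: u64,
    pub value: u64,
    pub gas_limit: Option<u64>,
}

/// Hypothetical newer syscall: same as `send`, plus an optional gas limit.
pub fn send2(to: u64, value: u64, gas_limit: Option<u64>) -> Receipt {
    Receipt { to, value, gas_limit }
}

/// Original syscall, kept forever for already-deployed actors.
/// It delegates to `send2` with the old default behaviour (no gas limit).
pub fn send(to: u64, value: u64) -> Receipt {
    send2(to, value, None)
}
```

Old actors keep calling `send` unchanged, new actors can opt in to `send2`, and no relinking or code CID mapping is involved. The cost is that both names stay in the API indefinitely.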
