upgrading the kernel itself: controller.setKernelBundleID() #4375
Comments
Some folks in today's kernel meeting (@michaelfig ? @FUDCo ?) expressed concern about upgrading the kernel without actually restarting the process. I can think of three approaches. In the first one, cosmic-swingset remains running, but it discards and replaces the controller object. The sequence is like:
In the second case, we do the same, but the entire cosmic-swingset process exits after committing the block results. In the third case, both the cosmic-swingset process and the controller remain running. The controller, however, knows how to shut down the old kernel and starts up a new one. It can do this within a single
From the chain's point of view, the In the first two cases, some arbitrary number of kernel cranks (deliveries) are made after the upgrade event, but using the old kernel. This makes the consistency of the kernel state a function of when the host decides to end the block, whereas normally it doesn't depend quite so much on that decision. In the third case, every crank executed after the upgrade command will happen with the new kernel, regardless of when the host
Add upgrade-kernel-bundle API next to Maybe use kernel bundleID so that an explicit hash shows up in the v2 application code. MN-1 to MN-2 transition may not be the first. All upgrades will require replacing the validator code, which may or may not replace the kernel.
After today's kernel meeting, @kriskowal and I figured that we might not need to make any code changes for MN-1, and we've sketched out some small code changes needed for the subsequent version What we need to add in time for version-2 is something like Now the timeline of upgrade will be:
We'd like to confirm with @michaelfig that this plan will work, and we'd like to understand how If so, we can defer this ticket indefinitely, and/or close it entirely. If non-chain environments would use a similar "replace the whole process" approach for upgrade, then they wouldn't benefit from an in-place "live" kernel upgrade either.
IMHO, it looks like it would work just fine.
The
That's right. The governance proposal would vote for a software upgrade at block 1000 to version-2, with human- and possibly machine-readable instructions for how to install the SDK that understands version-2. When block 1000 rolls around, the version-1 chain halts ("I don't know version-2"). It doesn't matter how many times you restart version-1, it just keeps halting. But, if you start version-2 with version-1's chain home directory, that would trigger the
I’ve punted this to MN-1.1. Thanks for talking me through this, @warner.
We decided (and executed, in #5679) to stop persisting the kernel bundle, so we now re-bundle the kernel each time the application launches. This will automatically pick up the current kernel code, removing that portion of the motivation for this API. We don't yet have a story for discrete upgrades of the kernel DB: basically the new kernel code must be prepared to handle data from any previously released kernel. We can introduce new DB keys to indicate the version of specific tables, if that helps. But the thing we're missing (and may or may not need) is some sort of distinct "you've been upgraded!" trigger that causes a schema conversion. Such an event would help us know if/when to rebundle the
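To make the "DB keys that indicate the version" idea concrete, here is a minimal sketch of a schema-version check that runs conversions at startup. Everything in it is an assumption for illustration: the key name `kernelDB.schemaVersion`, the `migrateKernelDB` helper, and the two sample migrations are hypothetical, and a plain `Map` stands in for the kvStore.

```javascript
// Hypothetical key name; the real kernel DB does not necessarily use it.
const SCHEMA_KEY = 'kernelDB.schemaVersion';

// Apply every migration the stored data has not seen yet.
// migrations[n] converts schema version n to version n + 1.
function migrateKernelDB(kvStore, migrations) {
  // A missing key is treated as version 0 (data from the oldest kernel).
  let version = Number(kvStore.get(SCHEMA_KEY) ?? 0);
  const applied = [];
  while (version < migrations.length) {
    migrations[version](kvStore);
    version += 1;
    kvStore.set(SCHEMA_KEY, String(version));
    applied.push(version);
  }
  return applied;
}

// Illustrative migrations only; the keys they touch are made up.
const migrations = [
  kv => kv.set('exampleTable.version', '1'),
  kv => kv.set('exampleStats', '{}'),
];

const db = new Map([['someOldKey', 'someOldValue']]);
const firstRun = migrateKernelDB(db, migrations); // runs both conversions
const secondRun = migrateKernelDB(db, migrations); // nothing left to do
```

In this sketch the "you've been upgraded!" event is simply the moment the stored version is lower than the number of known migrations, so no separate trigger is needed. Whether that is sufficient for the real kernel is an open question in the comment above.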
We've revised our plan for So I'm going to close this in favor of the #6596 plan and having the host app change the version of its dependency upon
What is the Problem Being Solved?
We'll need an inline way to upgrade the kernel itself.
Currently, the kernel source code is bundled once during initializeSwingSet and stored in the kvStore under the kernelBundle key. Each time the kernel is launched, this bundle is given to importBundle to form the "kernel compartment".
I decided to keep this bundle around, rather than re-bundling the kernel source on each application restart, to 1: speed up restart (bundling can take a few seconds), and 2: reduce surprises when you update your source tree without resetting your chain or other application. During debugging sessions where we're replaying recorded chain state under modified kernels, we've needed to overcome this stickiness with tools like packages/SwingSet/misc-tools/rekernelize.js, to re-bundle and overwrite the kvStore entry. As a result, I was considering removing this feature, and have the controller re-bundle the kernel source code each time the application launches.
But, after working on #4372 bundlecaps, I realized that this stickiness is actually a feature, which would play nicely into a mechanism to cleanly upgrade the kernel itself. The idea is that kvStore['kernelBundle'] becomes kvStore['kernelBundleID'], and initializeSwingSet is responsible for bundling and installing the initial version. Later, when the application is told to upgrade the kernel, it needs to:
1. controller.installBundle(newKernelBundle) and get back newKernelBundleID
2. controller.shutdown()
3. controller.setKernelBundleID(newKernelBundleID)
4. controller.start()
setKernelBundleID just checks that the bundleID is valid, and writes it into the kvStore. controller.start() reads the bundleID out of kvStore, loads the bundle itself, then does importBundle() as before.
Of course, it is critical that the new kernel can handle the persistent state in which it wakes up. It must look for kvStore flags that indicate whether particular features have been initialized or not. But the kernel is not obligated to mimic the behavior of some earlier version. The host application is responsible for triggering the upgrade at a consensus-managed moment, between blocks, so the new kernel version only has to be consistent with itself.
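The four-step sequence above can be sketched against a mock controller. This is only an illustration of the call order: `makeMockController`, the toy bundle objects, and the `b-<version>` ID format are hypothetical, the calls are synchronous for simplicity, and the real controller's signatures may differ.

```javascript
// Hypothetical in-memory stand-in for the swingset controller.
// Plain Maps play the role of the kvStore and the bundle store.
function makeMockController(kvStore, bundleStore) {
  let running = false;
  return {
    installBundle(bundle) {
      // Placeholder ID; a real bundleID would be derived from the bundle's hash.
      const bundleID = `b-${bundle.version}`;
      bundleStore.set(bundleID, bundle);
      return bundleID;
    },
    shutdown() {
      running = false;
    },
    setKernelBundleID(bundleID) {
      // Only check validity and record the ID; nothing is loaded here.
      if (running) throw Error('kernel must be shut down first');
      if (!bundleStore.has(bundleID)) throw Error(`unknown bundleID ${bundleID}`);
      kvStore.set('kernelBundleID', bundleID);
    },
    start() {
      // Read the ID from the kvStore and load that bundle.
      // Returning the version stands in for importBundle().
      const bundle = bundleStore.get(kvStore.get('kernelBundleID'));
      running = true;
      return bundle.version;
    },
  };
}

// The upgrade sequence from the numbered steps.
function upgradeKernel(controller, newKernelBundle) {
  const newKernelBundleID = controller.installBundle(newKernelBundle);
  controller.shutdown();
  controller.setKernelBundleID(newKernelBundleID);
  return controller.start();
}

const kvStore = new Map();
const bundleStore = new Map();
const controller = makeMockController(kvStore, bundleStore);

// Stand-in for initializeSwingSet installing the initial kernel.
kvStore.set('kernelBundleID', controller.installBundle({ version: 'v1' }));
const initialVersion = controller.start();

const upgradedVersion = upgradeKernel(controller, { version: 'v2' });
```

Because setKernelBundleID only writes the ID, the new kernel code does not run until start(), which keeps the switch at a single, well-defined point between blocks.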
A separate issue is how e.g. cosmic-swingset should decide when an upgrade is appropriate. One option is to require an application upgrade, and have the new version pay attention to the block height. When the height reaches a pre-decided point, cosmic-swingset can shut down the kernel, call bundleSource() on the usual path packages/SwingSet/src/kernel/kernel.js, install the resulting bundle, then instruct the controller to use the new bundleID. This approach requires all validators to install the new application before the appointed cutover time, which is also what they would do to replace the Go code in cosmic-swingset, or other low-level non-JS code.
An alternate approach would be to use an in-band transaction to trigger the upgrade. Some external client could use signed txns to perform the controller.installBundle() ahead of time, just as they would install contract code. Then maybe a governance vote triggers the execution of some SwingSet-module code that performs the shutdown/setKernelBundleID/start. This would be driven by governance vote, and would not require validators to install any new software. The governing committee should be equivalent to getting all validators to replace their software, however, because the new kernel code gets nearly complete control over the chain. But the execution of the vote might be easier if it can be handled entirely within the governance module.
Description of the Design
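As a rough sketch of the block-height option described above, the host could check the height between blocks and run the upgrade exactly once. The names `makeUpgradeScheduler` and `endBlock` are hypothetical, the cutover height is arbitrary, and the upgrade callback is a stub where the real host would do the shutdown, bundling, and restart.

```javascript
// Hypothetical host-side helper: fires performUpgrade once, at the first
// block boundary whose height is at or past the agreed cutover height.
function makeUpgradeScheduler(upgradeHeight, performUpgrade) {
  let done = false;
  return function endBlock(blockHeight) {
    if (!done && blockHeight >= upgradeHeight) {
      performUpgrade();
      done = true;
      return true; // an upgrade happened at this boundary
    }
    return false;
  };
}

const upgradesSeen = [];
// Stub callback; a real host would shut down the kernel, bundle and install
// the new kernel source, record the new bundleID, and start the kernel again.
const endBlock = makeUpgradeScheduler(1000, () => upgradesSeen.push('upgraded'));

const results = [998, 999, 1000, 1001].map(height => endBlock(height));
```

Every validator evaluates the same condition on the same block height, so they all switch kernels at the same block boundary, which is what keeps the upgrade in consensus.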
Security Considerations
Replacing the kernel code is the most security-critical thing we can imagine, so both the implementation and the code that triggers it must be audited carefully.
Test Plan
unit tests