Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

upgrading liveslots: controller.setLiveslotsBundleID() #4376

Closed
warner opened this issue Jan 25, 2022 · 4 comments
Closed

upgrading liveslots: controller.setLiveslotsBundleID() #4376

warner opened this issue Jan 25, 2022 · 4 comments
Assignees
Labels
enhancement New feature or request SwingSet package: SwingSet vaults_triage DO NOT USE

Comments

@warner
Copy link
Member

warner commented Jan 25, 2022

What is the Problem Being Solved?

As with #4375 to upgrade the kernel, it would be nice to be able to upgrade the version of liveslots that we use in all vats.

Upgrading liveslots, without also changing the vat bundle, could be used to add new features to liveslots, or to fix a bug that the vat has not yet provoked. It can only work if the vat is already prepared for upgrade (by recording its important state in Durable collections, and providing references to them in the #4325 "baggage", and by having a starting point which knows how to reconnect everything).

The idea is:

  • we store the liveslots bundle into the bundle store, just like vat bundles (in implement bundlecaps, bundle device #4372) and the kernel bundle (upgrading the kernel itself: controller.setKernelBundleID() #4375)
  • each vat's kvStore includes a key that says what bundleID we use to get liveslots
  • each vat-worker supervisor is modified to accept this bundle from the kernel, over the kernel-to-worker control pipe, instead of having an import { makeLiveSlots } from '.../liveslots.js' which drags it into the supervisor bundle
    • (come to think of it, it might be interesting to manage the supervisor bundle this way too)
  • the kvStore has an additional key that tells the kernel what liveslotsBundleID to use for new vats
  • we add a controller API to change this bundleID to something new

So once a vat is created, it keeps using the same liveslots, even if some new version is installed in the application. Various vats can use various versions of liveslots, depending upon whatever was current when they were first created.

Later, when you want to upgrade the liveslots in a given vat, you do a vat-upgrade but leave the vat source code untouched (or, you upgrade liveslots at the same time as you upgrade the vat bundle). This requires the kind of upgrade where we delete the heap snapshot and delete the transcript, and start the vat over from a new (/old) bundle and let it recover from durable data.

I suspect that most existing vats won't really benefit from upgrading liveslots, however being able to upgrade the vat code and have it take advantage of new liveslots features would be really valuable. So I'm thinking that vatAdminFacet~.upgrade(newVatBundleID, { liveslotsVersion: 2 }) would be great. (Or, maybe vats automatically get the latest liveslots when their vat bundles are upgraded).

Description of the Design

Security Considerations

We might consider allowing new vats to be created with an arbitrary bundle to use as liveslots. Liveslots enforces ocap security within a vat, so the code within the vat must not be able to tamper with the liveslots that supports it (else it could violate ocap rules).

However, the caller who asks for a vat to be created is already effectively in control of the whole vat. If they don't care for ocap safety within their new vat, that's their perogative, and it doesn't violate the security of the rest of the system. So it probably isn't technically wrong to allow upgrade() or createVat() to let the caller control what is used as liveslots. But anyone reading the vat source code (or any contract code later loaded into that vat) would need to know whether they can expect ocap rules or not, so it might be safer overall to restrict the options to well-known versions of liveslots.

Test Plan

unit tests

@warner
Copy link
Member Author

warner commented Mar 16, 2022

Don't give upgraders the option to use old liveslots, unless necessary for vat-code compatibility.

Think about versions on the capdata, especially w.r.t. auxdata.

@Tartuffo Tartuffo added this to the Mainnet 1 milestone Mar 23, 2022
warner added a commit that referenced this issue Jun 30, 2022
Previously, the kernel sources were bundled during
`initializeSwingset()`, and stored in the kvStore portion of the
swingstore kernelDB. It was read from that DB each time
makeSwingsetController() launched the application process.

That ensured the kernel behavior would remain consistent (and
deterministic) even if the user's source tree changed. Using a
persistent kernel bundle also speeds up application launch: we save
about 1.8s by not re-running `bundleSource` on each launch.

Once upon a time I thought this consistency was important/useful, but
now I see that it causes more problems than it's worth. In particular,
upgrading a chain requires an awkward "rekernelize" action to replace
the kernel bundle in the database. And we believe we can reclaim about
1.0s per launch by moving to a not-yet-written `importLocation()` Endo
tool that loads module graphs from disk without serializing the
archive in the middle.

This patch changes swingset to stop using a persistent kernel
bundle. Each application launch (i.e. `makeSwingsetController()` call)
re-bundles the kernel sources just before importing that bundle, so
that updated source trees are automatically picked up without
additional API calls or rekernelize steps. The bundle is no longer
stored in the kvStore.

The other bundles (vats, devices, xsnap helpers) are still created
during `initializeSwingset` and stored in the DB, since these become
part of the deterministic history of each vat (#4376 and #5703 will
inform changes to how those bundles are managed). Only the kernel
bundle has become non-persistent.

Bundling is slow enough that many unit tests pre-bundle both the
kernel and the sources of built-in vats, devices, and xsnap helper
bundles. This provides a considerable speedup for tests that build (or
launch) multiple kernels within a single test file. These
`kernelBundles` are passed into `initializationOptions`, which then
bypassed the swingset-bundles-it-for-you code. This saves a minute or
two when running the full swingset test suite.

Unit tests can still use `initializationOptions` to amortize bundling
costs for the non-kernel bundles. `runtimeOptions` now accepts a
`kernelBundle` member to amortize bundling the kernel itself. Tests
which use this trick and also use `makeSwingsetController()` need to
be updated (else they'll run 1.8s slower). All such tests were inside
`packages/SwingSet/test/` and have been updated, which would otherwise
cause the test suite to run about 50-60s slower.

The other agoric-sdk packages that do swingset-style tests are mostly
using the deprecated `buildVatController()`, whose options continue to
accept both kinds of bundles, and do not need changes.

closes #5679
warner added a commit that referenced this issue Jun 30, 2022
Previously, the kernel sources were bundled during
`initializeSwingset()`, and stored in the kvStore portion of the
swingstore kernelDB. It was read from that DB each time
makeSwingsetController() launched the application process.

That ensured the kernel behavior would remain consistent (and
deterministic) even if the user's source tree changed. Using a
persistent kernel bundle also speeds up application launch: we save
about 1.8s by not re-running `bundleSource` on each launch.

Once upon a time I thought this consistency was important/useful, but
now I see that it causes more problems than it's worth. In particular,
upgrading a chain requires an awkward "rekernelize" action to replace
the kernel bundle in the database. And we believe we can reclaim about
1.0s per launch by moving to a not-yet-written `importLocation()` Endo
tool that loads module graphs from disk without serializing the
archive in the middle.

This patch changes swingset to stop using a persistent kernel
bundle. Each application launch (i.e. `makeSwingsetController()` call)
re-bundles the kernel sources just before importing that bundle, so
that updated source trees are automatically picked up without
additional API calls or rekernelize steps. The bundle is no longer
stored in the kvStore.

The other bundles (vats, devices, xsnap helpers) are still created
during `initializeSwingset` and stored in the DB, since these become
part of the deterministic history of each vat (#4376 and #5703 will
inform changes to how those bundles are managed). Only the kernel
bundle has become non-persistent.

Bundling is slow enough that many unit tests pre-bundle both the
kernel and the sources of built-in vats, devices, and xsnap helper
bundles. This provides a considerable speedup for tests that build (or
launch) multiple kernels within a single test file. These
`kernelBundles` are passed into `initializationOptions`, which then
bypassed the swingset-bundles-it-for-you code. This saves a minute or
two when running the full swingset test suite.

Unit tests can still use `initializationOptions` to amortize bundling
costs for the non-kernel bundles. `runtimeOptions` now accepts a
`kernelBundle` member to amortize bundling the kernel itself. Tests
which use this trick and also use `makeSwingsetController()` need to
be updated (else they'll run 1.8s slower). All such tests were inside
`packages/SwingSet/test/` and have been updated, which would otherwise
cause the test suite to run about 50-60s slower.

The other agoric-sdk packages that do swingset-style tests are mostly
using the deprecated `buildVatController()`, whose options continue to
accept both kinds of bundles, and do not need changes.

closes #5679
warner added a commit that referenced this issue Jul 1, 2022
Previously, the kernel sources were bundled during
`initializeSwingset()`, and stored in the kvStore portion of the
swingstore kernelDB. It was read from that DB each time
makeSwingsetController() launched the application process.

That ensured the kernel behavior would remain consistent (and
deterministic) even if the user's source tree changed. Using a
persistent kernel bundle also speeds up application launch: we save
about 1.8s by not re-running `bundleSource` on each launch.

Once upon a time I thought this consistency was important/useful, but
now I see that it causes more problems than it's worth. In particular,
upgrading a chain requires an awkward "rekernelize" action to replace
the kernel bundle in the database. And we believe we can reclaim about
1.0s per launch by moving to a not-yet-written `importLocation()` Endo
tool that loads module graphs from disk without serializing the
archive in the middle.

This patch changes swingset to stop using a persistent kernel
bundle. Each application launch (i.e. `makeSwingsetController()` call)
re-bundles the kernel sources just before importing that bundle, so
that updated source trees are automatically picked up without
additional API calls or rekernelize steps. The bundle is no longer
stored in the kvStore.

The other bundles (vats, devices, xsnap helpers) are still created
during `initializeSwingset` and stored in the DB, since these become
part of the deterministic history of each vat (#4376 and #5703 will
inform changes to how those bundles are managed). Only the kernel
bundle has become non-persistent.

Bundling is slow enough that many unit tests pre-bundle both the
kernel and the sources of built-in vats, devices, and xsnap helper
bundles. This provides a considerable speedup for tests that build (or
launch) multiple kernels within a single test file. These
`kernelBundles` are passed into `initializationOptions`, which then
bypassed the swingset-bundles-it-for-you code. This saves a minute or
two when running the full swingset test suite.

Unit tests can still use `initializationOptions` to amortize bundling
costs for the non-kernel bundles. `runtimeOptions` now accepts a
`kernelBundle` member to amortize bundling the kernel itself. Tests
which use this trick and also use `makeSwingsetController()` need to
be updated (else they'll run 1.8s slower). All such tests were inside
`packages/SwingSet/test/` and have been updated, which would otherwise
cause the test suite to run about 50-60s slower.

The other agoric-sdk packages that do swingset-style tests are mostly
using the deprecated `buildVatController()`, whose options continue to
accept both kinds of bundles, and do not need changes.

closes #5679
@warner
Copy link
Member Author

warner commented Jul 9, 2022

For MN-1, I want to have most of this in place, but I'm ok with not having a controller API to set change the version used for new vats. That interacts with how we upgrade the kernel/chain as a whole (do we want a reinitializeSwingset() function that the host app calls, at some consensus-defined time, during reboot, just before launching the kernel?), and I don't know what that should look like yet, and I think we can afford to figure it out later (after MN-1). But having the worker evaluate the liveslots bundle as a separate step feels like it would line up the pieces better.

@warner
Copy link
Member Author

warner commented Jul 12, 2022

I measured some bundle sizes to get a sense of how much extra space we'll consume by separating liveslots from the supervisor bundle.

In these tables, the size value is the length of the string that contains the source code: .source for a getExport -format bundle, and .endoZipBase64 for a endoZipBase64 bundle.

On trunk (as of a few days ago):

vat/device format size
adminDevice endoZipBase64 226008
adminVat endoZipBase64 260260
comms endoZipBase64 373868
vattp endoZipBase64 189440
timer endoZipBase64 253472
-- -- --
lockdown getExport 291834
supervisor getExport 509064

with the split:

vat/device format size
lockdown getExport 291834
supervisor getExport 213986
liveslots endoZipBase64 679744

It's surprising that the split liveslots is larger than the unsplit supervisor, because the current supervisor bundle should include all of:

  • A: the supervisor code
  • B: the liveslots code
  • C: their shared dependencies
  • D: dependencies that are exclusive to the supervisor
  • E: dependencies that are exclusive to liveslots

whereas in the split bundles, supervisor should contain A+C+D, while liveslots should contain B+C+E, and there should be no way for B+C+E to be larger than A+B+C+D+E.

I think this is explained by two factors. The base64-ness of the endoZipBase64 format expands the source code by a factor of ln2(256)/ln2(64) = 8/6 = 33%. In addition, Endo's archiver performs a more literal walk of what needs to be included, and probably does less tree-shaking than what rollup does for the getExport format.

@warner
Copy link
Member Author

warner commented Jan 24, 2023

This will be folded into the #6596 "worker-v1" effort, because we need stability from both XS, and all of the JS code that runs on top of it (and gets recorded in the heap snapshot): SES, supervisor bundle, liveslots, user code. The kvStore ${vatID}.options will include a managerType that names worker-v1, and the requirement will be for any version of @agoric/swingset-worker-xsnap-v1 to accomodate those heap snapshots.

@warner warner closed this as completed Jan 24, 2023
@warner warner reopened this Jan 24, 2023
@warner warner closed this as not planned Won't fix, can't repro, duplicate, stale Jan 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request SwingSet package: SwingSet vaults_triage DO NOT USE
Projects
None yet
Development

No branches or pull requests

3 participants