Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make bundleStore accept nestedEvaluate bundles too #7190

Closed
warner opened this issue Mar 20, 2023 · 3 comments · Fixed by #7242
Closed

make bundleStore accept nestedEvaluate bundles too #7190

warner opened this issue Mar 20, 2023 · 3 comments · Fixed by #7242
Assignees
Labels
enhancement New feature or request SwingSet package: SwingSet
Milestone

Comments

@warner
Copy link
Member

warner commented Mar 20, 2023

What is the Problem Being Solved?

To support our new approach in #6596 (comment) , we need to store the xsnap-lockdown and swingset-xsnap-supervisor bundles persistently, so a vat created today will maintain the same behavior tomorrow (after those packages have been updated or changed in some way).

We're in the process of moving vat/contract bundles out of the kvStore and into a new dedicated bundleStore (#7089), which saves space in the IAVL tree (export data only holds bundle ID hashes, not the whole bundles, which are delivered in export artifacts instead). So we'd like to avoid putting these lockdown/supervisor bundles back into the kvStore where they used to live. The bundleStore feels like the right place for them.

However, these two bundles are special: they use moduleFormat: 'nestedEvaluate' instead of endoZipBase64. They cannot use the more advanced format because such bundles need a specialized loader, and the initial xsnap environment does not contain one (it is the supervisor bundle which installs this loader, so we'd have a bootstrapping / chicken-vs-egg problem). The nestedEvaluate format can be loaded with a simple IIFE wrapper.

To live in the bundleStore, and more specifically to be safely importable from a swing-store export artifact (i.e. state-sync snapshot), we need a safe integrity hash of these bundles to be included in the export data. The endoZipBase64 format defines an integrity hash which is the SHA512 hash of the compartment map, which works because 1: the compartment map defines hashes for all the other components of the zip file, and 2: the checkBundle/importBundle code enforces those hashes, so that one hash is sufficient to define the behavior completely (and is compatible with a future scheme where we build bundles at runtime out of shared components, while a single flat hash would not). The swingset-centric BundleID is then a version prefix (the literal string b1-) followed by the lowercase hex encoding of that SHA512 hash.

So we need to define an integrity hash "bundle ID" for the nestedEvaluate -format bundles. I'm thinking that we (somewhat retroactively) define b0-${SHA256(JSON.stringify(bundle))} as its Bundle ID. Compared to b1- -format bundles:

  • it is bigger: we're JSON stringifying a code string which has lots of double quotes and backslashes already, so there's some small expansion as a result
  • it overspecifies the bundle somewhat: the hash is sensitive to the ordering of the properties, so two bundles whose behavior would be identical but were built with { moduleFormat, source } instead of { source, moduleFormat } would not appear to be the same (not a big deal)
  • b0- can actually hold endoZipBase64 format bundles too, it is a superset, but we wouldn't use it for them, we'd only use it for these early-stage IIFE+eval -loadable bundles that are needed to bootstrap a worker environment

Description of the Design

  • add another table for b0- format bundles (probably? seems better than poly-schema-ing the existing table to handle both)
    • or, if we store compressed bundles (why not?) we might just need to add a format or version column
  • define the b0- format as above
  • change bundleStore.addBundle to switch on the prefix of the supplied bundle ID
    • if b1-, proceed as before
    • if b0-:
      • require that moduleFormat is nestedEvaluate
      • stringify the bundle object, compute its hash, compare against alleged bundleID
      • store the stringified form (possibly compressed) in the DB
  • change getBundle to extract, maybe decompress, JSON.parse the b0- format bundles
  • announce the new additions in export data (the exporter callback), add to one-shot export
  • add artifacts for the new bundles

Then we need to add kernelKeeper changes to talk to the bundleStore for these, something which knows the bundle names ("xsnap-lockdown" and "xsnap-supervisor").

The work to store these bundles at initializeSwingset time, and to use them when launching a worker, will be covered by #6596, not this ticket.

Security Considerations

The hash must be tight enough to prevent a "replace the bundle with a malicious variant" attack during the swingstore export/import process. The import code must assert that the bundle artifact matches the export-data hash.

Scaling Considerations

We'll record only a small number of these bundles, one per version of our runtime, so size isn't a big concern. Compressing the bundles in the DB might be nice, but not as important as compressing the vat/contract bundles (of which there will be a lot more).

We'll still have a yarn build step, so unit tests / etc won't be re-running bundleSource (adding a second or two to each test file). Unit tests will be doing a lot of loading that pre-generated bundle from disk, and storing it in the DB, and loading it back from the DB, but that's probably very fast compared to the bundleSource call (which rummages through all of node_modules to do its job).

Test Plan

Unit tests in swing-store, plus a few in SwingSet/test/test-state.js to exercise the kernelKeeper integration.

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Mar 20, 2023
@warner
Copy link
Member Author

warner commented Mar 20, 2023

Talking with @FUDCo just now:

  • compressing the serialized bundles sounds pretty straightforward, and a nice (although small) storage-space win
  • it's fine to rely on the bundleID prefix (b0- vs b1-) to determine the bundle-deserialization and validation code paths, no need for an extra DB column (which would be partially redundant with the bundleID one anyways: fewer columns means fewer opportunities for disagreement)
  • b0- and b1- obviously implies a b2- some day (which was always the intention). We'll invent b2- if/when we have bundles whose behavior/contents cannot be validated by the rules we've established for b0-/b1-. The b0- format can validate anything object-shaped, but we created b1- specifically to allow the bundleStore to synthesize bundles out of individual modules, and still maintain their integrity checks, which b0- doesn't enable. So a b2- would be motivated by some other interesting validation need, which hasn't happened yet. The version prefix gives us room for the future.

@warner warner added this to the Vaults EVP milestone Mar 21, 2023
@warner warner self-assigned this Mar 21, 2023
@mhofman
Copy link
Member

mhofman commented Mar 21, 2023

We'll also need to reference these bundle ids in any vat start/upgrade synthetic transcript entry as described in #7199

@warner
Copy link
Member Author

warner commented Mar 21, 2023

BTW by using b0- I'm retroactively starting the bundle-ID-version-numberspace from 0, not 1, but I think it fits in well. b0- can be (and is only) used for our original bundle type(s), namely nestedEvaluate. b1- appeared "later", and can (only) be used for our newer bundle type, endoZipBase64.

Glad I was in a 1-indexed mood when I decided to use b1- for the endo modules..

warner added a commit that referenced this issue Mar 25, 2023
This enhances the bundleStore to accomodate NestedEvaluate -format
bundles, with a BundleID that is a hash of the JSON-serialized bundle
object, using a `b0-` prefix.

This adds to the previous facility which handled only EndoZipBase64
-format bundles, using a `b1-` prefixed BundleID that hashes only the
compartment-map.json component of the zipfile.

It also adds a number of tests.

closes #7190
warner added a commit that referenced this issue Mar 25, 2023
This enhances the bundleStore to accomodate NestedEvaluate -format
bundles, with a BundleID that is a hash of the JSON-serialized bundle
object, using a `b0-` prefix.

This adds to the previous facility which handled only EndoZipBase64
-format bundles, using a `b1-` prefixed BundleID that hashes only the
compartment-map.json component of the zipfile.

It also adds a number of tests.

closes #7190
warner added a commit that referenced this issue Mar 29, 2023
This enhances the bundleStore to accomodate NestedEvaluate -format
bundles, with a BundleID that is a hash of the JSON-serialized bundle
object, using a `b0-` prefix.

This adds to the previous facility which handled only EndoZipBase64
-format bundles, using a `b1-` prefixed BundleID that hashes only the
compartment-map.json component of the zipfile.

It also adds a number of tests.

closes #7190
warner added a commit that referenced this issue Mar 29, 2023
This enhances the bundleStore to accomodate NestedEvaluate -format
bundles, with a BundleID that is a hash of the JSON-serialized bundle
object, using a `b0-` prefix.

This adds to the previous facility which handled only EndoZipBase64
-format bundles, using a `b1-` prefixed BundleID that hashes only the
compartment-map.json component of the zipfile.

It also adds a number of tests.

closes #7190
@mergify mergify bot closed this as completed in #7242 Mar 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request SwingSet package: SwingSet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants