SwingSet must save static vat sources in kernel statedir #1681

michaelfig · 2020-09-03T06:10:02Z

My current plan is to use https://github.com/cosmos/cosmos-sdk/tree/master/cosmovisor to manage transparent chain upgrades. To do so requires the kernel to understand prior versions' statedir (including kernel DB) and save static vat sources in the statedir so that they can be replayed.

Originally posted by @michaelfig in #1479 (comment)

This is, AFAICT, the only SwingSet gap preventing deterministic upgrades of a given statedir from an older ag-chain-cosmos install to a newer one. The rest of the chain upgrade process is work to be done in cosmic-swingset.

The reason is that only the statedir (with its kernel DB) and the kernel/cosmic-swingset code (no vat code) are required to remain compatible across a restart. This already works for dynamic vats: the kernel records them in the kernel DB (currently in transcripts, later in a blob store). Static vat source code needs to be preserved in the same way to guarantee determinism regardless of the vat contents.

warner · 2020-09-09T01:41:42Z

@FUDCo pointed out that we might want to save multiple (sequential) versions of those sources, to support upgrades (#1691).

We might save on space by not recording a full transcript, just the inputs (deliveries) and the results of syscalls (device reads), and then merely hash all the outputs to check for consistency. To really save space we could retain a single accumulated hash of all outputs, rather than one per crank. But if we then replay with an upgraded version (which is supposed to behave identically up until the cutover point), and it fails (hash mismatch), we'd want to know exactly where they diverged (to debug the problem). So at that point, we could run the original version in parallel with the upgraded version, one delivery at a time, comparing the results before moving on to the next. This would require minimal transcript storage during the comparison (only one delivery's worth).
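The hash-then-compare idea above can be sketched roughly as follows. This is an illustrative Python sketch, not SwingSet code: the vats are modelled as pure functions from a delivery to an output (real vats are stateful and also consume syscall results), SHA-256 is an assumed choice of hash, and every name here (`accumulate_hash`, `replay`, `find_divergence`, `vat_v1`, `vat_v2`) is hypothetical.

```python
import hashlib


def accumulate_hash(prev_hex, output):
    """Fold one crank's output into a single running hash (hypothetical scheme)."""
    h = hashlib.sha256()
    h.update(prev_hex.encode("ascii"))
    h.update(output.encode("utf-8"))
    return h.hexdigest()


def replay(vat, deliveries):
    """Replay all deliveries through `vat`, keeping only the accumulated hash."""
    acc = "0" * 64  # fixed-width starting value for the hash chain
    for delivery in deliveries:
        acc = accumulate_hash(acc, vat(delivery))
    return acc


def find_divergence(old_vat, new_vat, deliveries):
    """Run both versions in lockstep, one delivery at a time.

    Returns the index of the first delivery whose outputs differ, or None if
    the two versions agree throughout. Only one delivery's outputs are held
    in memory at any moment.
    """
    for index, delivery in enumerate(deliveries):
        if old_vat(delivery) != new_vat(delivery):
            return index
    return None


# Stand-in "vats" for illustration only.
def vat_v1(delivery):
    return delivery.upper()


def vat_v2(delivery):
    # Deliberately diverges on one input to exercise the comparison.
    return "oops" if delivery == "c" else delivery.upper()


deliveries = ["a", "b", "c", "d"]
recorded = replay(vat_v1, deliveries)
hashes_match = replay(vat_v2, deliveries) == recorded
divergence = None if hashes_match else find_divergence(vat_v1, vat_v2, deliveries)
```

The single accumulated hash only says that the replay diverged somewhere; the lockstep pass is what pins down the delivery at which it happened.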

We also discussed recording the block range over which the versions are supposed to be active. If one version can be told to write its entire state out to some schematized DB, and the next version can load its entire state from that DB, then an upgrade can be a lot cheaper. When replaying the entire history (for whatever reason), the process wouldn't have to run all versions in parallel.
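One way to picture the recorded block ranges is a sorted table that maps a starting block to the version active from that point on. The sketch below assumes that layout; the version labels, block heights, and the `active_version` helper are invented for illustration and do not come from SwingSet.

```python
import bisect

# Hypothetical record: (first_block, version_label), sorted by first_block.
# Each version is active from its first_block until the next entry begins.
VERSION_RANGES = [
    (0, "v1"),
    (1000, "v2"),
    (2500, "v3"),
]


def active_version(block_height, ranges=VERSION_RANGES):
    """Return the version label that should be active at `block_height`."""
    starts = [first for first, _ in ranges]
    # bisect_right places the insertion point after any equal entry, so a
    # block exactly at a cutover height belongs to the newer version.
    position = bisect.bisect_right(starts, block_height) - 1
    if position < 0:
        raise ValueError(f"no version recorded for block {block_height}")
    return ranges[position][1]
```

With a table like this, a full-history replay could pick exactly one version per block instead of running every version side by side.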

warner · 2020-09-10T20:00:04Z

We're slowly moving swingset from a one-call invocation pattern (which creates a state DB if missing, and then does different things depending upon whether it needed to initialize that DB or not), to a two-call pattern (one explicit "initialize" call which creates+initializes a state DB, but doesn't run anything, and a second "run" call which requires the DB be present and initialized first).

The initialize call will accept the config object and the static vat specifications. It will bundle those sources and write them into the DB, almost exactly like dynamic vat sources are stored (for now). It also uses the names of the static vats to construct and enqueue the bootstrap message. Then it stops.

The "run" call does not accept a config object. It will reconstruct the static vats from their DB records, rather than the config object.
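A minimal sketch of the two-call pattern described above, under stated assumptions: a plain dict stands in for the kernel DB, vat "sources" are opaque strings rather than real bundles, and the key names (`vat.source.*`, `vat.names`, `runQueue`, `initialized`) and the bootstrap message shape are all hypothetical. Persisting the config itself is also an assumption; the point is only that `run` reads nothing but the DB.

```python
import json


def initialize(db, config, static_vat_sources):
    """Create and populate the state DB, enqueue the bootstrap message, then stop."""
    if db.get("initialized") == "true":
        raise RuntimeError("state DB is already initialized")
    names = sorted(static_vat_sources)
    for name in names:
        # Static vat sources are written into the DB so run() never needs
        # to see the original config or source files.
        db[f"vat.source.{name}"] = static_vat_sources[name]
    db["vat.names"] = json.dumps(names)
    db["config"] = json.dumps(config)  # assumption: config is persisted too
    # The bootstrap message is built from the static vat names (hypothetical shape).
    db["runQueue"] = json.dumps([{"type": "bootstrap", "vats": names}])
    db["initialized"] = "true"


def run(db):
    """Rebuild static vats from DB records only; accepts no config object."""
    if db.get("initialized") != "true":
        raise RuntimeError("state DB must be initialized before run()")
    names = json.loads(db["vat.names"])
    vats = {name: db[f"vat.source.{name}"] for name in names}
    queue = json.loads(db["runQueue"])
    return vats, queue


db = {}  # a plain dict standing in for the kernel DB
initialize(db, {"bootstrap": "bootstrap"}, {"bootstrap": "src-b", "bank": "src-a"})
vats, queue = run(db)
```

Because `run` takes only the DB, nothing can leak from the init phase to the run phase through memory; anything `run` needs has to have been written down by `initialize`.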

FUDCo · 2020-09-10T20:10:30Z

That two-call pattern is essentially how it works now, and has for a long time. It's just that the initial conditions are currently not completely persisted, so the resulting behavior depends on state that is carried over in memory between the two calls rather than via the state DB. Really what we're talking about is erecting a proper barrier between the init phase and the run phase. Your concept of the two phases as separate executables makes that barrier explicit, though the developer ergonomics of "call this, then call this" is probably worth preserving rather than requiring two program executions.

warner · 2020-09-10T21:02:05Z

Yeah, we need to start from the cosmic-swingset side and look at how buildVatController is called to find a good site to split it out.

I think devices must remain as arguments of run() rather than initialize(), which might be a bit weird.

FUDCo · 2020-09-24T23:15:45Z

Closed by #1814

michaelfig added the SwingSet package: SwingSet label Sep 3, 2020

warner self-assigned this Sep 10, 2020

warner assigned FUDCo Sep 15, 2020

FUDCo mentioned this issue Sep 22, 2020

Overhaul swingset startup #1814

Merged

FUDCo closed this as completed Sep 24, 2020
