SwingSet must save static vat sources in kernel statedir #1681

Closed
michaelfig opened this issue Sep 3, 2020 · 5 comments

@michaelfig
Member

michaelfig commented Sep 3, 2020

My current plan is to use https://github.com/cosmos/cosmos-sdk/tree/master/cosmovisor to manage transparent chain upgrades. To do so requires the kernel to understand prior versions' statedir (including kernel DB) and save static vat sources in the statedir so that they can be replayed.

Originally posted by @michaelfig in #1479 (comment)

This is, AFAICT, the only SwingSet gap preventing deterministic upgrades of a given statedir from an older ag-chain-cosmos install to a newer one. The rest of the chain upgrade process is work to be done in cosmic-swingset.

The reason is that only the statedir (with kernel DB) and the kernel/cosmic-swingset code (no vat code) are required to remain compatible across a restart. This already works for dynamic vats: the kernel records them in the kernel DB, currently in transcripts, but later in a blob store. Static vat source code must be preserved in the same way to guarantee determinism regardless of the vat contents.

@michaelfig michaelfig added the SwingSet package: SwingSet label Sep 3, 2020
@warner
Member

warner commented Sep 9, 2020

@FUDCo pointed out that we might want to save multiple (sequential) versions of those sources, to support upgrades (#1691).

We might save on space by not recording a full transcript, just the inputs (deliveries) and the results of syscalls (device reads), and then merely hash all the outputs to check for consistency. To really save space we could retain a single accumulated hash of all outputs, rather than one per crank. But if we then replay with an upgraded version (which is supposed to behave identically up until the cutover point), and it fails (hash mismatch), we'd want to know exactly where they diverged (to debug the problem). So at that point, we could run the original version in parallel with the upgraded version, one delivery at a time, comparing the results before moving on to the next. This would require minimal transcript storage during the comparison (only one delivery's worth).
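To make the trade-off concrete, here is a minimal sketch of the two ideas above: a single accumulated hash over all outputs, and a lockstep comparison that finds the first diverging delivery. Everything in it is an assumption for illustration: vats are modeled as plain stateful functions from a delivery to its outputs, and `hashOutput`, `accumulatedHash`, `findDivergence`, `makeV1`, and `makeV2` are hypothetical names, not SwingSet APIs.

```javascript
const { createHash } = require('node:crypto');

// Hash one delivery's outputs (assumed to be JSON-serializable).
function hashOutput(output) {
  return createHash('sha256').update(JSON.stringify(output)).digest('hex');
}

// Fold every output into one running hash. Cheap to store, but it only says
// whether two replays diverged, not where.
function accumulatedHash(vat, deliveries) {
  let acc = '';
  for (const delivery of deliveries) {
    acc = createHash('sha256')
      .update(acc + hashOutput(vat(delivery)))
      .digest('hex');
  }
  return acc;
}

// Run both versions in lockstep, one delivery at a time, and return the index
// of the first mismatch (or -1 if none). Only one delivery's outputs are held
// in memory at any moment.
function findDivergence(oldVat, newVat, deliveries) {
  for (let i = 0; i < deliveries.length; i += 1) {
    const oldHash = hashOutput(oldVat(deliveries[i]));
    const newHash = hashOutput(newVat(deliveries[i]));
    if (oldHash !== newHash) {
      return i;
    }
  }
  return -1;
}

// Hypothetical vat versions: v2 behaves differently on one particular input.
const makeV1 = () => {
  let count = 0;
  return delivery => {
    count += delivery.n;
    return { count };
  };
};
const makeV2 = () => {
  let count = 0;
  return delivery => {
    count += delivery.n === 3 ? 4 : delivery.n;
    return { count };
  };
};

const deliveries = [{ n: 1 }, { n: 2 }, { n: 3 }, { n: 4 }];
const sameHistory =
  accumulatedHash(makeV1(), deliveries) === accumulatedHash(makeV2(), deliveries);
const firstMismatch = findDivergence(makeV1(), makeV2(), deliveries);
console.log({ sameHistory, firstMismatch });
```

The sketch leaves out syscall results (device reads), which a real replay would also have to feed back into each version.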

We also discussed recording the block range over which the versions are supposed to be active. If one version can be told to write its entire state out to some schematized DB, and the next version can load its entire state from that DB, then an upgrade can be a lot cheaper. When replaying the entire history (for whatever reason), the process wouldn't have to run all versions in parallel.
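As a rough illustration of the block-range idea, the sketch below replays history by running each version only over its own range and handing the exported state to the next one. The record shape (`name`, `fromBlock`, `toBlock`, `run`) and the `replay` helper are hypothetical, and the "state" is just a block counter standing in for whatever a version would write to a schematized DB.

```javascript
// Hypothetical version records: each covers a block range and exposes a
// run(state, fromBlock, toBlock) function that returns the exported state.
const countBlocks = (state = { blocks: 0 }, fromBlock, toBlock) => ({
  blocks: state.blocks + (toBlock - fromBlock + 1),
});

const versions = [
  { name: 'v1', fromBlock: 0, toBlock: 999, run: countBlocks },
  { name: 'v2', fromBlock: 1000, toBlock: Infinity, run: countBlocks },
];

// Replay up to lastBlock, running each version only over its own range and
// passing the previous version's exported state to the next one.
function replay(versionList, lastBlock) {
  let state;
  const segments = [];
  for (const version of versionList) {
    if (version.fromBlock > lastBlock) {
      break;
    }
    const end = Math.min(version.toBlock, lastBlock);
    state = version.run(state, version.fromBlock, end);
    segments.push(`${version.name}:${version.fromBlock}-${end}`);
  }
  return { state, segments };
}

const result = replay(versions, 1500);
console.log(result.segments, result.state);
```

With a handoff like this, no two versions ever need to run side by side during a full replay.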

@warner warner self-assigned this Sep 10, 2020
@warner
Member

warner commented Sep 10, 2020

We're slowly moving swingset from a one-call invocation pattern (which creates a state DB if missing, and then does different things depending upon whether it needed to initialize that DB or not), to a two-call pattern (one explicit "initialize" call which creates+initializes a state DB, but doesn't run anything, and a second "run" call which requires the DB be present and initialized first).

The initialize call will accept the config object and the static vat specifications. It will bundle those sources and write them into the DB, almost exactly like dynamic vat sources are stored (for now). It also uses the names of the static vats to construct and enqueue the bootstrap message. Then it stops.

The "run" call does not accept a config object. It will reconstruct the static vats from their DB records, rather than the config object.
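A minimal sketch of that two-call shape, using an in-memory `Map` as a stand-in for the state DB. The function names `initializeKernel` and `runKernel`, the DB key layout, and the config shape are all assumptions made for illustration, not the actual SwingSet API; real bundling is replaced by storing the source text directly.

```javascript
// "initialize": accepts the config, persists static vat sources and the
// bootstrap message into the DB, and runs nothing.
function initializeKernel(config, db) {
  if (db.has('initialized')) {
    throw new Error('state DB is already initialized');
  }
  const names = Object.keys(config.vats);
  for (const name of names) {
    // A real implementation would bundle the sources; here we store the text.
    db.set(`vat.${name}.source`, config.vats[name].source);
  }
  db.set('vat.names', JSON.stringify(names));
  // The bootstrap message is built from the static vat names and enqueued.
  db.set('runQueue', JSON.stringify([{ method: 'bootstrap', vats: names }]));
  db.set('initialized', 'true');
}

// "run": takes no config. Static vats are reconstructed purely from the DB.
function runKernel(db) {
  if (!db.has('initialized')) {
    throw new Error('state DB must be initialized before running');
  }
  const names = JSON.parse(db.get('vat.names'));
  const vats = {};
  for (const name of names) {
    vats[name] = db.get(`vat.${name}.source`);
  }
  const runQueue = JSON.parse(db.get('runQueue'));
  return { vats, runQueue };
}

const db = new Map();
initializeKernel(
  {
    vats: {
      bootstrap: { source: 'export function buildRootObject() {}' },
      alice: { source: 'export function buildRootObject() {}' },
    },
  },
  db,
);
const kernel = runKernel(db);
console.log(Object.keys(kernel.vats), kernel.runQueue);
```

Because `runKernel` reads only from the DB, nothing from the config object can leak into the run phase through memory.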

@FUDCo
Contributor

FUDCo commented Sep 10, 2020

That two-call pattern is essentially how it works now, and has for a long time. It's just that the initial conditions are currently not completely persisted, so the resulting behavior depends on state that is carried over in memory between the two calls rather than via the state DB. Really what we're talking about is erecting a proper barrier between the init phase and the run phase. Your concept of the two phases as separate executables makes that barrier explicit, though the developer ergonomics of "call this, then call this" are probably worth preserving rather than requiring two program executions.

@warner
Member

warner commented Sep 10, 2020

Yeah, we need to start from the cosmic-swingset side and look at how buildVatController is called to find a good place to split it.

I think devices must remain as arguments of run() rather than initialize(), which might be a bit weird.

@FUDCo
Contributor

FUDCo commented Sep 24, 2020

Closed by #1814

@FUDCo FUDCo closed this as completed Sep 24, 2020