diff --git a/packages/SwingSet/docs/host-app.md b/packages/SwingSet/docs/host-app.md index 60a3efb0ed0..581406c3907 100644 --- a/packages/SwingSet/docs/host-app.md +++ b/packages/SwingSet/docs/host-app.md @@ -18,22 +18,40 @@ SwingSet provides robust and deterministic computation, even in the face of unex ### Kernel Upgrade -The life cycle of a SwingSet kernel begins with the one and only call to `initializeSwingset()`, which populates the swingstore DB for the first time. After that, the kernel is presumed to be immortal, but its execution is broken up into a series of reboots. Each reboot (e.g. each time the host application is started), the app must build a new controller with `makeSwingsetController()`, to have something to run. +The life cycle of a SwingSet kernel begins with the one and only call to `initializeSwingset()`, which populates the SwingStore DB for the first time. After that, the kernel is presumed to be immortal, but its execution is broken up into a series of reboots. Each reboot (e.g. each time the host application is started), the app must build a new controller with `makeSwingsetController()`, to have something to run. -From time to time, the host app will be upgraded to use a newer version of the SwingSet kernel code (e.g. a new version of the `@agoric/swingset-vat` package). This newer version might require an upgrade to the kernel's internal state. For example, the way it represents some data about a vat might be made more efficient, and the upgrade process needs to examine and rewrite vat state to switch to the new representation. Or, a bug might be fixed, and the upgrade process needs to locate and remediate any consequences of the bug having been present for earlier execution. +From time to time, the host app will be upgraded to use a newer version of the SwingSet kernel code (e.g. a new version of this `@agoric/swingset-vat` package). The newer version might require an upgrade to the kernel's persisted/durable state. For example, the way it represents some vat metadata might be made more efficient, and the upgrade process needs to examine and rewrite vat state to use the new representation. Or, a bug might be fixed, and the upgrade process needs to locate and remediate any consequences of the bug having been present during earlier execution. -To make the resulting state changes occur at a predictable time, upgrades do not happen automatically. Instead, each time the host app reboots with a new version of the kernel code, it must call `upgradeSwingset(kernelStorage)`. It must do this *before* calling `makeSwingsetController()`, as that function will throw an error if given a swingstore that has not been upgraded. +To make the resulting state changes occur deterministically, upgrades are not automatic. Instead, each time the host app reboots with a new version of the kernel code, it must call `upgradeSwingset(kernelStorage)`. It must do this *before* calling `makeSwingsetController()`, as that function will throw an error if given a SwingStore that has not been upgraded. -It is safe to call `upgradeSwingset` on reboots that do not change the version of the kernel code: the function is idempotent, and will do nothing if the swingstore is already up-to-date. +It is safe to call `upgradeSwingset` on reboots that do not change the version of the kernel code: the function is idempotent, and will do nothing if the SwingStore is already up-to-date. + +Some upgrades (in particular bug remediations) need to add events to the kernel's run-queue. To avoid having these events be intermingled with work that might already be on the run-queue at reboot time (e.g. work leftover from previous runs), these events are not automatically injected at that time. Instead, the kernel remembers what needs to be done, and waits for the host to invoke `controller.injectQueuedUpgradeEvents()` at a time of their choosing. This should be done before the next `commit()`, to avoid the risk of them being lost by a reboot. So most host applications will start each reboot with a sequence like this: ```js const { hostStorage, kernelStorage } = openSwingStore(baseDirectory); upgradeSwingset(kernelStorage); -const c = makeSwingsetController(kernelStorage, deviceEndowments); +const controller = makeSwingsetController(kernelStorage, deviceEndowments); +controller.injectQueuedUpgradeEvents(); +``` + +followed by later code to execute runs. Inputs can be fed all-at-once, or each processed in their own run, but at the end of the block, the host app must `commit()` the DB: + +```js +async function doBlock(deviceInputs) { + for (const input of deviceInputs) { + injectDeviceInput(input); + await controller.run(); + } + hostStorage.commit(); + emitDeviceOutputs(); +} ``` +The actual signature of `upgradeSwingset` is `const { modified } = upgradeSwingset(kernelStorage)`, and the `modified` flag indicates whether anything actually got upgraded. Host applications which have other means to keep track of software upgrades may wish to assert that `modified === false` in reboots that are not associated with a change to the kernel package version. They can also safely skip the `injectQueuedUpgradeEvents` call if nothing was modified. + ### Crank Execution For convenience in discussion, we split execution into "blocks". During a block, the host may call one or more device inputs, such as inbound messages, or timer wakeup events. The end of the block is marked by one or more calls to `controller.run()`, followed by a `hostStorage.commit()`, followed by the host-provided device endowments doing whatever kind of outbound IO they need to do. diff --git a/packages/SwingSet/src/controller/controller.js b/packages/SwingSet/src/controller/controller.js index 05027542e2b..c3091989814 100644 --- a/packages/SwingSet/src/controller/controller.js +++ b/packages/SwingSet/src/controller/controller.js @@ -388,6 +388,8 @@ export async function makeSwingsetController( return kernel.deviceNameToID(deviceName); }, + injectQueuedUpgradeEvents: () => kernel.injectQueuedUpgradeEvents(), + /** * Queue a method call into the named vat * diff --git a/packages/SwingSet/src/controller/upgradeSwingset.js b/packages/SwingSet/src/controller/upgradeSwingset.js index e9cff415a6e..3c12c97f0b6 100644 --- a/packages/SwingSet/src/controller/upgradeSwingset.js +++ b/packages/SwingSet/src/controller/upgradeSwingset.js @@ -5,11 +5,14 @@ import { getAllDynamicVats, getAllStaticVats, incrementReferenceCount, - addToQueue, readQueue, } from '../kernel/state/kernelKeeper.js'; import { enumeratePrefixedKeys } from '../kernel/state/storageHelper.js'; +/** + * @import {RunQueueEvent} from '../types-internal.js'; + */ + const upgradeVatV0toV1 = (kvStore, defaultReapDirtThreshold, vatID) => { // This is called, once per vat, when upgradeSwingset migrates from // v0 to v1 @@ -97,11 +100,13 @@ const upgradeVatV0toV1 = (kvStore, defaultReapDirtThreshold, vatID) => { * `hostStorage.commit()` afterwards. * * @param {SwingStoreKernelStorage} kernelStorage - * @returns {boolean} true if any changes were made + * @returns {{ modified: boolean }} */ export const upgradeSwingset = kernelStorage => { const { kvStore } = kernelStorage; let modified = false; + /** @type {RunQueueEvent[]} */ + const upgradeEvents = []; let vstring = kvStore.get('version'); if (vstring === undefined) { vstring = '0'; @@ -308,40 +313,30 @@ export const upgradeSwingset = kernelStorage => { // only the upgraded vat will see anything, and those deliveries // won't make it past liveslots). - const kernelStats = JSON.parse(getRequired('kernelStats')); - // copied from kernel/state/stats.js, awkward to factor out - const incStat = (stat, delta = 1) => { - assert.equal(stat, 'acceptanceQueueLength'); - kernelStats[stat] += delta; - const maxStat = `${stat}Max`; - if ( - kernelStats[maxStat] !== undefined && - kernelStats[stat] > kernelStats[maxStat] - ) { - kernelStats[maxStat] = kernelStats[stat]; - } - const upStat = `${stat}Up`; - if (kernelStats[upStat] !== undefined) { - kernelStats[upStat] += delta; - } - }; - + let count = 0; for (const [kpid, vatID] of buggyKPIDs) { - const m = harden({ type: 'notify', vatID, kpid }); + // account for the reference to this kpid in upgradeEvents incrementReferenceCount(getRequired, kvStore, kpid, `enq|notify`); - addToQueue('acceptanceQueue', m, getRequired, kvStore, incStat); + upgradeEvents.push({ type: 'notify', vatID, kpid }); + count += 1; } - kvStore.set('kernelStats', JSON.stringify(kernelStats)); - - console.log(` - #9039 remediation complete`); + console.log(` - #9039 remediation complete, ${count} notifies to inject`); modified = true; version = 3; } + if (upgradeEvents.length) { + assert(modified); + // stash until host calls controller.injectQueuedUpgradeEvents() + const oldEvents = JSON.parse(kvStore.get('upgradeEvents') || '[]'); + const events = [...oldEvents, ...upgradeEvents]; + kvStore.set('upgradeEvents', JSON.stringify(events)); + } + if (modified) { kvStore.set('version', `${version}`); } - return modified; + return harden({ modified }); }; harden(upgradeSwingset); diff --git a/packages/SwingSet/src/kernel/kernel.js b/packages/SwingSet/src/kernel/kernel.js index 6f438cf1b5b..5b21dbe0894 100644 --- a/packages/SwingSet/src/kernel/kernel.js +++ b/packages/SwingSet/src/kernel/kernel.js @@ -215,6 +215,10 @@ export default function buildKernel( return deviceID; } + function injectQueuedUpgradeEvents() { + kernelKeeper.injectQueuedUpgradeEvents(); + } + function addImport(forVatID, what) { if (!started) { throw Error('must do kernel.start() before addImport()'); @@ -2207,6 +2211,7 @@ export default function buildKernel( pinObject, vatNameToID, deviceNameToID, + injectQueuedUpgradeEvents, queueToKref, kpRegisterInterest, kpStatus, diff --git a/packages/SwingSet/src/kernel/state/kernelKeeper.js b/packages/SwingSet/src/kernel/state/kernelKeeper.js index 29c1137b66f..78e44271ef2 100644 --- a/packages/SwingSet/src/kernel/state/kernelKeeper.js +++ b/packages/SwingSet/src/kernel/state/kernelKeeper.js @@ -143,6 +143,7 @@ export const CURRENT_SCHEMA_VERSION = 3; // gcActions = JSON(gcActions) // reapQueue = JSON([vatIDs...]) // pinnedObjects = ko$NN[,ko$NN..] +// upgradeEvents = JSON([events..]) // ko.nextID = $NN // ko$NN.owner = $vatID @@ -182,6 +183,8 @@ export const CURRENT_SCHEMA_VERSION = 3; // v3: // * change `version` to `'3'` // * perform remediation for bug #9039 +// (after v3, does not get its own version) +// * `upgradeEvents` recognized, but omitted if empty /** @type {(s: string) => string[]} s */ export function commaSplit(s) { @@ -1267,6 +1270,27 @@ export default function makeKernelKeeper( return dequeue('acceptanceQueue'); } + function injectQueuedUpgradeEvents() { + // refcounts: Any krefs in `upgradeEvents` must have a refcount to + // represent the list's hold on those objects. When + // upgradeSwingset() creates these events, it must also + // incref(kref), otherwise we run the risk of dropping the kref by + // the time injectQueuedUpgradeEvents() is called. We're nominally + // removing each event from upgradeEvents (decref), then pushing + // it onto the run-queue (incref), but since those two cancel each + // other out, we don't actually need to modify any reference + // counts from within this function. Note that + // addToAcceptanceQueue does not increment refcounts, just kernel + // queue-length stats. + + const events = JSON.parse(kvStore.get('upgradeEvents') || '[]'); + kvStore.delete('upgradeEvents'); + for (const e of events) { + assert(e.type, `not an event`); + addToAcceptanceQueue(e); + } + } + function allocateMeter(remaining, threshold) { if (remaining !== 'unlimited') { assert.typeof(remaining, 'bigint'); @@ -1966,6 +1990,8 @@ export default function makeKernelKeeper( getAcceptanceQueueLength, getNextAcceptanceQueueMsg, + injectQueuedUpgradeEvents, + allocateMeter, addMeterRemaining, setMeterThreshold, diff --git a/packages/SwingSet/test/kernel.test.js b/packages/SwingSet/test/kernel.test.js index f18a0720ff3..ef17e333a34 100644 --- a/packages/SwingSet/test/kernel.test.js +++ b/packages/SwingSet/test/kernel.test.js @@ -1817,3 +1817,46 @@ test('reap gc-krefs 12', async t => { test('reap gc-krefs overrideNever', async t => { await reapGCKrefsTest(t, 12, true); }); + +test('injectQueuedUpgradeEvents', async t => { + const endowments = makeKernelEndowments(); + const { kernelStorage } = endowments; + const { kvStore } = kernelStorage; + await initializeKernel({}, kernelStorage); + const kernel = buildKernel(endowments, {}, {}); + await kernel.start(); + // TODO: extract `queueLength`/`dumpQueue` from kernelKeeper.js? + { + const [start, end] = JSON.parse(kvStore.get('acceptanceQueue')); + t.deepEqual([start, end], [1, 1]); + } + + // These events would fail if they actually got processed: the vatID + // and all the krefs are fake, the methargs are not valid capdata, + // and we haven't updated refcounts. But all we care about is that + // they make it onto the acceptance queue. + const vatID = 'v1'; + const e1 = harden({ type: 'notify', vatID, kpid: 'kp25' }); + // kernelKeeper.incrementRefCount(kpid, `enq|notify`); + const target = 'ko11'; + const methargs = ''; + const msg = harden({ methargs, result: 'kp33' }); + const e2 = harden({ type: 'send', target, msg }); + kvStore.set('upgradeEvents', JSON.stringify([e1, e2])); + + kernel.injectQueuedUpgradeEvents([e1, e2]); + + { + const [start, end] = JSON.parse(kvStore.get('acceptanceQueue')); + t.deepEqual([start, end], [1, 3]); + t.deepEqual(JSON.parse(kvStore.get('acceptanceQueue.1')), e1); + t.deepEqual(JSON.parse(kvStore.get('acceptanceQueue.2')), e2); + } + + // It'd be nice to also check that the stats were incremented: + // stats.acceptanceQueueLength shoule be 2, acceptanceQueueLengthUp + // should be 2, and acceptanceQueueLengthMax should be 2. But the + // stats are held in RAM until a crank is executed, and we're not + // doing that here. And the kernel hides the kernelKeeper so we + // can't call getStats() to check. +}); diff --git a/packages/SwingSet/test/state.test.js b/packages/SwingSet/test/state.test.js index 8a5076b14a6..16450ab7b17 100644 --- a/packages/SwingSet/test/state.test.js +++ b/packages/SwingSet/test/state.test.js @@ -1109,12 +1109,15 @@ test('dirt upgrade', async t => { // it requires a manual upgrade let k2; + let ks2; { const serialized = store.serialize(); const { kernelStorage } = initSwingStore(null, { serialized }); - upgradeSwingset(kernelStorage); + const { modified } = upgradeSwingset(kernelStorage); + t.true(modified); k2 = makeKernelKeeper(kernelStorage, CURRENT_SCHEMA_VERSION); // works this time k2.loadStats(); + ks2 = kernelStorage; } t.true(k2.kvStore.has(`kernel.defaultReapDirtThreshold`)); @@ -1157,6 +1160,12 @@ test('dirt upgrade', async t => { t.deepEqual(JSON.parse(k2.kvStore.get(`${v3}.options`)).reapDirtThreshold, { never: true, }); + + { + // upgrade is idempotent + const { modified } = upgradeSwingset(ks2); + t.false(modified); + } }); test('v2 upgrade', async t => { @@ -1177,15 +1186,24 @@ test('v2 upgrade', async t => { // it requires a manual upgrade let k2; + let ks2; { const serialized = store.serialize(); const { kernelStorage } = initSwingStore(null, { serialized }); - upgradeSwingset(kernelStorage); + const { modified } = upgradeSwingset(kernelStorage); + t.true(modified); k2 = makeKernelKeeper(kernelStorage, CURRENT_SCHEMA_VERSION); // works this time k2.loadStats(); + ks2 = kernelStorage; } t.true(k2.kvStore.has(`vats.terminated`)); t.deepEqual(JSON.parse(k2.kvStore.get(`vats.terminated`)), []); t.is(k2.kvStore.get(`version`), '3'); + + { + // upgrade is idempotent + const { modified } = upgradeSwingset(ks2); + t.false(modified); + } }); diff --git a/packages/SwingSet/test/upgrade-swingset.test.js b/packages/SwingSet/test/upgrade-swingset.test.js index a14e409b83e..3ef59ccf995 100644 --- a/packages/SwingSet/test/upgrade-swingset.test.js +++ b/packages/SwingSet/test/upgrade-swingset.test.js @@ -1,4 +1,3 @@ -/* eslint-disable no-underscore-dangle */ // @ts-nocheck // eslint-disable-next-line import/order @@ -162,7 +161,8 @@ test('upgrade kernel state', async t => { upgradeSwingset(kernelStorage); // now we should be good to go - const _controller = await makeSwingsetController(kernelStorage); + const controller = await makeSwingsetController(kernelStorage); + controller.injectQueuedUpgradeEvents(); t.true(kvStore.has('kernel.defaultReapDirtThreshold')); // the kernel-wide threshold gets a .gcKrefs (to meet our upcoming @@ -245,7 +245,8 @@ test('upgrade non-reaping kernel state', async t => { upgradeSwingset(kernelStorage); // now we should be good to go - const _controller = await makeSwingsetController(kernelStorage); + const controller = await makeSwingsetController(kernelStorage); + controller.injectQueuedUpgradeEvents(); t.true(kvStore.has('kernel.defaultReapDirtThreshold')); t.deepEqual(JSON.parse(kvStore.get('kernel.defaultReapDirtThreshold')), { @@ -374,7 +375,7 @@ test('v3 upgrade', async t => { } kvStore.set('acceptanceQueue', JSON.stringify([accHead, accTail])); - let stats = JSON.parse(kvStore.get('kernelStats')); + const stats = JSON.parse(kvStore.get('kernelStats')); stats.runQueueLength += runQueue.length; stats.runQueueLengthUp += runQueue.length; stats.runQueueLengthMax = runQueue.length; @@ -394,8 +395,10 @@ test('v3 upgrade', async t => { // upgrade it upgradeSwingset(kernelStorage); + // now we should be good to go - const _controller = await makeSwingsetController(kernelStorage); + const controller = await makeSwingsetController(kernelStorage); + controller.injectQueuedUpgradeEvents(); // check state by mutating our dumped copy and then comparing // against a new dump @@ -415,12 +418,16 @@ test('v3 upgrade', async t => { const np5 = JSON.stringify({ type: 'notify', vatID, kpid: p5 }); data[`acceptanceQueue.${tail - 1}`] = np5; - // stats are updated with the queue changes - stats = JSON.parse(data.kernelStats); - stats.acceptanceQueueLength += 3; - stats.acceptanceQueueLengthUp += 3; - stats.acceptanceQueueLengthMax = stats.acceptanceQueueLength; - data.kernelStats = JSON.stringify(stats); + // note: the in-RAM copy of kernelStats will have the elevated + // acceptance-queue counters, but these are not written to the + // kvStore until a crank is executed, so the data we're comparing + // won't see them + // + // stats = JSON.parse(data.kernelStats); + // stats.acceptanceQueueLength += 3; + // stats.acceptanceQueueLengthUp += 3; + // stats.acceptanceQueueLengthMax = stats.acceptanceQueueLength; + // data.kernelStats = JSON.stringify(stats); // the refcounts should now be one larger, because of the queued // notifies