Liveslots finalizers run metered #6795

mhofman · 2023-01-15T19:58:37Z

Describe the bug

I don't see any code disabling / or restoring metering around finalizers execution.

Since metering differences may result in execution differences through the run policy (and in the future will be directly checked to be in consensus with #6770), we should be careful to exempt gc related tasks from metering. Conceptually related to #6784, but differently observable.

I have confirmed from looking at the xsnap-worker code that finalizers run after draining of the promise queue, but before executing any setImmediate callbacks. The callbacks themselves don't disable the metering, and I'm not sure if the callback invocation itself may not increment the meter.

Of course the finalizer itself should strive to not produce an observably different execution, which is currently covered by #6760.

The text was updated successfully, but these errors were encountered:

mhofman · 2023-01-15T20:39:45Z

A possible solution is to:

Update xsnapPlatform.c to capture and restore the meter around executing FR callback jobs.
In the supervisor specific to xsnap-worker, wrap FinalizationRegistry so that the callback is run unmetered to reflect the meter disablement of the host
For good measure, wrap WeakRef similarly and assert that any deref() call occurs while unmetered.

warner · 2023-01-24T20:15:29Z

If we add computrons usage to consensus (perhaps via the transcript holding the delivery results), this becomes even more important.

warner · 2023-04-12T20:06:26Z

@mhofman and I discussed this yesterday. I'm kinda inclined to WONTFIX this, and stop trying to make our metering insensitive to GC.

the original goal was to avoid overconstraining validators: allow them to run slightly-different versions of XS
- we're clearly not allowing that: the XS/xsnap they get is a well-defined component/dependency of the kernel they're using, and we aren't trying to permit them to each run a different kernel
the secondary goal was to tolerate variations during replay (if we upgrade validators to a new xsnap, their existing transcripts will be replayed under the new version, and differences could cause problems)
- the biggest problem was syscall divergence, because getting the syscall results out of the transcript depends upon them happening in the correct order
  - we think that's been resolved
- the second problem is knife-edge effects on metering limits: the new xsnap must complete any delivery that the transcript recorded as successful in the old xsnap
  - if the new xsnap consumes more computrons or memory, and the old delivery was close to the edge, then this will fail, and panic the kernel upon restart
  - "don't do that"
- the third problem is differences in recorded metering data (i.e. what we charge the vat for)
  - but we'll ignore that during replay: you only get charged during the initial delivery
a tertiary goal was to improve the chances of our hail-mary "sleeper-agent/replay-style" upgrade idea, in which we must replay a whole incarnation's worth of transcripts (or maybe even all incarnations), and syscall variation would make that harder
- this is exactly the sort of situation where we'd want to run under a significantly different version of XS
  - like one which was not compatible with existing heap snapshots
- we would expect significantly different metering and allocation behavior from the new XS
  - so the upgrade plan would need to tolerate that, rather than expecting the vat to hide it
- if we replay the entire history (all incarnations), then we'd build a new vatstore anyways
  - so we'd extract deliveries and send/resolve syscalls from the transcript for replay
  - but we'd build a new transcript that includes the new vatstore calls it makes

We put a bunch of energy into making the computron usage not be affected by code running in GC-sensitvity situtations, but as this ticket notes, that work wasn't complete. And we haven't put any energy into trying to make allocation counters be unaffected (and it's probably impossible, since allocation affects the future much more than a computron counter).

So I'm not sure it's worth trying to hide that usage. And it would simplify the code a fair bit if we removed the unmetered wrappers.

warner · 2023-04-24T17:55:36Z

@mhofman and I agreed to move this out of Vaults, while we think about how to handle GC metering in general

mhofman added bug Something isn't working SwingSet package: SwingSet vaults_triage DO NOT USE labels Jan 15, 2023

mhofman mentioned this issue Jan 15, 2023

does finalizeDroppedCollection() make syscalls sensitive to GC? #6760

Closed

warner assigned mhofman Jan 24, 2023

warner mentioned this issue Jan 24, 2023

does XS allow finalizers to be called outside of gcAndFinalize? #3810

Closed

toliaqat added the liveslots requires vat-upgrade to deploy changes label Sep 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Liveslots finalizers run metered #6795

Liveslots finalizers run metered #6795

mhofman commented Jan 15, 2023

mhofman commented Jan 15, 2023

warner commented Jan 24, 2023

warner commented Apr 12, 2023

warner commented Apr 24, 2023

Liveslots finalizers run metered #6795

Liveslots finalizers run metered #6795

Comments

mhofman commented Jan 15, 2023

Describe the bug

mhofman commented Jan 15, 2023

warner commented Jan 24, 2023

warner commented Apr 12, 2023

warner commented Apr 24, 2023