change virtual-object state LRU cache: write-through, write-back at end-of-crank, or remove entirely #6693

warner · 2022-12-17T22:25:56Z

What is the Problem Being Solved?

As listed in #6650, one blocker for removing the stopVat from the vat upgrade process is the write-back nature of the virtual object manager's LRU cache. This cache holds the state of virtual objects, but does not write changes back to the DB immediately. Instead, it waits until the object is evicted from the cache (by the loading of some newer object). The cache is also flushed explicltly during a dispatch.bringOutYourDead operation. We (now, see #6604) flush this during stopVat too, since otherwise the last few virtual-object modifications by a vat would be lost during the upgrade process.

@FUDCo and I aren't sure that the write-back cache is worth the 1: risk of bugs like #6604, 2: debugging/testing complication (the vatstore write you're looking for doesn't happen right away, and requires several dummy writes to elicit), and 3: the need for a flush before vat upgrade.

We figured that turning it into a write-through cache would be a decent compromise.

Description of the Design

When a virtual object is first referenced (specifically when something reads from state.${propname}), a vatstoreGet is used to read the data into the LRU cache, and it remains there until evicted. However, every time the state is modified, the VOM does an immediate vatstoreSet to write the changes to DB (leaving the modified state in the cache, for later reading).

A performance optimization would be to queue the writeback for later in the delivery (with Promise.resolve().then(doWriteback)). That might allow multiple changes to the same state to be collapsed into a single vatstoreSet, for code that does something like:

  state.foo = 1;
  state.bar = 2;

The foo write would modify the state in the LRU cache item, setting the dirty flag, and enqueue the first call to doWriteback. Then, in the same turn, the bar write would further modify the state, again set the dirty flag, and enqueue a second call to doWriteback (or maybe it checks the dirty flag first and doesn't enqueue additional calls). In a later turn, but still in the same crank/delivery, the doWriteback gets executed: it writes the data with vatstoreSet, and clears the dirty flag. If later calls to doWriteback happen, they notice the missing dirty flag and do no work.

Security Considerations

No security consequences (userspace doesn't get to see the LRU cache), but this should increase the reliability of virtual-object changes, particularly in the face of a vat malfunction that requires a vat upgrade to recover from.

Test Plan

The existing unit tests will be modified to match the new behavior. They are probably already sufficient to check that a DB write happens right away.

Relation to stopVat changes

Having a write-through cache will mean stopVat no longer needs to rely upon the explicit cache flush performed in BOYD. It may then become safe to remove the BOYD call from stopVat. However, BOYD also does GC, which might release durable objects. If we remove the stopVat BOYD, we might find a few durable objects are not released before the upgrade, and they might not be releasable after the upgrade until we implement a full mark+sweep GC (which we've deferred until some future software upgrade). So in the short term, this might result in more objects being retained. Note that this would be visible to other vats: durable objects might be recognizable (used as WeakMap keys) in other vats, and without a final BOYD, those keys might stick around forever.

The text was updated successfully, but these errors were encountered:

mhofman · 2023-01-13T21:17:04Z

Very much in favor of this! Didn't realize we were caching writes.

A performance optimization would be to queue the writeback for later in the delivery (with Promise.resolve().then(doWriteback)).

I guess it depends the granularity of batching we want. If you want to batch synchronous writes (aka all writes happening in a single turn), then a promise based queue is appropriate. However we may want to consider batching at the crank level since liveslots is in a privileged position and is aware of crank boundaries.

warner · 2023-01-20T20:33:03Z

I spent 15 minutes trying to reconstruct my understanding of the LRU cache and the innerSelf scheme, to see how we think we're maintaining determinism of vatstoreGet syscalls in the face of GC, and I failed. But I stepped back to think about it from first principles:

userspace creates a new virtual object, holds a Representative for a while, maybe exports it, maybe stores it in virtual storage, and then drops it (it becomes UNREACHABLE, in our jargon)
the Representative may or may not get collected (remaining in UNREACHABLE, or moving to COLLECTED, in our jargon)
now either a inbound delivery appears which targets the virtual object, or some other delivery provokes userspace to pull the vobj out of storage, so we need a Representative, and we either discover the existing one, or we reanimate a new one
- if the former, liveslots immediately invokes a method on the vobj
- if the latter, userspace might or might not invoke some methods on the vobj
those methods might read properties off state
- (or, if the vobj behavior is unwise, it might have shared its state externally, in which case non-vobj-behavior-methods might read properties off state, but holding state will also hold the Representative, since we maintain a link between them)
to satisfy those state.PROPNAME reads, we need data, and that data lives in the DB
state.PROPNAME writes create new data, and that data needs to make it to the DB eventually

And we need the sequence of transcript-recorded vatstoreGet/Set syscalls to be a deterministic function of the deliveries to the vat, and of userspace behavior. To maintain our goals of GC insensitivity, that sequence must not be sensitive to the presence or absence of the Representative at the time userspace fetches the vobj (moving it from UNREACHABLE to REACHABLE).

The simplest way to achieve this is:

creating a Representative should not involve any syscalls
- the vref should tell us everything we need: KindID (and Kinds are low-cardinality, so we can hold their metadata in RAM) and instance ID
every state.PROPNAME read causes a vatstoreGet
every state.PROPNAME write causes a vatstoreSet
as a result, there's no point in caching anything

The sequence of state reads/writes is deterministic. So we're allowed to modify this sequence if it remains purely a function of those reads/writes. For example, if we're ok with digging deeper into the hole of "liveslots does cleanup work at the end of every delivery", then we could say:

only the first state.PROPNAME read within a crank causes a vatstoreGet, and we cache the value in RAM for the rest of the crank
any state.PROPNAME writes are buffered until end-of-crank, and then produces a single vatstoreSet
(of course the writes are used to satisfy the reads: this is a write-back cache, flushed at the end of each crank)

If a Representative is used, then dropped and collected in the middle of a crank, then reanimated and used again, all within that same crank, we'd keep using the state data fetched the first time, even though we dropped the Representative. That means liveslots would hold a Map<vref, state> and a dirty=Set<vref>, which are empty between deliveries.

We (used to?) worry about deserialization being observable to userspace, and had protections in place to prevent this. The one I'm remembering retained the raw (still serialized) state data in the cache, but re-invoking unserialize each time userspace accessed it. The new virtual-object Kind construction process might not be vulnerable to this: no user-provided function is invoked each time we create a Representative. If so, then we can probably deserialize the state data once (per crank), the first time userspace reads from state.PROPNAME, and cache the serialized form. That extends the lifetime of the Representatives and Presences inside state a little bit (not past the end of the crank), but probably has useful performance benefits.

Note that currently we store the entire state in a single vatstore key (vom.o+${kindID}/${instanceID}), even though we have separate getter/setter pairs for each state.PROPNAME therein. Userspace generally accesses one property at a time, and the values are always hardened (so e.g. state = 4 is invalid, state.foo.bar = 4 is invalid, but state.foo = { ...state.foo, bar: 4 } is ok). We decided that this coarse serialization granularity was fine to start with, and if we do the once-per-crank caching scheme then it should continue to be so, but if we weren't caching, we might want to break the state out into a separate vatstore key for each property, like vom.o+${kindID}/${instanceID}/${propname}.

Also note that this would be a virtual-object-state -specific cache, not a general-purpose vatstore cache. A general-purpose cache would need to e.g. make vatstoreGetNext work properly even if priorKey (or some subsequent key) were in the cache, making it a lot more complicated. Fortunately we never iterate through the entire virtual-object-state cache, except at end-of-crank to flush it, so we don't use getNext on the vom.${vref} keyspace.

warner · 2023-01-20T21:59:44Z

I think that'll work.. we currently use afterDispatchActions() to flush the IDCounters table, which is cached in a similar way.

agoric-sdk/packages/SwingSet/src/liveslots/liveslots.js

Lines 1563 to 1565 in c6f074c

    
           function afterDispatchActions() { 
        
             flushIDCounters(); 
        
           }

We do a vatstoreGet('idCounters') during startVat via a function named initializeIDCounters, with an extra call from allocateNextID if (for some reason, probably never) it isn't already in RAM. We hold onto these counters in RAM forever, but to make sure a subsequent incarnation can avoid re-allocating old export vrefs, we must flush them, and we use afterDispatchActions() for that purpose.

So I think we could have virtualObjectManager return a flushVOState method to liveslots, which could call it from afterDispatchActions(). The VOM would maintain a Set of baseref->unserializedState mappings, along with a set of dirty baserefs. We'd build a write-back cache (more like a buffer) which uses that Set for the cache, calls vatstoreGet to populate, and has a flush that does vatstoreSet for everything that's dirty (and then clears everything, so nothing lingers to influence syscall behavior on the next delivery). Each getter would read from this cache (possibly triggering a vatstoreGet, but only on the first access per crank), setters would write into it (not triggering a vatstoreGet, not triggering a vatstoreSet, but replacing any existing data, which would be pretty common but not inevitable).

Then we get rid of the existing LRU cache and probably some of the innerSelf mechanics. The unserializedState object could be used as a GC sensor if userspace holds on to it, so we need to add a link from it to the Representative (like we do already), and we need to think through the unweakable thing again, but probably we can treat it like the earlier state or innerSelf objects.

FUDCo · 2023-01-20T22:03:34Z

Setting aside for a moment issues of code complexity vs. correctness & maintainability that the implementation of any kind of caching scheme introduces, I think the issue of determinism vs. caching is a bit of a red herring. In particular, I'm unclear what motivates the concern for crank boundaries in the narrative here. If we take the always-read-on-get, always-write-on-set idea as the baseline (which, now that you've articulated it, seems like a really good baseline for analysis, in the why-weren't-we-always-thinking-about-it-this-way? sense), then any amount of LRU-style caching that is driven solely by the vat code's own access patterns should be fine from a determinism perspective -- as long as the underlying sequence of get/set operations is deterministic, then the resulting sequence of reads and writes will also be deterministic, regardless of where the crank boundaries lie. GC enters the analysis if there are additional reads or writes that are triggered by things being swapped in and out based on local-but-not-global reachability determinations that come up if something can be local garbage but not global garbage. If we take as a policy position that we aren't going to have any of those, then I think we can be as sophisticated with our caching scheme as performance metrics push us to be (that said, I think the existing scheme is already too complicated for our own good).

warner · 2023-01-20T22:58:09Z

Yeah, I agree that the baseline is:

every x = state.propname read (i.e. getter call) is a vatstoreGet
every state.propname = y write (i.e. a setter call) is a vatstoreSet

which should be entirely insenstive to GC.

Trying to avoid some of those reads depends upon holding the data in RAM, and since we must limit the amount of RAM usage, we must eventually remove things from RAM, which means re-fetching it on the next state read. One policy could be just e.g. "hold 5 state objects in RAM, LRU cache, each getter moves the baseref to the head of the list", which would yield a weird-but-deterministic mapping from userspace behavior to syscall behavior. It wouldn't be great for debugging, you'd look at the syscall trace and the userspace code and be hard pressed to explain the correlation, but at least it would yield fewer redundant syscalls, and not violate our gc-sensitivity rules.

So yeah I guess I'm jumping ahead to picking a compromise point between debuggability and performance. "First state read of a crank gets a vatstoreGet, none of the others do" would be fairly easy to explain. It would suffer from high RAM usage if e.g. some code loops through a whole lot of virtual objects in a single crank.

I guess we need some way to estimate how frequently we get a batch of reads/writes for the same vobj state in a single crank, to estimate what kind of performance difference we could anticipate.

mhofman · 2023-01-21T00:49:45Z

we either discover the existing one, or we reanimate a new one

if the former, liveslots immediately invokes a method on the vobj

if the latter, userspace might or might not invoke some methods on the vobj

Ugh? Why would userspace get to do something only in the case where the vobj is re-animated? Did I misunderstand something?

creating a Representative should not involve any syscalls

the vref should tell us everything we need: KindID (and Kinds are low-cardinality, so we can hold their metadata in RAM) and instance ID

There is an alternative "simple" approach: always trigger a kind metadata lookup, and do not hold anything in ram.

I'm unclear what motivates the concern for crank boundaries in the narrative here

I was very confused by this as well, but seeing the subsequent reply, I understand the benefits from a debuggability point of view. However it may be an expensive trade-off to make performance-wise.

If we take the always-read-on-get, always-write-on-set idea as the baseline (which, now that you've articulated it, seems like a really good baseline for analysis, in the why-weren't-we-always-thinking-about-it-this-way? sense), then any amount of LRU-style caching that is driven solely by the vat code's own access patterns should be fine from a determinism perspective -- as long as the underlying sequence of get/set operations is deterministic, then the resulting sequence of reads and writes will also be deterministic

That's what I tried to express the other day, but looks like I failed at explaining myself. Happy to see we're converging :)

I guess we need some way to estimate how frequently we get a batch of reads/writes for the same vobj state in a single crank, to estimate what kind of performance difference we could anticipate.

So one way to mitigate any performance impact of extra syscalls, yet keep full debuggability of user code (not liveslots, but hey it's not like we get much insight into its behavior from the outside even today) would be to passthrough syscalls on all vatstore read/writes but have the xsnap process + liveslots be able to not block on the result of these and keep going if the result is in the cache. Then no matter the caching strategy, the only observable impact would be on snapshot content, and we could even clear the cache on BOYD calls before snapshotting to mitigate that impact (the caching implementation would still be part of the heap snapshot, so that's probably a step too far).

warner · 2023-01-21T02:11:31Z

we either discover the existing one, or we reanimate a new one

if the former, liveslots immediately invokes a method on the vobj

if the latter, userspace might or might not invoke some methods on the vobj

Ugh? Why would userspace get to do something only in the case where the vobj is re-animated? Did I misunderstand something?

No no, the "former/latter" was referencing the first split in the previous sentence (inbound delivery vs pull vobj out of storage), not the second (discover existing vs reanimation).

Userspace doesn't get to sense whether it was re-animated or pre-existing. I'm just pointing out that sometimes methods get invoked, sometimes they don't, but either way, method invocation (as well as state property reading/writing) is a deterministic function of the deliveries and userspace behavior that we're ok with, and not sensitive to GC activity.

There is an alternative "simple" approach: always trigger a kind metadata lookup, and do not hold anything in ram.

Chip and I were talking about this, and worked out a bit of a spectrum. The baseline is for vobj state reads/writes to perform vatstoreGet/vatstoreSet. And for virtual collections, the |schemata and |label values are needed for methods like .has() / .get / .init / .set (to match against the key/value arguments) , but are not needed to create the Representative. So the baseline would be for all the collection methods to do a vatstoreGet of the collection metadata, every time. No caching, no GC sensitivity, all syscalls are a direct consequence of userspace activity.

But of course we'd feel silly constantly re-reading a pair of metadata keys for every collection operation, especially when the metadata is immutable. We can't hold it in RAM forever because we might have a lot of collections (despite our mental image when designing them, we never instructed userspace authors that collections should be low-cardinality, and given how people are using them, we can't really add that constraint now).

So some sort of compromise is needed. We might start by changing liveslots/VOM/vc to use the "no cache" extreme, but I'm worried about the perf consequences.

I'm unclear what motivates the concern for crank boundaries in the narrative here

I was very confused by this as well, but seeing the subsequent reply, I understand the benefits from a debuggability point of view. However it may be an expensive trade-off to make performance-wise.

Yeah, that's where I'm looking for a sensible compromise, and "write everything / remember nothing at end-of-crank" might be a workable one.

would be to passthrough syscalls on all vatstore read/writes but have the xsnap process + liveslots be able to not block on the result of these and keep going if the result is in the cache.

Eeeargh. I don't think I like that: syscalls that sometimes block, sometimes don't, sounds like a recipe for interleaving surprises, and confusion about whether a syscall-response is for syscall-1 or syscall-2. A vatstoreGet that actually does need the results would have to wait for all earlier didn't-need-results requests to retire, which feels like it might negate a bunch of the hoped-for perf benefits. And the timing of the responses could influence the heap state, so heap snapshots become sensitive to timing races, which would be super nondeterministic.

mhofman · 2023-01-21T03:33:19Z

Eeeargh. I don't think I like that: syscalls that sometimes block, sometimes don't, sounds like a recipe for interleaving surprises, and confusion about whether a syscall-response is for syscall-1 or syscall-2.

I think I didn't explain myself correctly. Maybe another mental model is needed here: assume syscalls were asynchronous. Currently making the syscall would send the data over the wire, then await the result. What I'm suggesting is that it'd send the data over the wire, but drop the result promise, and instead resolve with the content of the cache. This could also be considered the equivalent of racing the wire result with a cache result. Do we agree that from that point of view, doing this is safe and doesn't result in interleaving surprises?

Now of course syscalls are not asynchronous, and the trick is how to make this work. One possibility would be for liveslots to include a flag in the issueCommand that would indicate that xsnap-worker does not need to block after sending the command over the socket, but simply record (increment a counter) that such a "dropped" command was issued. When a blocking issueCommand is performed later (or if the run-loop exits because the crank finished), xsnap-worker would read the number of dropped results out of the socket before proceeding. That's not too complicated and should avoid any interleaving surprises. The only complication of this approach is if the socket buffer gets full and the kernel can no longer send these results, but that's something node can be made to handle fairly easily (aka buffer the write stream infinitely).

A vatstoreGet that actually does need the results would have to wait for all earlier didn't-need-results requests to retire, which feels like it might negate a bunch of the hoped-for perf benefits.

Yes that is true we'd have to block, but whether it negates the benefits would require testing. I personally believe it would still be sufficiently beneficial, and at the end of the day, depends heavily on the kind of caching. A poor caching strategy would be no worse than blocking on all syscalls, which we've agreed is the baseline.

We could easily try this without a cache: mark all vatStoreSet syscalls as non-blocking (since unless we write too much data that always succeeds) and see what kind of improvement we get.

And the timing of the responses could influence the heap state, so heap snapshots become sensitive to timing races, which would be super nondeterministic.

I really don't follow. How would heap state be influenced in a non-deterministic way? The cache would still be deterministic, and it's the only thing that influences whether an issueCommand is blocking or not. We already established there are no timing races.

warner · 2023-03-08T00:04:23Z

I think we have consensus on the "populate an unlimited-size cache during the first userspace access of each delivery, flush all dirty entries and clear all entries at the end of the delivery" approach. I think we can investigate the async/non-blocking vatstore syscalls in a separate ticket.

I've made a start on this in the 6693-vo-lru-cache branch. I've rewritten the cache and it's test, but I haven't glued it into anything yet. The plan is to have two instances of this cache:

the first one maps baseRef to the unserialized/unmarshalled state object, for virtual objects
the second one maps collectionID to a { keyShape, valueShape, label } record, for virtual collections

I'm thinking that virtualObjectManager and virtualCollectionManager will both export a flushCache method to liveslots, who will call it at end-of-crank.

The tricky part will be unwinding the "innerSelf" parts of virtualObjectManager, or at least that's the part I know the least about. And a bunch of overly-sensitive tests will have to change.

warner · 2023-03-08T00:26:03Z

some more hacking on that branch: the collection manager is wired up, and I've made a really incomplete start on wiring up the VOM

This cache accumulates all data until an explicit flush, at which point it writes back all dirty state, and forgets all state. This avoids the somewhat unpredictable behavior of an LRU cache, which caused us problems with GC sensitivity. It suffers from the potential for larger RAM usage between flushes, but is quite simple, and yields easy-to-debug syscall behavior. We have other caches in our platform, but liveslots needs something with these specific properties, and I don't know what the needs in the rest of the platform are, so I'm fine with not sharing this implementation with other packages, at least for now. refs #6693

warner · 2023-03-30T18:40:41Z

note to self: vat-ulrik-1.js can be changed to remove the dummy cache-flushing object once this is fixed

This cache accumulates all data until an explicit flush, at which point it writes back all dirty state, and forgets all state. This avoids the somewhat unpredictable behavior of an LRU cache, which caused us problems with GC sensitivity. It suffers from the potential for larger RAM usage between flushes, but is quite simple, and yields easy-to-debug syscall behavior. We have other caches in our platform, but liveslots needs something with these specific properties, and I don't know what the needs in the rest of the platform are, so I'm fine with not sharing this implementation with other packages, at least for now. refs #6693

erights · 2023-03-31T01:59:58Z

Be aware of Draft #7286 , which I think is a complementary optimization. But I'm not sure.

This cache accumulates all data until an explicit flush, at which point it writes back all dirty state, and forgets all state. This avoids the somewhat unpredictable behavior of an LRU cache, which caused us problems with GC sensitivity. It suffers from the potential for larger RAM usage between flushes, but is quite simple, and yields easy-to-debug syscall behavior. We have other caches in our platform, but liveslots needs something with these specific properties, and I don't know what the needs in the rest of the platform are, so I'm fine with not sharing this implementation with other packages, at least for now. refs #6693

warner · 2023-04-12T18:51:22Z

This landed in PR #7138 (in 8e6a425). Changes to make the virtual-collection schema use a similar cache (#6360) landed in PR #7333 (in 24de1b6). So I can close this now.

warner added enhancement New feature or request SwingSet package: SwingSet labels Dec 17, 2022

warner mentioned this issue Dec 19, 2022

remove stopVat(): allow vat upgrade without participation of old vat version #6650

Closed

ivanlei added the vaults_triage DO NOT USE label Dec 20, 2022

warner mentioned this issue Jan 13, 2023

Suspected liveslots sensitivity to engine allocations #6784

Open

warner changed the title ~~change virtual-object state LRU cache to be write-through~~ change virtual-object state LRU cache: write-through, write-back at end-of-crank, or remove entirely Jan 24, 2023

warner mentioned this issue Jan 24, 2023

increase swingset DEFAULT_VIRTUAL_OBJECT_CACHE_SIZE #2644

Closed

warner added the liveslots requires vat-upgrade to deploy changes label Jan 24, 2023

ivanlei modified the milestones: Vaults RC0, Vaults Functional Testing, Vaults EVP Feb 1, 2023

warner self-assigned this Mar 8, 2023

warner mentioned this issue Mar 9, 2023

rewrite VOM LRU cache #7138

Merged

mhofman mentioned this issue Mar 18, 2023

Mainnet block 9074082 took 2-6 minutes to process #7185

Open

erights mentioned this issue Mar 31, 2023

refactor(swingset-liveslots): stateShape causes accessor sharing #7286

Closed

warner closed this as completed Apr 12, 2023

erights mentioned this issue Apr 18, 2023

refactor: stateShape causes accessor sharing #7444

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

change virtual-object state LRU cache: write-through, write-back at end-of-crank, or remove entirely #6693

change virtual-object state LRU cache: write-through, write-back at end-of-crank, or remove entirely #6693

warner commented Dec 17, 2022

mhofman commented Jan 13, 2023

warner commented Jan 20, 2023 •

edited

Loading

warner commented Jan 20, 2023

FUDCo commented Jan 20, 2023 •

edited

Loading

warner commented Jan 20, 2023

mhofman commented Jan 21, 2023

warner commented Jan 21, 2023

mhofman commented Jan 21, 2023

warner commented Mar 8, 2023

warner commented Mar 8, 2023

warner commented Mar 30, 2023

erights commented Mar 31, 2023

warner commented Apr 12, 2023

change virtual-object state LRU cache: write-through, write-back at end-of-crank, or remove entirely #6693

change virtual-object state LRU cache: write-through, write-back at end-of-crank, or remove entirely #6693

Comments

warner commented Dec 17, 2022

What is the Problem Being Solved?

Description of the Design

Security Considerations

Test Plan

Relation to stopVat changes

mhofman commented Jan 13, 2023

warner commented Jan 20, 2023 • edited Loading

warner commented Jan 20, 2023

FUDCo commented Jan 20, 2023 • edited Loading

warner commented Jan 20, 2023

mhofman commented Jan 21, 2023

warner commented Jan 21, 2023

mhofman commented Jan 21, 2023

warner commented Mar 8, 2023

warner commented Mar 8, 2023

warner commented Mar 30, 2023

erights commented Mar 31, 2023

warner commented Apr 12, 2023

warner commented Jan 20, 2023 •

edited

Loading

FUDCo commented Jan 20, 2023 •

edited

Loading