new defineKind API with { state, facets } argument #4905

Closed
warner opened this issue Mar 23, 2022 · 25 comments · Fixed by #4962

warner commented Mar 23, 2022

What is the Problem Being Solved?

To prevent the GC sensitivity of #4892, in today's kernel meeting we came up with a new API for defineKind. The current API looks like:

function init(args) {
  const initialState = {}; // use 'args'
  return initialState;
}
function actualize(state) {
  const oneFacet = { meth1(args) { /* ... */ }, meth2(args) { /* ... */ } };
  // or
  const facets = {
    facetA: { meth1(args) { /* ... */ }, meth2(args) { /* ... */ } },
    facetB: { meth1(args) { /* ... */ }, meth2(args) { /* ... */ } },
  };
  return oneFacet; // or facets
}
function finish(state, cohort) {
  const { facetA } = cohort;
  // do some final wiring
}
const makeFoo = defineKind('foo iface', init, actualize, finish);

In the new API, the third argument to defineKind will be a record of unbound functions, each of which takes an initial argument { state, self } or { state, facets }:

// one facet:
const behavior = {
  meth1: ({ state, self }, ...args) => { /* ... */ },
  meth2: ({ state, self }, ...args) => { /* ... */ },
};
// multiple facets:
const behavior = {
  facetA: {
    meth1: ({ state, facets }, ...args) => { /* ... */ },
    meth2: ({ state, facets }) => { /* ... */ },
  },
  facetB: {
    meth1: ({ state, facets }) => { /* ... */ },
    meth2: ({ state, facets }, ...args) => { /* ... */ },
  },
};

In addition, the fourth argument will be an options bag, with the only currently-accepted option being finish.

const makeFoo = defineKind('foo iface', init, behavior, { finish });

The behavior record is examined during defineKind, and a copy is made of the functions it provides (to ensure that userspace does not get control later, during actualization). Later, when makeFoo(args) is invoked and we need to make a new virtual object cohort, the sequence is:

  • call init(args), get back the initial state record
  • create a state object with getters and setters for each property in the initial state record
  • create a "context" object with initial contents { state, facets: {} } (or just { state } for single-facet kinds)
  • for each facet in behavior (or just the one facet):
    • create a new object (this will become the facet)
    • for each function in the behavior record:
      • create a new function (with Function.prototype.bind) that curries the context object as the first argument
      • add that function as a property of the new facet object
    • add the facet to context.facets (or assign the one facet as context.self)
  • build the cohort object, harden everything
  • call finish if defined
  • return the cohort to the user (return value of makeFoo())
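The sequence above can be sketched as follows. This is illustrative only, not the real VOM internals: `makeCohortSketch` is a hypothetical name, `Object.freeze` stands in for SES `harden`, and the per-property getters/setters on `state` and the exact `finish` signature are elided.

```javascript
// A rough sketch of the cohort-creation sequence (names hypothetical).
const makeCohortSketch = (behavior, init, finish) => (...args) => {
  const state = init(...args);           // call init(args), get the state record
  const context = { state, facets: {} }; // the "context" object
  for (const [facetName, methods] of Object.entries(behavior)) {
    const facet = {};                    // this new object becomes the facet
    for (const [methodName, fn] of Object.entries(methods)) {
      // curry the context object as the first argument
      facet[methodName] = fn.bind(undefined, context);
    }
    Object.freeze(facet);
    context.facets[facetName] = facet;   // add the facet to context.facets
  }
  Object.freeze(context.facets);
  Object.freeze(context);                // harden everything (simplified)
  if (finish) finish(context);           // call finish if defined
  return context.facets;                 // the cohort, returned from makeFoo()
};
```

Because the behavior record is enumerated up front, userspace code never runs during this per-instance construction.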

The goal is to prevent userspace from getting control during actualization, giving it a way to create per-instance objects that could be used to sense GC. The bound methods (and state, and the context object, and the facet objects within) are all per-instance, but they are all visible to the Kind machinery during actualization, so we can establish WeakMap links from them to the cohort record, which means they'll have the same lifetime, which prevents their use as GC sensors. By having userspace provide a table of functions at makeKind time, rather than executing a userspace actualize function at actualize time, we don't give them any opportunity to make new per-instance objects (or to even sense when actualization is taking place).

I think this will let us safely remove the proForma "call actualize anyways and throw out the result" code in the VOM, which would be lovely.

One small surprise for Kind authors is that their methods will have a different signature than the code which invokes them. The first argument is provided by the Kind machinery.

Kinds which simply want to refer to their state should get it by destructuring the first argument:

const behavior = {
  incr: ({ state }, delta) => { state.count += delta; }
};
// counter.incr(delta)

When a kind needs to refer to other facets, it should grab that from facets:

const behavior = {
  depositFacet: {
    deposit: ({ facets }, payment) => facets.purse.deposit(payment),
  },
  purse: {
    withdraw: ({ state }, amount) => { state.balance -= amount; ... },
    deposit: ({ state }, payment) => { ... },
    getDepositFacet: ({ facets }) => facets.depositFacet,
  },
};

If a single-facet virtual object needs to refer to itself, it should use self:

const behavior = {
  subscribe: ({ state, self }) => state.publisher.subscribe(self),
};

Rejected Variants

We considered method: ({ state, ...facets }) => ..., which would shrink some things like getDepositFacet: ({ depositFacet }) => depositFacet by the length of the "facets." prefix. This would preclude the use of state as a facet name, and would complicate the addition of new context properties in the future. We decided that referencing other facets was infrequent enough that it wasn't worth the preclusion.

There may be value to internally representing all Kinds as multi-faceted, even those with a single facet. Independent of that, we considered having makeKind always be multi-facet, so authors of single-facet Kinds would need to pick a facet name. This would remove self from the context argument (it would always have facets, even if there was only one property on it). The Kind definition would then look like:

const behavior = {
  dummyName: {
    method1: ({ state, facets }) => { stuff(state, facets.dummyName /* ... */); },
  },
};

and userspace would look like:

const foo = makeFoo(args).dummyName;

either of which was sufficient to disqualify the idea. Instead, defineKind will look at the shape of behavior to decide whether it creates single-facet objects or multi-facet cohorts. The internals might or might not manage both in the same way, but the behavior record and the return value of makeFoo will be different for the two cases.

We considered using JavaScript's special this variable to hold the { state, facets } context object. This would be closer to the usual prototype-based inheritance (possibly identical), but would require behavior be defined with concise method syntax, precluding the use of => arrow functions (which have other benefits). We hope that "Far Classes" will be implemented eventually, which should avoid the surprising extra context argument and also be stored more efficiently.

We decided that finish is used infrequently enough that a reader of a defineKind definition would benefit from seeing its name appear in the options bag, and that this benefit (plus the general benefit of options bags if/when new arguments are added) justified changing the fourth defineKind argument from an optional finish function to an optional { finish } bag.

Security Considerations

Userspace must not get control during actualization, even if the behavior record they provide is a Proxy or has getters. We must enumerate the contents once, during defineKind, and assert that we get plain Function objects from it. We should store those Functions in the Kind record for use during actualization.

We must double-check that the act of binding a Function is not visible to that Function, perhaps through some Proxy trickery that I'm not aware of (does .toString get called for any reason?).

Test Plan

Unit tests, ideally ones that don't pull in all of swingset, but maybe that's the easiest way to do it.

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Mar 23, 2022

warner commented Mar 23, 2022

closes #4892 once implemented


mhofman commented Mar 24, 2022

but they are all visible to the Kind machinery during actualization, so we can establish WeakMap links from them to the cohort record, which means they'll have the same lifetime, which prevents their use as GC sensors

No need, the methods are binding the context object, so will hold the representatives alive on their own.

  • build the cohort object,

What does the cohort object contain that the context / facets object doesn't ?

We must double-check that the act of binding a Function is not visible to that Function, perhaps through some Proxy trickery that I'm not aware of (does .toString get called for any reason?).

Good call! Function.prototype.bind reaches for the function's name and length. A proxy of a function would be able to sense this!

But we can work around this, by binding it to nothing a first time during defineKind (aka Reflect.apply(Function.prototype.bind, unsafeMethod, [])). Then at every representative creation we just bind the neutered / partially bound function instead of the original potentially unsafe one.

Btw, that also disqualifies using this to pass the context, at least if we want to use the native bind to implement this feature. We can always do a user land bind by closure:

const bind = (fn, context) => (...args) => Reflect.apply(fn, null, [context, ...args]);

We should think about restoring length in the custom bind case, and restoring name in both cases (it'd have no name in the custom bind case, and bound bound originalName in the native bind case).
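A sketch of the closure-based bind above with `name` and `length` restored afterwards. This works because both are configurable (though non-writable) own properties of functions, so `Object.defineProperties` can replace them; `bindContext` is a hypothetical name.

```javascript
// Closure-based bind that curries `context` and repairs name/length.
const bindContext = (fn, context) => {
  const bound = (...args) => Reflect.apply(fn, null, [context, ...args]);
  Object.defineProperties(bound, {
    name: { value: fn.name, configurable: true },
    // one parameter (the context) is curried away
    length: { value: Math.max(0, fn.length - 1), configurable: true },
  });
  return bound;
};
```

Unlike native Function.prototype.bind, this never touches the original function's properties at call time, so a proxied method cannot sense each per-instance binding.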


warner commented Mar 24, 2022

but they are all visible to the Kind machinery during actualization, so we can establish WeakMap links from them to the cohort record, which means they'll have the same lifetime, which prevents their use as GC sensors

No need, the methods are binding the context object, so will hold the representatives alive on their own.

I'm seeing two problems, and I only see a solution to one of them.

The first is using === on the bound method as the predicate.

If userspace holds a bound method foo, that will keep the unbound method and the context object alive. The context object will keep state and the facets record alive. The values of facets will keep all facets alive. So if they hold onto foo for their comparison any new deserialization of the vref will get them the same (still-living) cohort and Representative, and the comparison will always be true.

Holding foo won't keep bar alive. I was originally worried that they might hold foo but compare bar, but of course they can't use === for that without holding bar. I was thinking we should point from the cohort to all the bound methods to protect against this, but now I don't think that helps anything.

The second problem is using a WeakMap/WeakSet as a predicate, and I don't see an answer.

My concern is if userspace puts a bound method foo into a WeakMap, drops everything else, gets a (maybe new) Representative by unserializing some virtual data, pulls the corresponding bound method out of the new Representative, and then queries the WeakMap with the (maybe new, maybe old) bound method foo.

We blocked this for Presences and Representatives by changing WeakMap/WeakSet, so instance 1 and 2 of a Presence (for the same vref) are treated as the same key. This also lined up with virtual storage, since all the objects that need this sort of mapping also have vrefs.

But bound methods foo don't have vrefs. Our current VirtualObjectAwareWeakMap would treat them as normal ("precious") objects, and instance 1 and 2 would be distinct keys. Userspace could probe instance 2 without holding instance 1, and that would enable GC sensitivity.

Come to think of it, the state object enables this problem too. Making it have the same lifetime as the facets protects us against the === predicate, but doesn't help with the WeakSet predicate.

One (bad) fix would be to hack WeakMap/WeakSet to forbid all per-instance objects that come out of Kinds, both state and the context object and the facets object within that, and all of the bound methods. Just throw an error if you try to use them as a key.

Another (worse) fix would be to figure out how to assign vrefs to all those things. I could vaguely imagine a scheme to do that for the bound methods, if I squint hard, but I'd be hard pressed to come up with a sensible one for the state object, or the other internal objects which don't normally need names but are still per-instance.

@warner warner changed the title new makeKind API with { state, facets } argument new defineKind API with { state, facets } argument Mar 24, 2022

mhofman commented Mar 24, 2022

One (bad) fix would be to hack WeakMap/WeakSet to forbid all per-instance objects that come out of Kinds, both state and the context object and the facets object within that, and all of the bound methods. Just throw an error if you try to use them as a key.

Meh, I think that's a fine approach. And the dirty way to do this is to brand all these objects with a symbol that WeakMap/WeakSet can recognize (as long as these objects are all frozen, and created by us, which I believe is the case)
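The "recognize and refuse" idea can be sketched like this, using a module-private WeakSet as the brand rather than a symbol. All names here are hypothetical; the real VirtualObjectAwareWeakSet has more machinery (vref awareness in particular).

```javascript
// Module-private brand: only our own code can mark objects.
const unweakables = new WeakSet();
const markUnweakable = obj => { unweakables.add(obj); return obj; };

// An aware collection that rejects branded objects as weak keys.
class AwareWeakSetSketch {
  #weak = new WeakSet();
  add(key) {
    if (unweakables.has(key)) {
      throw new TypeError('per-instance virtual object parts may not be weak keys');
    }
    this.#weak.add(key);
    return this;
  }
  has(key) { return this.#weak.has(key); }
}
```

Note that WeakSet membership works fine on frozen objects, so the brand can be applied before hardening the cohort.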


warner commented Mar 24, 2022

Yeah, I think I agree. @erights would be interested to hear about your comfort level with that.


FUDCo commented Mar 24, 2022

I propose the brand be called unweakable just because.


erights commented Mar 24, 2022

Since you're just "taking the temperature" at the moment, I'll say I do not like it. I am very uncomfortable with it.


mhofman commented Mar 24, 2022

Which part, the dirty symbol, or preventing these objects from being used as weak keys?
If the performance of XS with WeakSet is acceptable, we can forget about a dirty unweakable symbol.


erights commented Mar 24, 2022

preventing these objects from being used as weak keys?

That part. I'm much less concerned about how it is implemented.

@Tartuffo Tartuffo added this to the Mainnet 1 milestone Mar 24, 2022

FUDCo commented Mar 24, 2022

I think I figured out how to take care of the problem Brian identified. The trick does involve being able to identify the unweakable objects, though it doesn't care if that's done via a symbol or a WeakSet or some other way. We modify VirtualObjectAwareWeakMap and VirtualObjectAwareWeakSet similar to how Brian suggested above, but rather than throwing an exception when somebody tries to insert an unweakable object as a key, we go ahead and insert it like any other key, but we also insert it into a separate Set that we keep on the side (a regular strong set). If they delete the key, we also delete it from this Set. If the containing collection is itself GC'd, the side Set gets GC'd as part of that and if it contains the last surviving reference to the unweakable object then it in turn becomes eligible for GC, but now in a state where nobody is left in a position to see it happening.

This can be done by adding only a tiny bit of code to VirtualObjectAwareWeakMap and VirtualObjectAwareWeakSet. If we make creation of the side Set lazy (i.e., only bother to allocate it the first time somebody tries to actually store an unweakable in the associated collection), then in the overwhelming majority of cases where nobody is trying to do anything tricky it adds essentially no overhead to the virtual object aware collection instances.
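A minimal sketch of this side-Set variant, including the lazy allocation. `isUnweakable` stands in for whichever marking mechanism (symbol or WeakSet) ends up chosen, and the class name is illustrative, not the actual liveslots code.

```javascript
// WeakMap wrapper that strongly retains "unweakable" keys on the side.
class AwareWeakMapSketch {
  #weak = new WeakMap();
  #heldUnweakables = null; // allocated lazily, on the first unweakable key
  #isUnweakable;
  constructor(isUnweakable) { this.#isUnweakable = isUnweakable; }
  set(key, value) {
    if (this.#isUnweakable(key)) {
      if (!this.#heldUnweakables) this.#heldUnweakables = new Set();
      this.#heldUnweakables.add(key); // strong ref: key lives as long as this map
    }
    this.#weak.set(key, value);
    return this;
  }
  get(key) { return this.#weak.get(key); }
  has(key) { return this.#weak.has(key); }
  delete(key) {
    if (this.#heldUnweakables) this.#heldUnweakables.delete(key);
    return this.#weak.delete(key);
  }
}
```

If the whole collection is GC'd, the side Set goes with it, so the unweakable keys become collectable only when nobody is left to observe it.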

I think the only open design question this leaves is the actual mechanism to mark an object as unweakable (and, I guess, being correct about which of the various objects involved in the constellation of objects making up a VO actually qualify as problematic and thus need to be so marked). I think I lean towards the WeakSet approach, but I'll leave that to those who have thought more deeply about the tradeoffs involved.


warner commented Mar 24, 2022

Hm, ok, so in that case unweakable means "if you use this as a key in a WeakMap, we'll keep a strong reference to it anyways", as opposed to the earlier meaning of "we reject attempts to use this as a key in a WeakMap". Gotcha.

Does that help with the attack sketched out in #4892 (comment), namely:

const ws = new WeakSet();
function one(pres) {
  ws.add(pres);
  ws.add(Object.getPrototypeOf(pres));
}
// time passes, GC happens or does not
// caller sends the same vref
function two(pres) {
  ws.has(pres); // always true, because WeakSet is vref-aware
  if (ws.has(Object.getPrototypeOf(pres))) { // varies
    console.log(`GC did not happen`);
  } else {
    console.log(`GC *did* happen`);
  }
}

Oh.. ok, to mount the attack they must put the extra object in a WeakMap, and if they do that, the extra object is held strongly, which means the primary object (Presence/Representative) is also held strongly, so the second time we deserialize, they get back the same Presence/Representative as they did before, which means they get back the same extra object. Nice! I'll have to leave my devious hat on for a while longer and see if I can come up with another vector, but I think that'll work.


mhofman commented Mar 24, 2022

a separate Set that we keep on the side

What holds the Set? The VirtualObjectAwareWeakMap/VirtualObjectAwareWeakSet instance? That means we leak memory when those objects are used as Weak keys?

I'm concerned about such silent memory leaks, especially when I believe there is a way to make it not leak.

Once we recognize those objects, likely through a WeakMap populated when representatives are created, we can handle them similarly to the base virtual object / facet recognition in VirtualObjectAwareWeakMap/Set. The trick is to name all the methods, state etc. relative to the base virtual object / facet, e.g. "vo+5/facetName!methodName" or "vo+5![[State]]".

There is a little bit of Hilbert hotel to take care of if using the user provided names, but I believe we can make it work. One version of a non-Hilbert-based mapping would be to assign an index to every methodName, state, etc. The tricky part in this case is regarding upgrades and durable virtual objects. We need to save that mapping permanently (one per durable kind "brand"), and never remove entries from the mapping if an upgrade removes a method from a durable object (if it's allowed to do so in the first place).
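One hypothetical way to derive collision-safe per-part names: since facet and method names are user-chosen strings that could contain the separator characters, escape them with JSON.stringify rather than concatenating raw. The `partName` helper and the exact format are illustrative only, not the real naming scheme.

```javascript
// Derive a unique vref-space name for one part of a virtual object cohort.
// baseRef is system-generated (e.g. "vo+5"), so only the user-chosen
// facet/method names need escaping.
const partName = (baseRef, facetName, methodName) =>
  `${baseRef}!${JSON.stringify(facetName)}!${JSON.stringify(methodName)}`;
```

Because JSON.stringify is injective on strings and quotes the user-chosen parts, a facet named "a!b" cannot collide with a method named "b!m" on a facet named "a".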


FUDCo commented Mar 24, 2022

The Set would be held by the weak collection and hold just that collection's unweakable keys. It's not so much a storage leak as maintaining a fiction that "I guess somebody else must also be holding onto that". I don't think doing some kind of complicated dance with naming things helps, because ensuring non-detectability of GC devolves to hanging onto the same things for the same lifetimes regardless of how that's accomplished.

Also, I'm frankly not worried too much about leakage mostly because I can't think of any good reason (aside from shenanigans) that you'd ever use one of these objects as a weak key. We just want to ensure that nothing breaks if somebody does. Ensuring that nothing breaks seems important, but working hard to ensure that something almost nobody would ever do (and probably should never do) is efficient seems less so.


warner commented Mar 24, 2022

a separate Set that we keep on the side

What holds the Set? The VirtualObjectAwareWeakMap/VirtualObjectAwareWeakSet instance? That means we leak memory when those objects are used as Weak keys?

Right. And we'd tell userspace "don't do that, you sneaky fool", but if they do anyways, they still can't sense GC.

I'm concerned about such silent memory leaks, especially when I believe there is a way to make it not leak.

Hm. I'm on the fence about that. It's a tradeoff between that self-induced memory usage and the complexity of defining a consistent name for each of these extra components (and storing them properly).

Let's see, the set of additional components is enumerable at defineKind time, when we see the set of facet names and their method names. We need unique names for everything despite userspace being able to name facets and methods with any legal string (so e.g. hierarchy should put the word "prototype" higher than the facet/method names, and we need a reversible concatenation for (baseref, facet name, method name)). And then we maintain a WeakMap from the component objects to their storage name. We only add (or discover the name is already in) this WeakMap during deserialization (when we make the Representative cohort or Presence), and deserialization is run unmetered, so the metering variation between already-present and not-yet-present is ok.

So the naming question is solvable.

For storage.. when you use a Presence or Representative as a key in a VirtualObjectAwareWeakMap, instead of using a WeakMap keyed by the Presence, we use a strong Map keyed by the vref. This doesn't keep the Presence alive, but the entry (and value) will remain until liveslots (actually virtualReferences.js) does something to explicitly remove it. We have a whole pile of refcounting code to detect when the virtual object is no longer alive (triggered by dispatch.dropExport, or a FinalizationRegistry, or by the virtualized data refcount dropping to zero), and to remove the vref from the strong Map only when all three pillars are gone.

The corresponding approach for these unweakables would be to take the component object's assigned name and use that in the same Map. But our GC machinery doesn't currently know about these additional component objects, so we'd have to define what it means for the abstract object to be deleted, and prune the Map at that same time.

Most of the extra objects can't be Far: the state is disqualified because of its getters, the prototype is (maybe?) disqualified because of the null prototype. The cohort record is pass-by-copy. The bound methods are not currently Far-able but we'd like to fix that (related to #61). If they are all strictly non-Far, then there's no export or virtualized-data pillar for these objects, leaving only the in-RAM pillar.

So let's see, we'd need a FinalizationRegistry for all of them, using the derived name as the "held value"/cookie, and a table that remembers which VirtualObjectAwareWeakMap/Set each one was added to (Map<name, Set<VirtualObjectAwareWeakMap | VOAWeakSet>>). When the FR fires, we look up the set of maps/sets and delete the name from each of them. The FR can only fire during BringOutYourDead, so the metering divergence shouldn't be a problem.

We still need the cross-object cycles to ensure that new deserializations get the same extra objects as the original, to block ===.

If we ever make plain functions Far-able... I guess we don't need an export pillar. The bound method will look more like a Remotable than part of a virtual object facet (and it will get a new vref). It's "precious" and therefore not Durable, but it can still be exported or stored in virtual (but not durable) data. While it's exported or stored that way, our vreffedRemotables Map will keep a strong reference, which means we'll keep the "extra component object -> derived name" WeakMap entry alive, and we'll recognize it correctly when submitted to a VOAWeakMap/Set.

Ok, so at least so far, this seems sound. I have to say that I'm worried about the extra complexity, and if we can get away with the simpler "don't do that, you're only hurting yourself" instructions, I'd prefer it. Both solutions require tracking all the extra objects. The simpler one needs a global WeakSet, and an auxiliary strong Set inside each VOAWeakMap/Set, and userspace which attempts this trick will consume extra memory (and their weakmap values won't go away when they expect). The larger one needs a global WeakMap, the name derivation code (defensive against userspace format-confusion naming attacks), a global "name to VOAWeakMap/Set that uses it" table, a FinalizationRegistry from particular extra objects to their name, a handler which takes the name and the table and does the deletion, and changes to VOAWeakMap/Set (specifically vrm.vrefKey) to recognize these extra objects and use their name instead.

The tricky part in this case is regarding upgrades and durable virtual objects. We need to save that mapping permanently (one per durable kind "brand"), and never remove entries from the mapping if an upgrade removes a method from a durable object (if it's allowed to do so in the first place).

I don't think we do: these extra objects aren't Durable, so I don't expect the extra names to be seen in the DB at all. The only place I'd expect to see these names are in the WeakMap that assigns them to each extra object, and in the strong Map which holds the VOAWeakMap/Set entries keyed by those objects. So I think upgrade isn't an issue.


mhofman commented Mar 24, 2022

I'm not following, I'm probably missing something, but why do we need so much complexity for GC here? Use the name as a key in a Map, and when the base virtual object goes away (because it can only be recognized but not reached, whether remotely or locally), delete the map entry. No need to track each part explicitly; keep the tracking to the base virtual object, since all of the parts form a cohort.

The bound methods and other objects are all things the virtual object system creates, so besides the facets, nothing should be Far and user-space should not be able to make any of them Far (because already hardened).

these extra objects aren't Durable,

But they're related to a base durable object. Basically what I'm saying is that a durable virtual object should remember information about the shape of its facets across versions in case the user decides to change the shape during an upgrade.


FUDCo commented Mar 24, 2022

I'm not following, I'm probably missing something, but why do we need so much complexity for GC here.

Which complexity are you referring to? GC for the virtual objects and collections generally (which is admittedly very complicated but also relatively mature as our stuff goes) or the wrinkle we're having to add here to avoid GC visibility (which is sufficiently simple that I can literally implement it in less time than it takes to explain it)?


warner commented Mar 24, 2022

Use the name as a key in a Map, and when the base virtual object goes away (because it can only be recognized but not reached, whether remotely or locally), delete the map entry.

We're saying that we change vrefKey to use the component object's name in the VirtualObjectAwareWeakMap's strong Map (the same one that's keyed by vrefs when someone uses a Presence or Representative as a key), right?

How do we remember that it was added to those Maps, so that we can delete it from the correct ones later?

For Presences/Representatives, each VOAWeakMap/Set .init() will update a global table that maps the string key to the set of collections that now know that vref as a key. And we learn that the base virtual object goes away because of the finalizers, which gives us a set of vrefs to look up in that global table.

We could add a mapping that says "if base virtual object X goes away, here's a list of all the derived names of the related extra objects that should also go away". That would tell us what names to look for in each of the per-VOAWM/Ss that might include it.

But we don't know which VOAMW/Ss might include it, so we need a table to track those.

The bound methods and other objects are all things the virtual object system creates, so besides the facets, nothing should be Far and user-space should not be able to make any of them Far (because already hardened).

Being hardened doesn't prevent something from being made Far. Only being Far already would block that.


mhofman commented Mar 24, 2022

I'm referring to complexity that Brian is describing above. It seems like a lot of custom logic, and I don't see why we can't adapt what's already in place for tracking the virtual object facets, since all the methods, facets, and related objects form a single cohort, which ought to be the thing we're tracking.


mhofman commented Mar 24, 2022

How do we remember that it was added to those Maps, so that we can delete it from the correct ones later?

The same way that the facets / virtual objects are linked to VirtualObjectAwareWeakMap instances when added? We only need to remember the base name once. Then we can maintain an internal map in each VirtualObjectAwareWeakMap from base name to the set of fully qualified names that exist within. Actually, the existing Map of vrefs should just become a Map of Maps; then you can just nuke the outer map entry.

Being hardened doesn't prevent something from being made Far.

Far injects a proto in the chain, so a hardened object cannot be made Far after the fact.
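This claim can be checked directly: a frozen (hardened) object is non-extensible, so its [[SetPrototypeOf]] refuses any new prototype. In this sketch, `Object.freeze` stands in for SES `harden`, and `farMarker` is a made-up placeholder for whatever proto Far would inject.

```javascript
// A hardened method object cannot later have a prototype injected.
const method = Object.freeze(function meth() {});
const changed = Reflect.setPrototypeOf(method, { farMarker: true });
// `changed` is false: OrdinarySetPrototypeOf refuses on a non-extensible
// object unless the "new" prototype is identical to the current one.
// Object.setPrototypeOf would throw a TypeError instead of returning false.
```

This is exactly the kind of assumption the later comment suggests pinning down with a regression test.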


FUDCo commented Mar 24, 2022

It seems like a lot of custom logic

What custom logic? The thing you describe introduces a vast amount of new custom logic whereas the proposal on the table introduces nearly none.

I'm definitely confused here.


warner commented Mar 24, 2022

Being hardened doesn't prevent something from being made Far.

Far injects a proto in the chain, so a hardened object cannot be made Far after the fact.

Oh, right.. thanks, I'd completely missed that.


mhofman commented Mar 25, 2022

Far injects a proto in the chain, so a hardened object cannot be made Far after the fact.

Oh, right.. thanks, I'd completely missed that.

That said, we should have a test that asserts the failure of trying to make one of those objects Far, in case the way Far works changes in the future, and our assumption is broken.


warner commented Mar 25, 2022

Ok, we spent an hour talking about this. The conclusions are:

  • write up @mhofman 's approach, for possible use later
  • implement @FUDCo 's approach now

@FUDCo 's approach is:

  • add all the "extra objects" created in the course of making a Presence or a Representative into a single WeakSet, tentatively named allUnweakables
    • this includes: method/Function objects, state, and the prototype of all Facets and Presences
    • it must not include: Facets and Presences
    • it must include every identity-bearing Object that userspace can access via Facets or Presences
    • I had previously thought the record of facets provided as behavior fell into this category, but I was wrong: it is only created once per virtual object, not once per deserialization. When makeRepresentative creates a record full of bound facets, we use that record to create the in-memory "cohort" (an Array), but we do not reveal it to userspace, so it does not need to be protected this way.
  • add enough WeakMap references between the extra objects to ensure the existence of a reference cycle between all of them, to prevent the === distinguisher
  • each VirtualObjectAwareWeakMap/Set will create an additional Set, maybe named heldUnweakables
  • VirtualObjectAwareWeakMap.init and VirtualObjectAwareWeakSet.add will be changed to test if key is in allUnweakables, and if so, add key to heldUnweakables
  • .delete will be changed to remove key from heldUnweakables
  • by retaining the entire collection as long as any member is a key of a WeakMap/Set, userspace cannot build a WeakMap/Set predicate that will ever return false

In @FUDCo's approach, when userspace does something foolish with the extra objects, their WeakMap/Set will consume extra RAM, and (perhaps surprisingly to userspace authors) will keep the related WeakMap values alive longer than expected. We think userspace shouldn't be doing this, so we don't want to spend implementation time/complexity on making it more efficient.

A weaker/simpler form of @mhofman 's approach (after we talked about it some more) is:

  • assign a vref-space name to all the "extra objects", in addition to the existing vrefs for Presence and Facets
    • make sure name collisions cannot be caused by sneakily controlling the names of facets and methods
  • map the objects to their names in a global WeakMap tentatively named weakTrackingNames
  • change virtualReferences.js vrefKey() to check weakTrackingNames in addition to getSlotForVal
    • therefore VirtualObjectAwareWeakMap.has/get/set/delete() will use the name instead of (e.g.) the state object's identity
  • change VirtualObjectAwareWeakMap.set/delete to call vrm.addRecognizableName instead of vrm.addRecognizableValue
  • implement virtualReferences.js addRecognizableName(name, recognizer) to use the name in vrefRecognizers instead of deriving a vref from a value. Make sure it only gets called with Presence/Representative vrefs and the extra-object names, not Remotables or Promises.
  • update removeRecognizableVref in the same way
  • change virtualReferences.js ceaseRecognition(): when given a baseref, derive all of the extra object names (facets, state, method objects, prototypes), instead of merely deriving all the facet vrefs, and call ceaseRecognition on all of them

This approach will retain WeakMap entries as long as any member of the cohort or any of the extra objects are in memory, but once the last one is dropped, all those entries will be deleted. ceaseRecognition() will query vrefRecognizers with more names than before (roughly two per facet and one per method, vs the previous one per facet).

The biggest complexity of the latter approach is creating names safely (without collisions), and changing addRecognizableValue to addRecognizableName.
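A rough sketch of the name-mapping half under an assumed naming convention (baseref plus a `:kind` suffix; the helper names, the suffix scheme, and `extraNamesFor` are all illustrative, and the real scheme must be collision-safe as noted above):

```javascript
// Tentative name from the discussion: extra object -> synthetic vref-space name.
const weakTrackingNames = new WeakMap();

// Register an extra object (state, method object, prototype) under a name
// derived from its virtual object's baseref. Hypothetical helper.
function registerExtraObject(obj, baseref, kind) {
  weakTrackingNames.set(obj, `${baseref}:${kind}`);
}

// vrefKey: return the tracking name for an extra object, or undefined for
// ordinary objects. The real vrefKey() would also consult getSlotForVal().
function vrefKey(value) {
  return weakTrackingNames.get(value);
}

// For ceaseRecognition(baseref): derive every extra-object name from the
// baseref so each can be dropped from vrefRecognizers. Sketched here as
// name generation only, with an invented per-facet prototype suffix.
function extraNamesFor(baseref, facetCount) {
  const names = [`${baseref}:state`];
  for (let i = 0; i < facetCount; i += 1) {
    names.push(`${baseref}:proto-${i}`);
  }
  return names;
}
```

With this in place, `VirtualObjectAwareWeakMap.set/delete` can hand the stable name (rather than the object) to `addRecognizableName`/`removeRecognizableVref`, so recognition survives the extra objects being dropped from memory and regenerated.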


warner commented Mar 28, 2022

I think I figured out how to take care of the problem Brian identified. The trick does require being able to identify the unweakable objects, though it doesn't matter whether that's done via a symbol, a WeakSet, or some other way. We modify VirtualObjectAwareWeakMap and VirtualObjectAwareWeakSet much as Brian suggested above, but rather than throwing an exception when somebody tries to insert an unweakable object as a key, we insert it like any other key and also add it to a separate strong Set that we keep on the side. If they delete the key, we also delete it from this Set. If the containing collection is itself GC'd, the side Set is GC'd along with it; if that Set held the last surviving reference to the unweakable object, the object in turn becomes eligible for GC, but only once nobody is left in a position to observe it happening.

I forget if I wrote it down already, but @FUDCo and I were talking and we figure this strong Set should remove the need for the extra pro forma call to makeRepresentative when unserialize is called on a virtual-object vref for which we already have a Representative available. It might also let us avoid the extra unserialize we do inside each getter when someone reaches for a state property. But before we remove either, we need to reconstruct the reasons we had for doing them and make sure it's still safe. (I know it involved hiding GC and LRU-cache activity from userspace, and I know the calls to the user-provided actualize function acted as a sensor, but I don't remember if metering was also involved.)


FUDCo commented Mar 28, 2022

The reason for the pro forma makeRepresentative call was to keep the actualize function from being usable to detect when a representative was being created: since it was called every time a representative might be created, it revealed nothing about whether one actually was. By getting rid of the actualize function we eliminate the problem that the pro forma hack was the solution to.
