-
Notifications
You must be signed in to change notification settings - Fork 208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new defineKind API with { state, facets } argument #4905
Comments
closes #4892 once implemented |
No need, the methods are binding the context object, so will hold the representatives alive on their own.
What does the cohort object contain that the context / facets object doesn't ?
Good call! But we can work around this, by binding it to nothing a first time during Btw, that also disqualifies using const bind = (fn, context) => (...args) => Reflect.apply(fn, null, [context, ...args]); We should think about restoring |
I'm seeing two problems, and I only see a solution to one of them. The first is using If userspace holds a bound method Holding The second problem is using a WeakMap/WeakSet as a predicate, and I don't see an answer. My concern is if userspace puts a bound method We blocked this for Presences and Representatives by changing WeakMap/WeakSet, so instance 1 and 2 of a Presence (for the same vref) are treated as the same key. This also lined up with virtual storage, since all the objects that need this sort of mapping also have vrefs. But bound methods Come to think of it, the One (bad) fix would be to hack WeakMap/WeakSet to forbid all per-instance objects that come out of Kinds, both Another (worse) fix would be to figure out how to assign vrefs to all those things. I could vaguely imagine a scheme to do that for the bound methods, if I squint hard, but I'd be hard pressed to come up with a sensible one for the |
Meh, I think that's a fine approach. And the dirty way to do this is to brand all these objects with a symbol that WeakMap/WeakSet can recognize (as long as these objects are all frozen, and created by us, which I believe is the case) |
Yeah, I think I agree. @erights would be interested to hear about your comfort level with that. |
I propose the brand be called |
Since you're just "taking the temperature" at the moment, I'll say I do not like it. I am very uncomfortable with it. |
Which part, the dirty symbol, or preventing these objects from being used as weak keys? |
That part. I'm much less concerned about how it is implemented. |
I think I figured out how to take care of the problem Brian identified. The trick does involve being able to identify the This can be done adding only a tiny bit of code to I think the only open design question this leaves is the actual mechanism to mark an object as |
Hm, ok, so in that case Does that help with attack sketched out in #4892 (comment) , namely: const ws = new WeakSet();
function one(pres) {
ws.add(pres);
ws.add(Object.getPrototypeOf(pres));
}
// time passes, GC happens or does not
// caller sends the same vref
function two(pres) {
ws.has(pres); // always true, because WeakSet is vref-aware
if (ws.has(Object.getPrototypeOf(pres)) { // varies
console.log(`GC did not happen`);
} else {
console.log(`GC *did* happen`);
}
} Oh.. ok, to mount the attack they must put the extra object in a WeakMap, and if they do that, the extra object is held strongly, which means the primary object (Presence/Represenative) is also held strongly, so the second time we deserialize, they get back the same Presence/Representative as they did before, which means they get back the same extra object. Nice! I'll have to leave my devious hat on for a while longer and see if I can come up with another vector, but I think that'll work. |
What holds the Set? The I'm concerned about such silent memory leaks, especially when I believe there is a way to make it not leak. Once we recognize those objects, likely through a There is a little bit of Hilbert hotel to take care of if using the user provided names, but I believe we can make it work. One version of a non-hilbert based mapping would be to assign an index to every methodName, state, etc. The tricky part in this case is regarding upgrades and durable virtual objects. We need to save that mapping permanently (one per durable kind "brand"), and never remove entries from the mapping if an upgrade removes a method from a durable object (if it's allowed to do so in the first place). |
The Set would be held by the weak collection and hold just that collection's Also, I'm frankly not worried too much about leakage mostly because I can't think of any good reason (aside from shenanigans) that you'd ever use one of these objects as a weak key. We just want to ensure that nothing breaks if somebody does. Ensuring that nothing breaks seems important, but working hard to ensure that something almost nobody would ever do (and probably should never do) is efficient seems less so. |
Right. And we'd tell userspace "don't do that, you sneaky fool", but if they do anyways, they still can't sense GC.
Hm. I'm on the fence about that. It's a tradeoff between that self-induced memory usage and the complexity of defining a consistent name for each of these extra components (and storing them properly). Let's see, the set of additional components is enumerable at So the naming question is solvable. For storage.. when you use a Presence or Representative as a key in a The corresponding approach for these Most of the extra objects can't be So let's see, we'd need a FinalizationRegistry for all of them, using the derived name as the "held value"/cookie, and a table that remembers which We still need the cross-object cycles to ensure that new deserializations get the same extra objects as the original, to block If we ever make plain functions Ok, so at least so far, this seems sound. I have to say that I'm worried about the extra complexity, and if we can get away with the simpler "don't do that, you're only hurting yourself" instructions, I'd prefer it. Both solutions require tracking all the extra objects. The simpler one needs a global WeakSet, and an auxilliary strong Set inside each VOAWeakMap/Set, and userspace which attempts this trick will consume extra memory (and their weakmap values won't go away when they expect). The larger one needs a global WeakMap, the name derivation code (defensive against userspace format-confusion naming attacks), a global "name to VOAWeakMap/Set that uses it" table, a FinalizationRegistry from particular extra objects to their name, a handler which takes the name and the table and does the deletion, and changes to VOAWeakMap/Set (specifically
I don't think we do: these extra objects aren't Durable, so I don't expect the extra names to be seen in the DB at all. The only place I'd expect to see these names are in the WeakMap that assigns them to each extra object, and in the strong Map which holds the VOAWeakMap/Set entries keyed by those objects. So I think upgrade isn't an issue. |
I'm not following, I'm probably missing something, but why do we need so much complexity for GC here. Use the name as a key in a The bound methods and other objects are all things the virtual object system creates, so besides the facets, nothing should be
But they're related to a base durable object. Basically what I'm saying is that a durable virtual object should remember information about the shape of its facets across versions in case the user decides to change the shape during an upgrade. |
Which complexity are you referring to? GC for the virtual objects and collections generally (which is admittedly very complicated but also relatively mature as our stuff goes) or the wrinkle we're having to add here to avoid GC visibility (which is sufficiently simple that I can literally implement it in less time than it takes to explain it)? |
We're saying that we change How to we remember that it added to those Maps, so that we can delete it from the correct ones later? For Presences/Representatives, each VOAWeakMap/Set We could add a mapping that says "if base virtual object X goes away, here's a list of all the derived names of the related extra objects that should also go away". That would tell us what names to look for in each of the per-VOAWM/Ss that might include it. But we don't know which VOAMW/Ss might include it, so we need a table to track those.
Being hardened doesn't prevent something from being made Far. Only being Far already would block that. |
I'm referring to complexity that Brian is describing above. It seems like a lot of custom logic, and I don't see why we can't adapt what's already in place for tracking the virtual object facets, since all the methods, facets, and related objects form a single cohort, which ought to be the thing we're tracking. |
The same way that the facet / virtual object are linked to a
Far injects a proto in the chain, so a hardened object cannot be made |
What custom logic? The thing you describe introduces a vast amount of new custom logic whereas the proposal on the table introduces nearly none. I'm definitely confused here. |
Oh, right.. thanks, I'd completely missed that. |
That said, we should have a test that asserts the failure of trying to make one of those objects |
Ok, we spent an hour talking about this. The conclusions are: @FUDCo 's approach is:
In @FUDCo's approach, when userspace does something foolish with the extra objects, their WeakMap/Set will consume extra RAM, and (perhaps surprisingly to userspace authors) will keep the related WeakMap values alive longer than expected. We think userspace shouldn't be doing this, so we don't want to spend implementation time/complexity on making it more efficient. A weaker/simpler form of @mhofman 's approach (after we talked about it some more) is like
This approach will retain WeakMap entries as long as any member of the cohort or any of the extra objects are in memory, but once the last one is dropped, all those entries will be deleted. The biggest complexity of the latter approach is creating names safely (without collisions), and changing |
I forget if I wrote it down already, but @FUDCo and I were talking and we figure this strong Set should remove the need for making the extra |
The reason for the pro forma |
What is the Problem Being Solved?
To prevent the GC sensitivity of #4892, in today's kernel meeting we came up with a new API for
defineKind
. The current API looks like:In the new API, the third argument to
defineKind
will be a record of unbound functions, each of which takes an initial argument{ state, self }
or{ state, facets }
:In addition, the fourth argument will be an options bag, with the only currently-accepted option being
finish
.The
behavior
record is examined duringdefineKind
, and a copy is made of the functions it provides (to ensure that userspace does not get control later, during actualization). Later, whenmakeFoo(args)
is invoked and we need to make a new virtual object cohort, the sequence is:init(args)
, get back the initial state recordstate
object with getters and setters for each property in the initial state record{ state, facets: {} }
(or just{ state }
for single-facet kinds)behavior
(or just the one facet):Function.prototype.bind
) that curries the context object as the first argumentcontext.facets
(or assign the one facet ascontext.self
)finish
if definedmakeFoo()
)The goal is to prevent userspace from getting control during actualization, giving it a way to create per-instance objects that could be used to sense GC. The bound methods (and
state
, and the context object, and the facet objects within) are all per-instance, but they are all visible to the Kind machinery during actualization, so we can establish WeakMap links from them to the cohort record, which means they'll have the same lifetime, which prevents their use as GC sensors. By having userspace provide a table of functions atmakeKind
time, rather than executing a userspaceactualize
function at actualize time, we don't give them any opportunity to make new per-instance objects (or to even sense when actualization is taking place).I think this will let us safely remove the
proForma
"callactualize
anyways and throw out the result" code in the VOM, which would be lovely.One small surprise for Kind authors is that their methods will have a different signature than the code which invokes them. The first argument is provided by the Kind machinery.
Kinds which simply want to refer to their state should get it by destructuring the first argument:
When a kind needs to refer to other facets, it should grab that from
facets
:If a single-facet virtual object needs to refer to itself, it should use
self
:Rejected Variants
We considered
method: ({ state, ...facets }) => ...
, which would shrink some things likegetDepositFacet: ({ depositFacet }) => depositFacet
by the length offacets.
. This would preclude the use ofstate
as a facet name, and would complicate the addition of new context properties in the future. We decided that referencing other facets was infrequent enough that it wasn't worth the preclusion.There may be value to internally representing all Kinds as multi-faceted, even those with a single facet. Independent of that, we considered having
makeKind
always be multi-facet, so authors of single-facet Kinds would need to pick a facet name. This would removeself
from the context argument (it would always havefacets
, even if there was only one property on it). The Kind defintion would then look like:and userspace would look like:
either of which were sufficient to disqualify the idea. Instead,
defineKind
will look at the shape ofbehavior
to decide whether it creates single-facet objects or multi-facet cohorts. The internals might or might not manage both in the same way, but thebehavior
record and the return value ofmakeFoo
will be different for the two cases.We considered using JavaScript's special
this
variable to hold the{ state, facets }
context object. This would be closer to the usual prototype-based inheritance (possibly identical), but would requirebehavior
be defined with concise method syntax, precluding the use of=>
arrow functions (which have other benefits). We hope that "Far Classes" will be implemented eventually, which should avoid the surprising extra context argument and also be stored more efficiently.We decided that
finish
is used infrequently enough that a reader of adefineKind
definition would benefit from seeing its name appear in the options bag, and that this benefit (plus the general benefit of options bags if/when new arguments are added) justified changing the fourthdefineKind
argument from an optionalfinish
function to an optional{ finish }
bag.Security Considerations
Userspace must not get control during actualization, even if the behavior record they provide is a
Proxy
or has getters. We must enumerate the contents once, duringdefineKind
, and assert that we get plainFunction
objects from it. We should store those Functions in the Kind record for use during actualization.We must double-check that the act of binding a Function is not visible to that Function, perhaps through some
Proxy
trickery that I'm not aware of (does.toString
get called for any reason?).Test Plan
Unit tests, ideally ones that don't pull in all of swingset, but maybe that's the easiest way to do it.
The text was updated successfully, but these errors were encountered: