Isolated realms with sync messaging passing #289
My thoughts are that this is a bit weak by itself due to the lack of cross-realm references. However, it does make me wonder whether a sync version of what Puppeteer/Playwright do with JSHandles, plus the ability to structured clone a handle, would work. For example:
const realm = new Realm();
const arrayHandle = realm.evaluateHandle(`[1,2,3]`);
// Push an item into the array in the realm, non-handles are structured cloned
realm.evaluate(`array.push(value)`, { array: arrayHandle, value: 4 });
// We can also ask for a handle to be structured cloned back to us
const arrayClone = arrayHandle.cloneIntoThisRealm();
console.log(arrayClone); // [1,2,3,4]
For callback patterns, handles work naturally. For example, suppose we wanted to expose something like this:
const realm = new Realm();
const realmSetTimeout = realm.createFunction(
function setTimeout(delayHandle, callbackHandle, ...callbackArgumentHandles) {
const delay = delayHandle.cloneIntoThisRealm();
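// Call the host timer explicitly: inside this named function expression,
// a bare `setTimeout` would refer to the function itself. The callback
// comes first and the delay second.
globalThis.setTimeout(() => {
realm.evaluate(`callback(...args)`, {
callback: callbackHandle,
args: callbackArgumentHandles,
});
}, delay);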
}
);
realm.globalThisHandle.setPropertyDescriptor('setTimeout', {
value: realmSetTimeout,
enumerable: true,
configurable: true,
writable: true,
});
A rough API would be something like this:
type StructuredClonable = ...;
class RealmHandle {
// Clone the value from the realm into this Realm using structured clone
cloneIntoThisRealm(): StructuredClonable;
// Object meta operations, same as Reflect.* except
// accept RealmHandles and return RealmHandles
apply(
target: RealmHandle,
thisArgument: RealmHandle | StructuredClonable,
argumentsList: RealmHandle | Array<RealmHandle | StructuredClonable> | StructuredClonable,
): RealmHandle;
construct(...) ...
...
}
class Realm {
// A RealmHandle for the global object
get globalThisHandle(): RealmHandle;
// Evaluate and clone the return value, this is basically a shortcut for
// .evaluateHandle().cloneIntoThisRealm();
evaluate(
code: string,
scope: Record<string, RealmHandle | StructuredClonable>,
): StructuredClonable;
evaluateHandle(
code: string,
scope: Record<string, RealmHandle | StructuredClonable>,
): RealmHandle;
// Creates a function in the Realm that calls the given function with RealmHandles
// for all arguments passed to it
createFunction(
func: (...args: any[]) => StructuredClonable,
scope: Record<string, RealmHandle | StructuredClonable>,
): RealmHandle;
} |
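The handle idea above can be approximated in userland to make the data flow concrete. This is only a rough sketch under loud assumptions: `SketchRealm` and `SketchHandle` are hypothetical names, `new Function` stands in for evaluation inside another realm, and nothing here provides real isolation. It shows only that handles pass by reference while everything else is structured cloned.

```javascript
// Userland mock only: NOT a real realm, no isolation is provided.
// SketchHandle and SketchRealm are hypothetical names for illustration.
class SketchHandle {
  #value;
  constructor(value) {
    this.#value = value;
  }
  // Copy the referenced value back to the caller via structured clone.
  cloneIntoThisRealm() {
    return structuredClone(this.#value);
  }
  // Internal helper for SketchRealm; a real handle would not expose this.
  static unwrap(handle) {
    return handle.#value;
  }
}

class SketchRealm {
  // Handles resolve to the value they reference; anything else is cloned in.
  #resolve(value) {
    return value instanceof SketchHandle
      ? SketchHandle.unwrap(value)
      : structuredClone(value);
  }
  evaluateHandle(code, scope = {}) {
    const names = Object.keys(scope);
    const args = names.map((name) => this.#resolve(scope[name]));
    // new Function stands in for evaluating code inside the other realm.
    const run = new Function(...names, `return (${code});`);
    return new SketchHandle(run(...args));
  }
  // Shortcut for evaluateHandle(...).cloneIntoThisRealm().
  evaluate(code, scope = {}) {
    return this.evaluateHandle(code, scope).cloneIntoThisRealm();
  }
}

const realm = new SketchRealm();
const arrayHandle = realm.evaluateHandle(`[1, 2, 3]`);
// The handle is passed by reference, the number 4 is cloned in.
const newLength = realm.evaluate(`array.push(value)`, { array: arrayHandle, value: 4 });
const arrayClone = arrayHandle.cloneIntoThisRealm();
console.log(newLength, arrayClone);
```

Mutating `arrayClone` afterwards does not affect the array behind `arrayHandle`, which is the property the handle design relies on.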
@domenic I want us to meet at a middle-ground for sure. I asked my teammates to take a look.
This is one of the concerns that requires specific review, I appreciate that you're weighing the options here. |
One of the problems is that we don't have structured cloning in the language. I have a proposal for that https://github.com/Jack-Works/proposal-serializer if you're interested. |
@domenic thanks for putting in the time to do this write-up. We have been debating this with @syg for a few weeks; in fact, we did some prototyping around it to measure perf (I believe that's not a deal breaker, which is good news). Also, we did some homework to see if some of the existing membranes will work with this proposal, and here is where things become complicated (as @leobalter mentioned above). One thing that occurs to me is that maybe we can provide another internal mechanism to help overcome this issue. I honestly don't even know if this is possible, but here is an idea:
realm.eval(`globalThis.foo = { x: 1 }`);
realm.eval(`Array.prototype`) === realm.eval(`Array.prototype`); // to yield `true`
realm.eval(`foo`) === realm.eval(`foo`); // to yield `true`
Basically, what I'm asking is whether the UA can do some ref-tracking across this boundary, so that the incubator realm's ref (in this case an empty plain object, based on your example) can continue to be linked to the corresponding ref from the realm, and that memory is released when the realm doesn't need access to that object anymore. This is clearly a lot different from the already well-established structured cloning algorithm; it would be a variation of it that reuses some references when possible. Since both realms are in the same process, it might be possible. Similarly, we will have to define how that works for Realm.set/call/etc. But the bottom line is that such a mechanism would eliminate the need for any user-land bookkeeping for references that need to be tracked across references, clearing the way for a membrane to support any kind of virtualization. |
I would be interested in seeing the explainer expanded to cover use cases where cross-realm cycles are important. I don't think "making existing membranes work" is a use case. I'd be interested to see things on the same level as the current explainer's "DOM virtualization" or "test frameworks". From what I can tell, no use case mentioned so far requires such cycles. |
I'm writing an expansion of the sandbox use case, along with some thoughts on a possible workaround for @caridy's concerns. I'll post it back here when I have a proper review. |
We have the TC39 plenary and a TAG Review meeting next week, they are too close for us to bring any good conclusion of potential next steps. I can say we are trying to understand better this new proposal and identify what might not work and how it would work for us. Our goal is to find an agreement point. Meanwhile, I'm still going to expand the use cases in the explainer. |
@domenic can you expand a little bit about whether or not the realm will be running in the same process / same agent? or if it must run in a separate process? Or is it that you have intentionally left that part out of the proposal to let the UA to decide about that? I'm asking because that might be another differentiation aspect between the two of them. Can you clarify? |
This proposal has them running in the same process, so that the communication is synchronous. I would of course prefer them to run in separate processes, and for the communication to be asynchronous, per #238, but such a modification does break some of the use cases mentioned in the explainer. This proposal is meant to cover all of the explainer's use cases and as such is synchronous/same process. |
I'm reasonably certain we can reuse the Membrane concept even with IsolatedRealms. Whenever an object crosses the boundary, instead pass a UUID (or anything unique, really). This will require wrapping the For instance, say we had an stored in the realm.eval(`foo`) === realm.eval(`foo`); Here, Using a Membrane, we'd additionally be able to express a cycle without any issues. If |
To ensure the child's |
Unfortunately cross-system cycles using UUID's can leak memory as this issue on the EDIT: Actually it's unclear with the proxy's returned by |
WeakMaps enable cross-membrane cycles to be collected. WeakRefs do not. That was one of the first motivations for splitting the concepts. |
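To illustrate the WeakMap point, here is a minimal sketch of the identity bookkeeping a membrane typically does: one stable proxy per target, tracked in a WeakMap so that the bookkeeping itself does not keep a cycle alive. The `wrap` helper and both map names are hypothetical, the get trap is deliberately simplistic (plain data objects only), and this is not a complete membrane.

```javascript
// Hypothetical helper: one stable proxy per target object.
// WeakMap entries do not keep their keys alive, so a cycle that is only
// reachable through this bookkeeping can still be garbage collected.
const proxyFor = new WeakMap(); // target -> proxy
const knownProxies = new WeakSet(); // proxies created by wrap()

function wrap(value) {
  const isObjectLike =
    value !== null && (typeof value === "object" || typeof value === "function");
  if (!isObjectLike) return value; // primitives cross unchanged
  if (knownProxies.has(value)) return value; // already wrapped
  let proxy = proxyFor.get(value);
  if (!proxy) {
    proxy = new Proxy(value, {
      // Wrap everything read through the proxy so identity stays consistent.
      get(target, key) {
        return wrap(Reflect.get(target, key));
      },
    });
    proxyFor.set(value, proxy);
    knownProxies.add(proxy);
  }
  return proxy;
}

// A two-object cycle: a.peer -> b and b.peer -> a.
const a = { name: "a" };
const b = { name: "b", peer: a };
a.peer = b;

const wrappedA = wrap(a);
console.log(wrappedA === wrap(a)); // same proxy every time
console.log(wrappedA.peer.peer === wrappedA); // the cycle is preserved
```

A UUID table held in an ordinary Map would keep every registered object alive until it is explicitly released, which is where the leak concern raised above comes from.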
Why not both? It looks like with the proposal accepted the kind of realms proposed in this issue could be implemented in userland. So maybe just do it and make programmers widely aware of its advantages. If deemed desirable, they could be included in the language as well as the full version, e.g. as a subclass. |
I had a long conversation with @syg about the features needed for membranes to function. It seems to me and @syg that there are other things we can do to solve it, and we can consider them orthogonal and complementary to what is being proposed here. That, I believe, is a good thing. We will try to put together some material to explore those other options as a separate proposal. For now, my focus is to try to understand all the details of what is being proposed, and its implications in the context of the current realm proposal. |
@domenic, I finally got a chance to look at this in detail, and I can say that I'm on board. This, IMO, is a good compromise. I will continue discussing it with other folks, for now, I have few notes:
|
For the record, we had a sync about this sharableIdentity API at the SES meeting on Wednesday, and it got generally positive feedback. I'm looking forward to @domenic's feedback, and hopefully we can set a working path forward. |
Hi @leobalter, the only email address I have for you no longer works. Assuming you have one for me, please send me yours as well as the recording of the last SES session. Thanks. |
Could Symbols as WeakMap keys provide this shareable identity concept? |
Assuming they can pass through and be returned by The main advantage I see for |
I'm glad to hear that this proposal is being taken seriously, and could work for at least some of the champions! As I said, it could work for me and the constituencies I represent. (Remaining major issues remain, such as #261, but it at least is a large step in the right direction.) I'll caution that @syg and I are still working to understand whether Chrome security is comfortable shipping sync/in-process realms in any form. (It turns out, they do not agree with my statement "This encapsulated-by-default proposal would bring realms onto the same footing as other encapsulation proposals such as trusted types or private fields, and thus make it more congruent with web platform goals.") This may be ameliorated with a name change, e.g. Regarding shared identity things of the sort you discuss, I think exploring such things makes sense as a follow-on. @syg would be the best point of contact there, especially given his work on disjoint object graphs for cross-realm concurrency, which seems pretty related.
I don't have any concrete ideas or opinions here, but I'll point out that |
Few more notes:
|
@annevk how does this alternative proposal look to you? I believe the existing Realms proposal has some ongoing issues that might be mitigated by @domenic's proposal.
|
I see that investing in the symbols as weakmap keys would be a better solution and investment for this. I'll get this sync'ed with @littledan and @caridy. |
It removes all the cross-realm concerns I had. I think saying it's comparable to Trusted Types is overselling it since that would ensure that the code that is run is actually vetted, which is not the case here. It's not a security mechanism, it's a way to run code encapsulated from global state, and if you don't trust the code you're still putting yourself at risk. |
I see this slogan causing a lot of confusion. We need better distinctions. See Realms are an integrity mechanism. They are certainly not a confidentiality or availability mechanism.
I don't trust the code I wrote yesterday. We manage and mitigate risks. We need to do better at that. We never eliminate risks. |
I would point out that Realms, by themselves, are not a confidentiality mechanism. But they are still a practically† necessary part of any JS confidentiality mechanism. I believe, but @erights would need to confirm, that the SES proposal is what is necessary for confidentiality. Availability is covered by neither this Realms proposal nor the SES proposal, but rather would need to be covered by an Agent proposal (which has often been mentioned in passing in the various realms/compartments/ses proposals, but nothing concrete has been proposed thus far). Most hosts already give a way to create agents (e.g. † Strictly speaking SES could be implemented purely with the root realm and compartments, but |
I believe we should avoid unnecessary complexity. The reason I see for adding [Async]Generator functions is just because they are other functions formats. We are not discussing what is available in the new realms, but the channels we need to operate with this Realm. Is there any use case that needs iterators through these channels? As I mentioned somewhere in the past, I'm in full support for module blocks and I still believe this example should not use them. We already have a long time baggage of challenges for this proposal, and I'd rather go through a solution that is compelling enough without another proposal that has its own challenges ahead. As @caridy has mentioned,
I believe these concerns can be mitigated if we use the specifier instead of a module block in the example. I simplified the names and arrow functions to support my own reading of the example.
// connector.js
export default function(fn) {
return v => fn(v + 1);
};
// main.js
let n, send;
const r = new Realm;
await r.connect(
function(sender) {
send = sender; // The original example had the other way around, sender = send
return val => { n = val; };
},
'./connector.js'
);
send(4);
console.log(n); // 5
This example looks better, and the names should be distinct enough to avoid confusion. It still seems to be only one function per connector, along with one new async tick. Can you help me with the steps here, like where the sender function is created, and how the returned I'm assuming we have internals connecting these functions but I'm getting lost every time I try to create a step by step. |
@ljharb I think you're right... if those can be created from syntax, and can work as bridge functions, then I don't see why not exposing them to have a complete API on the Realm. |
FWIW, I'm happy to add these extra bridge functions if the concern is a dealbreaker. |
Yeah, I can see how it is a problem that creating a realm and a communication channel is an async operation in my suggestion. At the moment, I can't think of a solution which both resolves the no-eval issue and is synchronous. I will keep thinking on this issue. |
Perhaps we could have a "script block" that creates unevaluated scripts similar to module blocks, e.g.:
const someScript = script {
console.log("Hello");
}
Also thinking about your
// Synchronous for scripts, this would be like .eval, but wraps the returned
// function for the current realm, so addOne !== the lambda inside the script
const addOne = realm.connectScript(script {
(n) => n + 1;
});
// Asynchronous for modules, this would act like .import, but the default export
// is treated as a function to wrap, the addTwo in this realm is not the same as
// the addTwo within the realm
const addTwo = await realm.connectModule(module {
export default function addTwo(n) {
return n + 2;
}
}); |
Isolated Realm API changes
This comment describes a possible solution for the API of the Realm to work with the new isolation model described in this issue.
API (in TypeScript notation)
declare class Realm {
constructor();
eval(sourceText: string): PrimitiveValueOrCallable;
importBinding(specifier: string, bindingName: string): Promise<PrimitiveValueOrCallable>;
}
Skipping relaxed CSP
In this API review, the
const realm = new Realm();
await realm.importBinding('console-shim', 'default');
const redTrySample = await realm.importBinding('sampler', 'trySample')
// redTrySample can still receive primitives such as symbols, etc
const result = redTrySample(2, 3);
The wrapped functions can receive functions as arguments. This allows the constructed realm to trigger a callback in the incubator realm, without knowing about the incubator realm.
const realm = new Realm();
const redRunTests = await realm.importBinding('testFramework', 'runTests');
function reportResults(...args) {
/* ... manages results from args ... */
}
reportResults.noop = 1;
// The constructed realm receives a new function that would chain to
// reportResults when called with its given arguments.
redRunTests(reportResults);
// The connecting function created inside the realm won't have any access to
// the property 'noop'. It does not receive a structured clone of that function.
The (other) basics
This API explores a modification of the Isolated Realms proposal while trying to preserve some of its goals. It has a similar level of expressivity and isolation. It still disallows direct access between the parent and child realms, but it does not use structured cloning. This is possible through auto-wrapping connected functions. In this API, any action hitting a disallowed completion would throw an exception. The only values that can be transferred are primitives (string, number, boolean, symbol, undefined, null, BigInt). There is a special behavior to wrap callable objects, generally functions. By callable objects, we consider any object with a
const realm = new Realm();
try {
realm.eval("[1, { foo: 'bar' }]");
} catch {
// Throws a TypeError
}
// If you try to get access to a constructed realm's constructor,
// you get a wrapped function, which isn't very useful as it can't return
// object values:
const redArray = realm.eval("Array");
// redArray is only another function that would eventually chain the call to the
// Array constructor in the other realm.
try {
redArray();
} catch (err) {
assert(err.constructor === TypeError);
}
// The wrapped functions are always frozen and do not share properties!
assert(Object.isFrozen(redArray));
assert(redArray.__proto__ === Function.prototype); // not from the other realm's prototype
The wrapped functions allow setting values in the other realm, including Symbols, while providing more API flexibility.
const realm = new Realm();
const mySymbol = Symbol();
const fn = realm.eval(`(function(x) { globalThis.foo = x; })`);
fn(mySymbol); // equivalent to the previous realm.set("foo", "bar");
const result = realm.eval('globalThis.foo');
assert(result === mySymbol);
The wrapped functions also provide some sugar for the previous
const realm = new Realm();
const add = realm.eval(`(x, y) => x + y`);
const result = add(2, 3); // equivalent to the previous realm.call("add", 2, 3);
assert(result === 5);
The avoided fingerprints also remove a need for such things as There are more details in this README file. |
From the developer's perspective, I think the best way is to make the membrane the default and allow opt-out.
(await new Realm().import("./val")).array instanceof Array
// true
Membranes by default can avoid mysterious behaviors for simple use cases.
Opt out if they actually don't need a membrane:
(await new Realm({ membrane: false }).import("./val")).array instanceof Array
// false |
@Jack-Works let's keep the membrane separate, that's not a direct goal of the Realms proposal. At some point we might propose some membranes specific proposal that can complement this proposal. |
@leobalter thanks for the write-up, that's a lot of info to digest. Let me try to provide some high-level thoughts that might help to understand what we have been discussing for a few weeks:
From that, we are of the opinion that the coordination between the two realms can be done via Primitive Values plus Callable Values, and this will allow users to implement their own protocol, hence the proposed API. The implementation seems to be simple enough; the following is a very early draft of the spec for that:
GetWrappedValue ( ReceiverRealm, value )
The GetWrappedValue abstract operation takes arguments ReceiverRealm and value. It performs the following steps:
1. If IsPrimitiveValue(value) is true, return value.
2. If IsCallable(value) is false, throw a TypeError exception.
3. Return ? CreateWrappedCallable(ReceiverRealm, value).
CreateWrappedCallable ( ReceiverRealm, value )
The CreateWrappedCallable abstract operation takes arguments ReceiverRealm and value. It performs the following steps:
1. Assert: IsCallable(value) is true.
2. Let F be a new built-in function object associated to ReceiverRealm as defined in Wrapped Functions.
3. Set F.[[OriginalCallable]] to value.
4. Return F.
Wrapped Functions
A wrapped function is an anonymous built-in function that has a [[OriginalCallable]] internal slot. When a wrapped function F is called with arguments, the following steps are taken:
1. Let foreignCallable be F.[[OriginalCallable]].
2. Assert: IsCallable(foreignCallable) is true.
3. Let foreignRealm be F.[[Realm]].
4. Let currentRealm be the current Realm Record.
5. Let argList be ? CreateListFromArrayLike(arguments).
6. For each element key of argList, do
   a. Let o be argList[key].
   b. Set argList[key] to ? GetWrappedValue(foreignRealm, o).
7. Let value be ? Call(foreignCallable, undefined, argList).
8. Return ? GetWrappedValue(currentRealm, value).
With that in mind, both
It is important to note that a wrapped function does have identity (associated with the realm it belongs to), but there is no way to trace it back to the original callable. We also see some intersection semantics with Records and Tuples, in the sense that those will allow complex structures (without identity) to be shared between the two sides (under the same process). We hope to open the conversation about this option, as it seems to solve the ergonomic issues related to previous proposals from @domenic, @littledan and myself. |
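The draft steps above can be approximated in plain JavaScript to see the filtering behavior. This is a rough sketch, not the spec: both "sides" live in one realm here, so the ReceiverRealm argument is dropped, and `isPrimitive`, `getWrappedValue`, and `createWrappedCallable` are ordinary userland functions named after the draft operations for readability.

```javascript
// Userland approximation of the draft steps. Everything runs in a single
// realm, so this demonstrates the value filtering only, not isolation.
function isPrimitive(value) {
  return value === null || (typeof value !== "object" && typeof value !== "function");
}

// Primitives pass through, callables get wrapped, anything else is rejected.
function getWrappedValue(value) {
  if (isPrimitive(value)) return value;
  if (typeof value !== "function") {
    throw new TypeError("only primitives and callables may cross the boundary");
  }
  return createWrappedCallable(value);
}

// The wrapper filters every argument on the way in and the result on the way out.
function createWrappedCallable(original) {
  const wrapped = (...args) => {
    const wrappedArgs = args.map(getWrappedValue);
    const result = Reflect.apply(original, undefined, wrappedArgs);
    return getWrappedValue(result);
  };
  // Frozen, and none of the original function's own properties are copied.
  return Object.freeze(wrapped);
}

// Primitives in, primitive out.
const add = getWrappedValue((x, y) => x + y);
const sum = add(2, 3);

// A callback argument is itself wrapped before the original sees it.
const callTwice = getWrappedValue((cb) => cb(1) + cb(2));
const total = callTwice((n) => n * 10);

// Returning an object across the boundary is rejected.
const makeArray = getWrappedValue(() => [1, 2]);
let rejected = false;
try {
  makeArray();
} catch (err) {
  rejected = err instanceof TypeError;
}
console.log(sum, total, rejected);
```

Because the wrapper never copies properties from the original, something like the `noop` property in the earlier test-framework example would not be visible on the other side.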
@caridy I believe that what you've described here would be sufficient as a drop-in replacement for the various realm creation mechanisms that eshost uses in each JS engine host. Currently, eshost creates realms with whatever the host provides, as you mentioned, vm in node, iframe in browser, and other engine host specific mechanisms (they're different in each of the major implementations). |
But what about the desirable feature of enabling synchronous execution in a realm without passing code in a string? |
@ByteEater-pl you will have to go the module way, which is not sync, that's what |
Why is there no non-eval synchronous mechanism? (especially considering one could be created trivially, as long as one is willing/able to wait a tick to set it up) |
There is already a suggestion on creating a method to inject code like The I'm good with having a script injection method, but I'd do it as a follow up. Unblocking this proposal will also be helpful to clear out the next steps. |
I'm of the opinion that we can try to add that mechanism to the language in general, not just the Realm proposal. |
What do you mean? |
@ByteEater-pl This is a mechanism not yet existent in the ECMAScript field, maybe through other extension APIs. In my opinion if TC39 wants to explore code injection that is different than modules, we should have a larger discussion that is out of the bounds of the Realms proposal. There is more to discuss how we want to inject code and what it means in the language and hosts integration. This should not block this proposal as the main goals are still resolved with the current form. |
But at least as a rough sketch, what do you mean by code injection in a broader context? Injection into what? Something other than Realms? They are the abstraction present in the spec, and both new language features and host integrations are in each case defined in terms of them. Do you believe there are scenarios for which some other abstraction would need to be added, bypassing Realms, and can such vision at present be made tangible enough to warrant choosing that alternative direction as opposed to first trying to support those scenarios by adding features to Realms (possibly even with this very proposal)? Or maybe I'm missing your point, entirely or partially. If so, I'm sorry as a non-native English speaker for being possibly more demanding of your patience than I realized. |
@leobalter I believe this can be closed now since it is now codified as part of the callable boundary effort. I also want to thank @domenic for suggesting these changes, it turned out great IMO. |
Hi realms champions,
@syg and I have been considering a modification to the current realms proposal which trades some expressivity, to give better isolation guarantees. Essentially, instead of allowing direct bidirectional access between the parent realm and the constructed realm, all such communication would go through structured cloning. This ensures that the child realm never gets access to objects from the parent realm, thus making "sandbox escapes" such as those in #277 or nodejs/node#15673 impossible by construction.
Sample code and API
We don't have strong feelings on the API for this; we'd like it to be as ergonomic as possible. But here is an initial idea.
For pulling values out of the constructed realm, into the parent realm: introduce realm.eval().
For getting values into the constructed realm, from the parent realm, a bare minimum might look like this:
but you could imagine something slightly more complicated, and more useful, such as
(Compared to async realm boundaries on the web, this solves similar use cases to webRealm.postMessage().)
Finally, for pushing values out of the constructed realm into the parent realm, you'd need something like this:
(Compared to async realm boundaries on the web, this solves similar use cases to webRealm.onmessage.)
And, of course, we'd remove realm.globalThis.
.Use case analysis
This proposal is arguably better than the current one for many sandboxing use cases. In particular, for cases such as templating or computation where the goal is to have a (conceptually) pure function execute inside the realm, this architecture is ideal, especially in how it automatically prevents "impurities" from cross-realm contamination. In such cases, the values passed are often primitives, or if not, they're within the realm of structured clone: plain objects, arrays, maybe some Maps and Sets and Errors and Dates and typed arrays/ArrayBuffers.
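As a small illustration of what "within the realm of structured clone" means in practice, the sketch below uses the host-provided `structuredClone()` global (available in current browsers and recent Node.js versions) as a stand-in for the boundary. The `message` object is made up for the example; a real isolated realm would apply the clone inside its own API rather than in user code.

```javascript
// structuredClone() stands in for the realm boundary in this sketch.
const message = {
  id: 7,
  tags: new Set(["a", "b"]),
  when: new Date(0),
  nested: { values: [1, 2, 3] },
};

const received = structuredClone(message);

// The receiver gets an equivalent but fully separate object graph.
console.log(received !== message && received.nested !== message.nested);
console.log(received.tags.has("a"), received.when.getTime());

// Primitives are trivially cloneable and come out unchanged.
console.log(structuredClone("hello") === "hello");

// Functions are not cloneable, so object capabilities cannot leak across.
let functionRejected = false;
try {
  structuredClone({ callback() {} });
} catch (err) {
  functionRejected = err.name === "DataCloneError";
}
console.log(functionRejected);
```

Mutating `received` has no effect on `message`, which is the "no shared objects" property the proposal relies on.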
For cases such as a virtualized environment, it requires more work, but probably on about the same level as membrane-based approaches. That is, to perform operations inside the realm while interfacing with a same-realm object API, you would have to create proxies (either literal Proxys or just wrappers) which perform the appropriate calls to realm.eval() and realm.call(). And similarly for the reverse: if code inside the realm wants to operate on an inside-realm object while really doing work in the outside realm, the outside realm would need to do some setup, using realm.eval() to inject some proxies which call globalThis.callParent(). (Probably that setup code would then also do realm.eval("delete globalThis.callParent") at the end.) This is equivalent to what is being done today in the AMP WorkerDOM example that the explainer cites, but by using synchronous realm.call() etc. instead of asynchronous worker.postMessage(), it would overcome the challenges you discuss there.
Other use cases like running tests in the realm fall in between. You'd need to inject a small shim into the realm which provides globals that the test library depends on (such as console), proxied to the creator realm. But then you'd just run the test library inside the global. I.e. instead of the explainer's current sample code, you'd write
This also gives you a stronger guarantee that tests don't mess with the test framework, or with the outer realm, or with other tests, all of which are possible in the current explainer's sample code.
This proposal does lose some expressivity though. In particular, it is not able to create reference cycles between cross-realm objects. Because all communication is via cloned messages, there's no way to communicate to the garbage collector that an object in the outer realm depends on an object in the inner realm, and vice versa, so that the cycle can take part in liveness detection. To some extent this is a good thing, as cycles are an easy way to leak an entire realm. But from what I understand it does cut off some use cases that go beyond the ones mentioned in the current realms explainer.
Performance
Adding a structured clone step for all boundary-crossing operations could come at a performance cost. But, less so than you'd imagine.
In particular, since primitives are trivially cloneable, any operation which returns them would suffer virtually no overhead vs. the current realms proposal, when communicating across the boundaries. This can account for a large number of use cases: e.g., most computation use cases, or the test framework use case (where it's just passing console.log strings across), or many of the interesting virtualization cases. Other cases will be covered by small objects or arrays, for which the structured clone overhead is quite small (less than JSON serialization and parsing). It's only the case of needing to return a large, nested object graph where there might be a noticeable performance disadvantage.
It's also worth noting that although this proposal does have a lower theoretical performance ceiling than the current realms proposal, it's probably comparable to the current realms proposal plus the associated membrane machinery that's needed to preserve encapsulation. There might be interesting tradeoffs in the large nested object graph case. There, structured cloning across the boundary means a larger up-front cost, but after that initial cost is paid, subsequent accesses throughout the large object graph are fast and well-optimized. Whereas membrane use means every access throughout the wrapped object graph incurs membrane-related overhead.
Finally, I haven't thought much about this, but you could probably get ultimate performance™ by passing in a SharedArrayBuffer and doing all communication through that.
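For what the SharedArrayBuffer idea might look like, here is a minimal single-threaded sketch. It assumes the buffer could be handed to the realm by reference rather than copied, which nothing in this thread specifies; `outerView` and `innerView` are hypothetical stand-ins for the views held on each side of the boundary.

```javascript
// Two views over the same shared memory: no copy happens on read or write.
const shared = new SharedArrayBuffer(8); // room for two 32-bit slots
const outerView = new Int32Array(shared); // held by the parent realm
const innerView = new Int32Array(shared); // stands in for the realm's view

// The parent writes a request into slot 0.
Atomics.store(outerView, 0, 42);

// The realm reads it and bumps a counter in slot 1.
const seenInside = Atomics.load(innerView, 0);
Atomics.add(innerView, 1, 1);

console.log(seenInside, Atomics.load(outerView, 1));
```

Only raw numbers fit in such a buffer, so any structured data would need an encoding layered on top.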
Conclusion
I'm optimistic that this proposal removes the most dangerous feature of realms, which is that they advertise themselves as an encapsulation mechanism, but it is extremely easy to shoot oneself in the foot and break encapsulation. This encapsulated-by-default proposal would bring realms onto the same footing as other encapsulation proposals such as trusted types or private fields, and thus make it more congruent with web platform goals.
There still remains a danger with people over-using realms when they need security or performance isolation, beyond just encapsulation. This still weighs heavily on me, and its conflict with the direction the web is going (per #238) makes me still prefer not providing a realms API at all, in order to avoid such abuse. But I recognize there are cases where synchronous access to another computation environment is valuable, and I think if we curtailed the footgun-by-default nature of realms by prohibiting direct cross-realm object access, I could make peace with the proposal.
I look forward to hearing your thoughts, and hope we can meet on this "middle ground" between no realms on the web, and the current proposal.