-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
concern about a module graph per realm for the web #261
Comments
Modules close over a global object. Our two options for modules and Realms is:
This proposal starts with 1, and compartments may provide 2. I think it's appropriate for us to start with a limited scope here. I'm having trouble understanding the concern about "intertwine"--closing over a global object (i.e. Realm) is just how modules work. It's true that loading modules in Realms will take memory; it also takes memory to load modules in iframes. @domenic could you clarify what you mean here? These feel like tradeoffs that JS developers can make, not red lines affecting the proposal's viability. |
Sure. In web specs and implementations, module maps are tied to WindowOrWorkerGlobalScopes and their associated memory caches/HTTP caches/etc. So you can't introduce a disjoint module map; like other things where realms delegates to their "parent" web global, you need to delegate to the parent web global's module map. |
The change I proposed was that HTML would maintain a module map per Realm. What is the reason that it can't be done? |
I must have missed that. I was told that the idea was to delegate to the parent realm for important host integration points like URL, HTTPS state, and module map. That is a requirement to avoid complicating the spec and implementation. |
The idea is to delegate to the parent realm for basically everything but the module map. It's true that this would consist of spec and implementation changes. Why is it a requirement? |
What if we left |
I'd rather pursue the ideas discussed in #262 (comment), which don't require creating a new type of realm which behaves differently from existing ones in module map behavior. |
a hopefully related question I had regarding this is - if each realm had their own module graph, would that also mean that each realm would have to have their own defined import map as well? or what would the relationship be regarding realms and import maps? |
Realms get a new copy of modules which closes over the global object, but module resolution is unchanged. All Realms share the same module resolution algorithm from the host. On the web, in the current integration proposal, they would all reference the same import map of the page. Configurability of module resolution is under discussion in the context of the Compartments proposal. |
I would like to reiterate over this issue. With the new proposed Callable Boundaries API, we still remain with our goals for the constructed Realms to have a new set of intrinsics in a separate Global object. The non-primitive access is not possible anymore, but I believe if we do share a module graph, evaluated modules might still leak existing global values (not limited to intrinsics) cross realms, not preventing the identity discontinuity problems. A separate module graph is important so each module is evaluated within the Realm importing it. @domenic I wonder if you would now consider this option as your feedback here is really important to advance this proposal. |
My understanding was the plan was to wrap incoming values (from the realm of the module graph) using the same mechanism as is used for the rest of the proposal. |
The wrapping only happens for callable objects being accessed between realms, not within a Realm. The goal remains for the constructed Realm code to be pure and not affected by another Realm. Starting with the following quick example: // my-module-file.js ---------------
globalThis.bar = 'preserved';
export const aGlobal = globalThis;
// inside-code.js ---------------
import { aGlobal } from './my-module-file.js';
export const bar = aGlobal.bar;
// main.js ---------------
import './my-module-file.js';
globalThis.bar = 'modified';
const red = new Realm();
const observed = await red.importValue('./inside-code.js', 'bar'); The goal is for // my-module-file.js ---------------
export const list = ['a', 'b', 'c'];
// inside-code.js ---------------
import { list } from './my-module-file.js';
export const result = list.__proto__ === Array.prototype;
// main.js ---------------
import './my-module-file.js';
const red = new Realm();
const observed = await red.importValue('./inside-code.js', 'result'); In this case, |
Right, but since the module is from another realm (since it's in that other realm's module map, since only HTML-generated realms can have module maps in browsers), the object is from another realm, so the wrapping (or exceptions for non-primitives) makes sense to me here. |
@domenic sharing the object graph in any shape or form is incompatible with this proposal IMO. Let me share few thoughts that I believe will illustrate this better:
Now, those are use-cases and fundamentals that will be completely broken if the module graph must to be shared. For me, the real question is:
From previous conversations with you, the main issue was just a technical challenge, which we (cc @littledan) believe is something that can be solved, and the precedent exists everywhere you have looked at (all environments support this one one of another via context creation). |
Right, you wouldn't share objects; you'd wrap functions and share primitives. So this preserves the isolation of built-ins. Some particular points:
This seems to be because you purposefully only wrapped [[Call]]; if you added wrapping for [[Construct]] it'd work fine then.
You would only be able to get references to modules that the outer realm has already imported. I agree the realm shouldn't be able to mutate its parent's module map! Instead whatever module importing functionality is present is just sugar over the parent having to feed values to the realm manually using the primitive sharing/callable wrapping protocol.
Well, |
The result of a [[Construct]] operation is normally an object, which cannot be wrapped. What would be the point of wrapping [[Construct]] ?
Wouldn't that require the use of "evaluate" to execute code in the inner realm, making it impossible to execute any ES module code in the inner realm? |
@mhofman explained this very well in the comment above. You can't really do new across realms. But also, what about objects? e.g.:
This makes it non-deterministic, and weird in essence. What if the host hasn't load the module yet, but will in the future vs what if it already evaluated it?
I was referencing to All in all, for me, any kind of solution that requires checking the current state of the host's module graph to make determinations about what to do next seems very messy, while the only real reason we are having this discussion is because it seems complex to implement for some folks, while others think it is very doable, with considerable prior art already in place in production systems (V8 has good support this this, which is used by node and android already), while JSCore also provide similar mechanism for iOS, etc. |
Trying page this back in for paths forward. The concern from @domenic is that the module map is best kept on the "principal Realm" only. AFAIU the concrete concern from #261 (comment) is interaction with network and HTTP caches. Do I understand correctly the observable difference from duplicating the module map at the spec level is multiple fetches for the same URL, if imported both from the page Realm and the user Realm? #261 (comment) points out the only functionality provided by duplicating the module map is to instantiate the same module source against a different global -- the user Realm's. A path forward here may be to split HTML's module map into two:
With that, there would be a single fetched and parsed module map shared by all realms on a page, but each user realm would have its own linked and evaluated module map. On the ecma262 side, module records (even synthetic ones like JSON, I guess?) would need refactoring:
Thoughts from both sides? Am I missing any requirements? Edit: The linked and evaluated module map should map unlinked modules to evaluated modules, not URLs. That way it's clear that everything in the linked and evaluated module map is an instantiation of a module script in the fetched and parsed module map. |
@syg Your proposal is consistent with our intentions for the design of a module loader API atop the Compartment proposal. Multiple compartments can safely share a cache that maps module specifier to static module record, which is a representation of a module that has been parsed and produced a shallow static analysis of its imports, exports, and reexports, which can then be linked and initialized in the context of a Compartment, and by extension, a Realm. |
@syg Thanks! This seems good. I have only one thing I need to confirm. IIRC, the parsed code can't really be affected by user-land code present exclusively in realm a or b, right? In this case, what matters most would be connected to the instantiation phase and this solution seems like even good for faster i/o, network matters. I'll bring this for my team to review, but I'm quite positive we can work through this aspect. |
I don't see how it can, JS is always parsed the same regardless of where it runs. |
Thanks for confirming that. I'll confirm this with the proposal champions and come in with a PR in this repo to kick off the respective changes in spec. |
It's not clear in this proposal whether the child realm gets to mutate the parent realm's module map, or just read the unlinked module records from it. If it's the latter, then this proposal seems pretty reasonable; it's approximately something you could already do anyway by having the parent realm fetch the module and post its source text into the child realm for evaluation. If it instead provides the ability to mutate the parent's module map, then that seems more problematic. (One aspect the realms champions might be interested in is how this provides a high-fidelity mutable shared state communications channel, observable by timing how long an |
can you clarify on mutating the parent realm's module map? IIUC it seems that it reads the unlinked module records, but then if the given module is not there it needs to load it, adding the record to the same parent map but only instantiating it for the child realm. If the parent realm loads that same module after that, it also captures it from the unlinked module record, and instantiate the module now for the parent realm. If it prevents the child realm to import modules that are not previously loaded in the parent realm, this would likely become a dealbreaker for us. |
My intention was no mutation, only reading the unlinked module records and linking and evaluating them. Another way to architect this with the same effect instead of splitting the module map into two is to extend module scripts to hold both the unlinked module record and all instantiations. Only the page realm is allowed to mutate the module map and put new module scripts into it, keyed by URL. Any realm is allowed to look up a module script and ask for an instantiation against its own global, which would mutate the module script, but not the map. That might make the "user realms can't mutate the module map" guarantee clearer. |
It's hard for me to understand why this constraint is necessary and why allowing child realms to "mutate the module map" is problematic. This won't meet the needs (use cases realted) we have for this proposal. |
Why is this a problem? I'm imagining something like if a child realm needs to import a module, it'll "RPC" into the page realm to do the fetching, since everything's synchronous anyway. My intention with the no-mutation-from-child-realm guarantee is that the current realm at the time of fetching a module for the first time is always the page realm. It's not that the child realm can't request modules to be fetched. Like @domenic's analogy, the RPC call here is like the child realm asking the page realm to fetch something then pass it over for instantiation. |
If this RPC goes without need for user land code, just underneath, it's ok. This would be an implementation concern then. E.g. child realms looks up a module, it's not there, triggers the fetch from page realm, no instantiation in the page realm, but now it triggers instantiation of the module from the child realm. AFAICT, the mechanisms available from ECMAScript don't allow me to fetch a module without instantiating it. The issue relies on a page-realm-first constraint that forces every module to also be instantiated in the page realm. That exposes the page realm globals and values to code I intend to run only in an instantiated realm.
For user land, it doesn't really matter who is mutating, but it heavily matters where module code gets instantiated. |
Talking to the internal team, I'm trying to illustrate what is being proposed in user land. This first example is good enough: // module.mjs
let x = 0;
export function iter() {
x++;
return x;
};
// program.js
// Preloads module.mjs
import { iter } from './module.mjs';
console.log(iter()); // 1
const r = new Realm();
const iterFromRealm = await r.importValue('./module.mjs', 'iter');
console.log(iterFromRealm()); // 1
console.log(iterFromRealm()); // 2
console.log(iterFromRealm()); // 3
console.log(iter()); // 2 This is what I believe the constraint shows the issue: // module.mjs
let x = 0;
export function iter() {
x++;
return x;
};
// program.js
// SKIP IMPORT
// import { iter } from './module.mjs';
const r = new Realm();
// Throws TypeError, module.mjs was not preloaded in the page realm
const iterFromRealm = await r.importValue('./module.mjs', 'iter'); |
To illustrate more simply, a module as simple as |
Shu and I discussed this offline and managed to clarify some things. Let me lay out my concerns in more detail with allowing the child realm to mutate the module map. To summarize, I think they are solvable with an opt-in, plus an understanding of how this design decision would impact future proposals. The communications channelSharing a module map which is mutable by both sides provides a high-fidelity cross-realm communications channel. By timing how long a dynamic import() takes, you can easily tell whether the module was already in the module map or not. Concretely, let's say the outer realm wants to run two pieces of un-audited code in two separate realms, A and B. And it wants to prevent them from communicating with each other, for whatever reason. The design where both A and B can mutate the shared module map allows such communication trivially: A does import('https://example.com/known-module.mjs?b');
import('https://example.com/known-module.mjs?c');
import('https://example.com/known-module.mjs?e');
// ... and B does const [a, b, c, d, e, ...] = await Promise.all(["a", "b", "c", "d", "e", ...].map(isSet));
async function isSet(bitName) {
const start = Date.now();
await import(`https://example.com/known-module.mjs?${bitName}`);
return Date.now() - start < 100;
} Now, realm boundaries preventing communications channels has never been an important property to me. But from what I understand it is an important property to some in the champion group, such as @erights. So I'd want it to be clear that if we allow sub-realms to mutate the module map, we are setting a precedent that realms provide no barrier to communication, and that future proposals which introduce communication channels (e.g. via mutable objects on shared prototypes or similar) can build on this precedent. Cache poisoning and version skewConsider a web application that sees frequent deploys, e.g. every 10 minutes or something. It also uses lazy loading: when you arrive at Consider the case where this web application runs un-audited code in a realm. It does not hand this realm any capabilities; it only provides it immutable data, and no functions. Perhaps it thinks it is running an image decoder or something. The application thinks that the image decoder cannot interfere with the larger workings of the application. However, with a shared mutable module map, this will not be true. Because calling real.evaluate("runSupposedlyPureButUnAuditedFunction()"); could in fact cause the shared module map to get populated with the current version of And this is bad, because it can cause version skew. In particular, let's say that the image decoder runs before a deploy, but the user actually clicks the button after a deploy. Now, This is a footgun concern, similar to the ones that drove the work on isolating object graphs. It's not clear to the user of the Realms API that, even though they provided no capabilities to the realm, the realm still has the ability to poison the cache. Basically, you have to move your web application to using content-addressable URLs for all its modules, before you can deploy realms in the manner that they're advertised to be used. This could be ameliorated by making this cache-poisoning capability opt-in, via an API such as const realm = new Realm({ allowSharedMutableModuleMap: true }); or perhaps const realm = new Realm({
sharedPotentialImports: [/*... list of URLs or maybe module specifiers...*/]
}); |
This comment has been minimized.
This comment has been minimized.
In a discussion today at the SES meeting we agreed (without objections from those present) that we are fine with the mutation of the module map by a child realm without any opt-in or enumeration of imports. Some of the people present at the meeting includes @erights, @littledan, @caridy, @rwaldron and a recording of the meeting should be publicly available soon. This way, the single module map extended to per-realm instantiation of modules works fine and has no major side effects that would prevent us going ahead with the proposal. We have some examples below that translates some of the side effects: Positional Flakinessconst r = new Realm();
const incr = await import('./module.mjs');
const rIncr = await r.importValue('./module.mjs', 'incr');
// Not a TypeError vs const r = new Realm();
const rIncr = await r.importValue('./module.mjs', 'incr');
const incr = await import('./module.mjs');
// would be a TypeError exception because 1 line reordering Race Conditionsconst r = new Realm();
const pagePromise = import('./module.mjs');
const childPromise = r.importValue('./module.mjs', 'incr');
await Promise.all([childPromise, pagePromise]);
// This might throw a TypeError or not if the host caches ./module.mjs for the parent realm before r.importValue |
To be clear, I'm not fine with a no-opt-in solution. |
Thanks! That's been clear and that's why the champion group took some time to discuss it including the reported discussion from the SES meeting to provide a proper informed decision on the paths ahead for this proposal. |
Great. It's also exciting to hear that we'll no longer have any problems adding cross-realm communication channels to the language, with this precedent! I know that's been a blocker for a lot of proposals before. |
I don't share this same interpretation but I'm welcoming to discuss details about this at the TC39 plenary. |
I note that the current spec draft states
but the proposal of allowing the child realm to mutate the module map by importing modules contradicts this. Perhaps we should remove the bolded phrase from HostInitializeSyntheticRealm? |
are the values from the import map observable in user land? The bolded text refers to values that can be observed, as far as I understand, changes to the module map can be guessed once by network cache and if no I can add a clarification or a note saying these are values observed in user land through bindings or properties, WDYT? |
I'm referring to module map, not import map. Indeed, changes to the module map are observable from userland in the manner outlined in #261 (comment) "the communications channel". |
I guess probably the section on "side effects such as I/O" should also be removed, since that's what |
That's a fair point @domenic, we have always struggled with the fact that |
Sure, so if the idea is that you can only mutate cross-realm resource asynchronously, then we could put that into the host hook definition. |
Side channel and I/O in the RealmI'm quite confused by the above discussion about the module map serving as a side-channel. The real side-channel here seems to be that I don't think anyone is proposing that Realms have a partitioned network cache. Instead, the proposal has long been that Module map organization and mutationI'm OK with the proposed idea of including an additional in-memory cache for the parsed representation of modules before they are imported by separate Realms, but I don't really understand the motivation for it. This would give certain very small additional guarantees about avoiding extra fetches (in the uncommon case where the response's cache control headers are a certain way). As described above, this change does not affect whether there is a timing side-channel. Overall, I'd prefer the simpler form without this additional caching layer, as described in the Realms HTML PR--this seems simpler and accomplishes the same thing in practice. But I'm fine with this caching, since it will have so little effect in practice (similar to the compromise we made about import assertions being part of the module map key). A number of concerns are listed above about "mutation". I have a lot of trouble understanding these. I don't see any real mutation spanning different Realms going on, just a shared cache where modules are pulled from (which will exist on the web either way, in the form of the network cache). I'd prefer to not have any particular opt in/out (as I do not see any use cases, only brokenness), but instead, to just let |
@domenic using a different lenses for this, the callable boundary is not enforcing no mutations across Realms, it means no direct access. From the user point of view, |
Well, the difference is that the outer realm opts in by passing
I agree that the callable boundary does not address the footgun I pointed out above. |
Or maybe you were discussing the point about HostInitializeSyntheticRealm? If the idea is that the callable boundary doesn't need to enforce no cross-realm mutations, then is the implication that disallowing cross-realm mutations is not a goal of the proposal? In that case we could just delete the prohibition on "mutation of values that are shared across different realms within the same host environment" in HostInitializeSyntheticRealm. But, I'm not sure if that's what you're saying... |
Maybe module map is a special case to allow (the mutation of internal state)? |
I'm closing this as it seems resolved. I believe the Stage 3 specs already resolve this, including the respective discussions in the last couple plenaries. Thanks! |
@domenic raised this concern during a conversation today. From what I gathered, the main issue will be to have more than one module graph per window. He think this is problematic due to how intertwine these two concepts are. Also mentioned memory, and allocation of multiple instances of the same module source as another potential issue.
From the proposal perspective, the reason to not reuse the module graph from the incubator realm is mostly because 2 reasons:
Up to this point, reusing module instances from other realms was only possible via the compartments API (another proposal).
The text was updated successfully, but these errors were encountered: