-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal for single-mode packages with optional fallbacks for older versions of node #49450
Comments
Thanks for posting this. I can see this potentially working as a solution. A few questions:
To cover the case where a user wants to use ESM in I have several issues with the approach of overloading
I don’t think there’s a need to couple this proposal to the fight for enabling extension searching. Just define a new field for the ESM entry point. If/when extension searching gets enabled in ESM mode, a new PR can propose removing the new ESM entry point field in favor of |
I don't agree; there's nothing inherently distasteful to me about this. One mitigation is making the ESM and CJS module caches shared - ie, once an extensionless specifier has a value in either module format, that's the value provided to the other. It kind of sounds like you're proposing using C++ to make otherwise-async JS code be sync. Is that accurate? As far as your case 2 suggestion - a package shipping Hopefully we can all agree that "a single package can be published that works the same on both ESM-supporting node and older node, whether it's |
As far as having a separate directory from which deep imports are resolve (a "deep root", as it were), I think that's a problem that CJS has as well, and I'd greatly prefer that node solve it for CJS and ESM just inherit that solution, rather than punishing CJS by depriving it of that kind of solution. |
I don’t have an opinion on this, other than to suggest that maybe “distasteful” might not have been the best word to choose 😄But I think it’s worth reviewing the long threads in the import file specifier resolution proposal repo that concerned how to handle the same specifier (e.g. the package name) resolving to different files in CommonJS versus ESM:
|
Yes and no, respectively.
Yes - I didn't explicitly mention it, but that's implicit in the "cjs loader and mjs loader should resolve the same specifier to the same identity" assertion I added at the end. It works trivially - the
Because I don't need to do so. (Less is more and so on) However were we to go with this and also finalize removing extension resolution on the main entrypoint, I would be forced to do so to still allow nodejs/modules#2 to happen, I believe.
This is purely a matter of organization and personal preference. I don't really need to or mean to pursue it in either direction here, since it's noncritical. Personally, TS has long supported both side-by-side configurations and separate entrypoints. Supporting both isn't bad, and a cjs/esm-wide The only thing I'm asserting here is that within the same node runtime, the same specifier should map to the same module (regardless of if the source is a cjs or esm module). That's technically not required for a synchronous For example, taking my statement that specifiers within
Just like how they already need to know where
And
Not so. I could merge it now, and not upstream it until we're happy with feedback one way or the other. I don't see us up-streaming regularly. And again, while that's incredibly useful for supporting older versions of node, it's non-critical for compatibility within the same node runtime, which is the primary focus here. It's just stating how it's very much still possible to support older versions of node without have a package that ever exposes two identities at runtime.
I don't need to do anything one way or the other here - it's just a matter of stating that even with this, no matter how the resolution argument pans out, there's a way forward for the back compat story. I don't need to work on either direction here (though my preferences are likely known, but since I missed the out-of-band meeting where the flag was decided on, maybe not - extension searching 4 lyfe).
Anyone who relies on prototype chains is going to have serious issues with packages that create them twice - once in
Yes, but that JS code is just JS code running within the loader itself, no user code is included, except potentially user-defined loaders, which, ofc, we'd very much like to isolate anyway (I mentioned those caveats in the OP and won't go into them again here). The
Yep. That's why I said it's "effectively free" - most all designs support it like that, except for some designs which remove extension resolution wholesale from the resolver (and those which do not aughta have a separate entrypoint field for esm exactly this reason).
I don't think I've explicitly called anything similar to that out here, except maybe how supporting older nodes comes effectively for free with extension priorities on the
And at the time I agreed - except I've actually done research and implementation work in the intervening time to find a better solution is quite palatable without the downsides of having the same specifier resolve to two separate identities. We've been saying that everything's subject to change for a reason ❤️
I 'd call it synonymous with "problematic" with a hint of opinion on the matter, which is IMO more accurate than just saying "problematic", no? Is it really the lack of expounding on why it's distasteful in the OP the real issue, or is the word "distasteful" itself distasteful? Unrelated to all those responses: I'm pretty sure that if |
So besides splitting this apart from the extension searching debate, there’s another way we can split this up: into Then once that’s merged in, you could add the code for how to handle bare specifiers that could resolve to either ESM or CommonJS. (In the current implementation they’re always either one or the other, that’s why I’m thinking that the dual part could be a second PR.) I don’t want to wait months to get a solution upstreamed for dual packages, so I would encourage you to use a second field to define the ESM entry point. We can always support Re my comment on “distasteful,” I was being tongue-in-cheek (sorry for the pun). I just meant that I was assuming @ljharb responded as he did because you chose such a forceful word. I mean, he surely would’ve responded anyway, but you made it even more likely 😄 Anyway what do you think of this phased approach? |
Just thought of one more thing you’ll need a good answer for: how does this work with dependencies? Both direct and dependencies of dependencies. For example, I just pushed jashkenas/coffeescript#5177, a PR for creating an ESM version of the CoffeeScript compiler for use in ESM-capable browsers. (Yes, I’m very proud of myself.) It’s an ESM version of only the existing CoffeeScript browser compiler, which itself has a more limited API than the Node API; the Node API has an option to enable transpilation via Babel, for example, while the browser API does not. Or looked at another way, my package’s CommonJS exports and ESM exports are different. Maybe this isn’t all that common of a case, as most people will transpile to produce the secondary output format, but it’s not inconceivable; it might very well be exactly what I do to add ESM support to the Node CoffeeScript package, when Node’s ESM implementation is stable. So say I did just that, and published So that can at least be worked around without too much pain, though it would be annoying. But what about dependencies of dependencies? Like, say, Express, which has 30 dependencies. What if one of its dependencies differs in its CommonJS and ESM exports? Would the ESM version even be loaded at all in Node 12, if the top-level Express is still a CommonJS package? What if Express itself gets updated to be dual; would then the ESM versions of Express’ dependencies be loaded as ESM if possible? What if any of them have different CommonJS/ESM APIs? In short, how do we keep Express from behaving differently in Node 12 as Node 10, if any number of its dependencies might change under its feet because they suddenly are exporting different ESM files than the CommonJS files that Express is expecting? This might require that Express’s authors updated its dependencies to their latest versions without testing that Express still worked in Node 12, which is unlikely for a project as high-profile as Express but surely will happen for at least some commonly-used packages. Edit: I know I ask others to always try to provide answers for problems they pose, and I just realized I didn’t do so for the Express case, because I can’t think of any 😄though one could argue that it’s unavoidable and still preferable to the other PR, which has some of the same issues. |
You have two different API surfaces. Ship two different packages.
If you have two different APIs, ship two different packages. If your dependent is reexposing you it would also be reexposing your bad practice. Two packages, one of which with an appropriate advisory
A conditional require based on The gist is this: If you have a cjs package, and you follow semver, your upgrade path needs to look like this (to provide the same API to all consumers at a given version):
IMO, there is no good reason a package should present two different APIs based on node version or caller module kind. Doing so feels like a violation of semver - if not explicit, in spirit at least. If different APIs are for some reason needed, the solution is always trivial - just publish an additional package with the differing APIs.
When they published an esm version with an incompatible API, that was a semver break and the version number should indicate that. It should not match the dependency version until it's dependents have updated to handle the differing API (which may include conditional behaviors based on the shape presented, if compat with multiple APIs is needed).
Anything that ships esm should be loaded as esm in runtimes that support esm.
As much of the dependency tree as is loadable as esm is loaded as esm. Only things which only ship cjs would be loaded as cjs. On older node, only cjs would be loaded. If a dependency no longer shipped cjs, it would, ofc, be a semver major break that prevents it and its dependents from running on older versions of node.
A semver compatible dependency should provide a semver compatible API when a compatible version is selected - that includes consistent APIs across node versions. If it does not, it has violated the expectations of semver and needs to be handled as a misbehaving dependency (ie, by locking down it's versions more tightly and scrutinizing changes in it), same as if it had suddenly deleted or incompatibly modified any other important API of the library. |
So this is easy to say in the abstract, but reality is more complicated. Let me use the I want to make my package easy to use in Node 12 in an ESM environment. It’s already usable via import CoffeeScript from './lib/coffeescript/index.js';
const { compile } = CoffeeScript;
export { compile } I tried this out by creating a new folder, running import { compile } from 'coffeescript';
console.log(compile('alert "hello!"', {bare: true})); And it works as expected. Done, right? That one Anyway so as a developer I think I’m done; I’ve exposed my public API in ESM. Except there’s no concept of a public API in CommonJS, at least not programmatically. If you run In other words, it might not be obvious to developers that the CommonJS and ESM versions of their packages’ APIs need to match. Some developers might well just decide to use the default export and stop there. There’s no telling what developers will do, and you’ll need to engage in an campaign to educate them on doing dual packages right to avoid these issues. There’s also the problem that even if the APIs match, by definition they’re not running the same code. For packages where the ESM is transpiled into CommonJS, you’re basically trusting that Babel (or whatever transpiler) is doing a faithful job, and that nothing in the user’s package falls into one of the edge cases where transpilation isn’t quite the same as native execution (like the little gotchas of subtle differences between Now of course, most of the time there won’t be an issue. (To be really cynical, I could argue that that makes this even more of an issue, as then it’s more likely to be a surprise; but I won’t go there. 😄) So the question is whether this “different code in different versions of Node” issue is worse than the “different code whether the package is used in CommonJS versus in ESM” of the other PR. I’m not sure how to answer that question; one factor surely should be how common problems are likely to occur in one versus the other, and what those problems are likely to be. |
what? how is this any different from esm? i can
this is a problem with your package, not this proposal. |
Y'all should fix that. People will come to depend on those privates if you expose them, and it'll then be a semver break to remove them. Not including them in your esm API is likewise semver breaking, since something proving the same "version" of your API isn't actually API compatible. And I'm not talking in the abstract here - we deal with these discrepancies every day over at DefinitielyTyped. People moving from, eg, cjs to esm, do notice the small changes to an API based on environment, and do classify them as breaks, even if the library maintainer didn't. Having a consistent API across all access patterns is expected.
Different code within the same runtime is worse. Significantly. That's a forever cost - a legacy you can't be rid of. That support of that behavior is there forever. People don't transition node versions often, and once you've crossed the boundary, that's it, you're done, you can almost never have a similar issue again (and the issue is unlikely enough to begin with for well-behaved packages), short of repeated downgrade/upgrade cycles across the esm boundary (which would imply we've done something very wrong that necessitated repeated node downgrades).
Usually and ideally in unobservable minutia that very few reasonably relies on. Pulling a name out of a namespace and calling it is pretty universal. Reflective patterns are more rare and by nature require inspection of the object being reflected upon and if you have consumers whom you value and you value reflective stability in the minute details of your module's construction (why), then you many need to include that in your API contract and version appropriately. Generally, it's something pretty much no one needs to care about. |
The I don’t need to be explained the value of a consistent API. I understand the issues. The point I’m trying to make is that in the examples I describe above, I can imagine a reasonable developer making the same mistakes that I’m tempted to make here. My point is that these mistakes might be common, and that’s an important thing to consider. The harder it is to successfully make workable dual packages, the slower the migration to ESM will be. One way we can limit the risk of developers screwing up their migrations is to have a CommonJS dependency only receive CommonJS versions of its dependencies. In current Node, I think, if my project uses
It’s only a cost for as long as a package author wants to maintain both CommonJS and ESM versions of their package, as there’s no “different code within the same runtime” if a package contains only ESM sources (or only CommonJS sources, for that matter). If we backport ESM to Node 10 and 8, then a package author could stop publishing CommonJS versions of their packages by next month, when Node 6 is end-of-life. If we backport ESM to Node 10, a package author could drop CommonJS by the end of this year. The transition period could be very short. (If it’s only 12+, then packages will need to dual-publish until 12 is end-of-life in April 2021.) |
I don’t think it’s reasonable to expect that node’s EOL has any impact on the ecosystem dropping support; most of my packages support down to 0.6 and will do so for the foreseeable future, and plenty of people still use node 4 in production. We need a solution that works for a long transition period; some group of people will need/want to dual-publish for a very long time regardless of node’s support period. |
One way we can limit the risk of developers screwing up their migrations is to have a old version dependency only receive old versions versions of its dependencies. In current Node, I think, if my project uses lodash@4 and request@2, and then Request uses lodash@1, both my project and Request each get the separate versions of Lodash that each expects. The same can happen with old versions/new versions, where my project would get new Lodash and the old Request dependency would get the old Lodash. Then at least I don’t need to worry about poorly migrated packages that are dependencies of dependences; my scope of concern is limited to my project’s direct dependencies, which I have more control over. Nothing about that situation cared about module kinds and it's why semver is used. The solution is already in wide use. Yes, people can mess it up, but the workarounds, too, already exist as well (version pinning). Nothing about that is a new problem. It's identical to major breaking changes in packages today. |
My recommendation based on meeting discussions today, can we zero-in on the sync/async flattening concept and make into a meeting? Edit: @weswigham I think there would be interest if you feel an offline meeting first is something useful towards having the PR. |
Per 2019-04-10 meeting, next steps:
|
If I understood correctly this proposal, I think making Having We should move toward a place where Live bindings also might confuse |
Very few modules utilize live bindings (in cjs or ESM, since it’s simulable in cjs too), so i don’t think that’s really something that needs considering, |
if we return the namespace object (which is what is proposed) then live bindings just work normally, so i don't think its a huge issue. |
so, are you saying: const {a, b, c} = require('./live-bindings.mjs'); will have Anyway, live bindings were the very last point of my comment, if these work anyway, other thoughts still apply. |
@WebReflection a, b, and c bindings you created would never change. only repeatedly accessing |
@devsnek then live bindings don't really work, 'cause importing those instead, would carry new values once changed, right? I still think don't see why Moving to ESM should implicitly mean migrating from CJS, in the long term, unless impossible. Importing |
@WebReflection it's the same as
right... this allows modules to migrate to esm regardless of how they are consumed. |
that's what I meant, yes.
by keeping |
@WebReflection I think live bindings which are often glossed are a much bigger headache in the equation — that is with me staying neutral to this discussion, migration of pseudo-esm code is a very layered problem that maybe needs to be worked out in a separate issue (#314). |
require(esm)@weswigham and I had a brief call to discuss The main issue we focussed on was understanding the impact of In practice, the issue normally arises when a CJS consumer immediately uses a The conclusion was that package authors will be responsible for judging whether they are likely to have transitive CJS consumers that would be broken when they make a release that first introduces irreversible async behaviour into one of their modules. Post-call thoughtsMy main question to the group is: do we think this will become an ecosystem compatibility footgun? I will offer my personal experience here, of operating a reasonably large AMD module ecosystem (~10k packages) that permits both async module evaluation (equivalent to top-level await) with both synchronous As a result, I am concerned that if we introduce this hazard on an ecosystem the size of npm, we'll see a large number of breaks over a long period of time. |
As far as I understand, synchronous |
False. All it does is make the main thread wait on the results, which it would be doing async anyway. Multiple threads can still be compiling simultaneously while the main thread waits for all of them. |
It is quite literally only the main thread that is prevented from executing random stuff while the syncified promise chain is executing - the syncified chain can run whatever parallel/concurrent operations it wants, which the main thread will wait for completion on at the sync point. |
To respond to @robpalme: The "hazard" as described already exists in cjs today, especially given |
How do you queue up these multiple pieces of work to be run in parallel, if all the |
Why would one assume that work can be queued up? There's potentially arbitrary code executing between those Besides, it's moreso if your |
I share your impression that CJS does not lend itself to batching things up. This makes it hard for me to see how you could make use of parallelism/off-main-thread work. |
@littledan he's saying dependencies of things you require can be batched, not the multiple things you require. |
Responding to @weswigham It's true that in CJS you can write an async-IIFE that will assign exports in a later tick. I suspect most people who consider that pattern will understand they are creating a dangling promise hazard that may unleash zalgo. For top-level await in ESM, it's more reasonable to use it care-free because it's an endorsed pattern with no dangling promise. So I do not think the two case are equivalent. |
While I think TLA has important usecases in the REPL context and at application entrypoints, even a TLA that blocks module graph execution introduces sequencing hazards and potential deadlocks in the context of dynamic import - one of the TLA proposals tries to hide the issues by making the "common" case of a single closed module graph execute without issue, but it still does introduce hazards. I'm just arguing that the difference between those hazards and existing hazards isn't that much, and that the benefit of allowing the vast majority of existing modules to migrate without any kind of penalty on the ecosystem (as they do not rely on anything async during execution time, as doing so today, as you said, would be regarded with great suspicion) is totally worth it. Especially since it's only, at that time, "legacy" cjs callers who'd have to "deal with" such a hazard, anyway (and workarounds like an await-able namespace symbol can be provided). It's a much smaller hazard than, say, your entire package ecosystem crumbling around you as they outright drop support for cjs, IMO. |
Things are not so black and white and there are lots of ways to continue to have legacy support, this to me feels like a non-sequiter Take a look at this repo where i've attempted to explore one method for moving a module from CJS -> ESM in a Semver-Major without requiring interop or overloaded specifiers CJS: https://github.com/mylesborins/node-osc In the CJS implementation there is a deep import This works, does not leave the "ecosystem crumbling". While it may not have the same ease of use and automagic loading, it does not introduce the hazards of either dual mode or |
Needing to know the format of the thing you're importing/requiring in order to consume it, however, is a pretty big hazard to some of us (altho not, obviously, to all of us). For every single (extensionless, to be fair) path, i think I should be able to import or require it without having to know what format it's authored in - whether it's CJS, ESM, wasm, etc. |
There are no constraints there. You can publish an entirely different package with a semver major bump. There's nothing you can't do. It's about making a backwards compatible (aka semver minor) migration path. |
This implements the ability to use require on .mjs files, loaded via the esm loader, using the same tradeoffs that top level await makes in esm itself. What this means: If possible, all execution and evaluation is done synchronously, via immediately unwrapping the execution's component promises. This means that any and all existing code should have no observable change in behavior, as there exist no asynchronous modules as of yet. The catch is that once a module which requires asynchronous execution is used, it must yield to the event loop to perform that execution, which, in turn, can allow other code to execute before the continuation after the async action, which is observable to callers of the now asynchronous module. If this matters to your callers, this means making your module execution asynchronous could be considered a breaking change to your library, however in practice, it will not matter for most callers. Moreover, as the ecosystem exists today, there are zero asynchronously executing modules, and so until there are, there are no downsides to this approach at all, as no execution is changed from what one would expect today (excepting, ofc, that it's no longer an error to require("./foo.mjs"). Ref: nodejs/modules#308 Ref: https://github.com/nodejs/modules/issues/299 Ref: nodejs/modules#454
@weswigham can we close this particular proposal? |
Nope. While conceptually I don't think anything was wrong with the original proposal (indeed, I don't think there were any objections on anything other than technical grounds), it seemed like some people would not be convinced of the implementation (or speculated changes thereupon) until they could see how it actually plays with TLA in practice. Myles looks to be merging an unflagged TLA implementation in the (near?) future (which I've been waiting for); once that's in, a revised implementation that handles TLA should be doable (well, no less than 3 seperate implementations that each handle TLA differently are, anyway). I'm hoping that showing the interop in practice should assuage some concerns. |
Perhaps we can close this thread and you can open a new one that takes into account the TLA that's merged in? |
Nothing in the OP is out of date - it would be a nearly verbatim repost (as only technical implementation details are changing). There's no reason to. |
This is a counter-proposal to nodejs/modules#273.
A bunch of us have brought this up during informal dual-mode discussions before, but the very concept of "dual mode" packages which execute different code for cjs vs esm callers is distasteful. It pollutes a single identifier (the package name or specifier) with two distinct and potentially irreconcilable package identities (which themselves potentially contain duplicated identities and dependencies). From anything other than the perspective of the registry, it's effectively shipping two separate packages for a single runtime - one for esm callers and one for cjs callers.
The original desire for so-called "dual mode" packages derives from two uses:
cjs
consumers in the newest versions of node.In this proposal, I will tackle the two issues separately.
First, case 1:
In the current implementation, a esm-authored package cannot be
require
'd. This means that you cut support for all cjs consumers when you migrate to esm. This kind of hard cut is, IMO, obviously undesirable, and both the "dual-mode" proposal and this proposal seek to remedy this. In the "dual-mode" proposal, this is solved by shipping seperate cjs code alongside the esm code, which is loaded instead of the esm code. In this proposal, the esm code itself is loaded. This means that when a package specifies that it has an entrypoint that is esm, it will only ever be loaded as esm. The astute in the crowd would note that while yes, that's all well and good, the cjs resolver is synchronous, while we've specified that the esm resolver is asynchronous - a seemingly irreconcilable difference. I arrive to tell you something: this is not so. An esm-based require may be executed async, but appear to be synchronous to the cjs caller - similarly to howchild_process.execSync
works today (and, in fact, using similar machinery). This synchronization only affects instantiation and resolution - the execution phase remains untouched, so if at some point in the future top-level await becomes a thing, depending on variant, either therequire
can conditionally return a promise (if TLA is supposed to be blocking) or happily return the exports object while the module is still asynchronously executing. The only other concern would be the observable affects on user-defined loaders, which, if we follow through on that design with out-of-process (or at least out-of-context) loaders (which are very desirable from an isolation perspective), the solution there, likewise, is simply using an apparently synchronous execution of the async loader, just as with the builtin loader. In-process loaders can also be syncified (and in fact are in my current implementation), but care must be taken to not have a user loader depend on a task which is meant to resolve in the main "thread" of execution (since it will not receive an opportunity to execute, as only the child event loop will be turned) - this means relying on a promise made before the loader was called is a no-go. This shouldn't be a problem (the builtin loader, despite being written to allow async actions, is actually fully synchronous in nature), and, again, is fully mitigated by having loaders in a new context (where they cannot possibly directly depend on such a task).By allowing
esm
to berequire
'd in cjs, we can close the difference between the esm and cjs resolvers. If the cjs resolver can resolve a file, IMO, in the esm resolver it should resolve to the same thing or issue an error - it should not be possible to get two different identities for the same specifier based on the resolver used (this implies that the esm resolver we use should be a subset of or identical to the cjs resolver). This means implementing knowledge of"type": "module"
and the like into the cjs resolver, since afaik, it's not already there (though it does already reserve .mjs).And case 2:
With the above, a package can only have one runtime execution associated with it, however that only requires shipping esm. To support older versions of node, a cjs version of a package must be shipped. An answer falls out naturally from extension priorities: Older versions of
node
do not search for the.mjs
extension. Making your entrypoint a.mjs
/.js
pair (with.js
ascjs
), the.mjs
will be found by versions of node which support esm, while the.js
will be found by older versions of node. This is ofc constrained by what older versions of node already support - there's no "work to be done" to improve this, just work to ensure that it continues to work and that it does not become blocked by some other modification. naturally, this means in a package with a cjs fallback for an older node, the entrypoint cannot be a.js
esm
file - however otheresm
in the project can be (by placing apackage.json
with{"type": "module"}
in whichever project subfolder has theesm
source beyond the entrypoint). This, too, has been alluded to by many people in many other issues, yet has not yet been written down.TL;DR:
require
ofesm
can be done. Doing so allows anesm
to actually replace acjs
one, and alleviates the need for a "dual mode" resolver system. This also greatly aids the migration story (especially when paired with the dynamic modules spec, which allow for more than just thedefault
member to overlap during migration).esm
(andcjs
) resolver to be a proper subset of the cjs resolver and resolve to the same cache entry or an error for each specifier which can be represented in both (cjs obviously does not respect URLs in any way, so any URL resolution remains solely the domain of the esm loader)cjs
fallbacks for older versions of node come free with the extension searching that older versions of node already do (so long as we allow a higher priority entrypoint to be found in newer versions of node, this method of backcompat comes "free"). Without that, an alternate esm entrypoint is useful. Neither is tied to this, but it's worth mentioning that it's still easily doable.The text was updated successfully, but these errors were encountered: