-
Notifications
You must be signed in to change notification settings - Fork 29.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect require vs. import based on the entry point, not package.json #39353
Comments
Related discussion: #37857, nodejs/modules#300 (comment), nodejs/modules#296.
I don't thing that's a reasonable solution: it would introduce a performance penalty on startup time if node has to parse the file for
You could use loaders to customize which format / parse goal for a given file: https://nodejs.org/docs/latest/api/esm.html#esm_loaders. It's not stable yet though.
Legacy code is exactly why changing the default behaviour of Node.js is a very difficult thing to do, it would be a problem if older code base stopped working. Note that Node.js looks for |
I'm suggesting changing the default behaviour, so if Node should allow mixed import/require when there's a package.json, then perfect: that is not at issue, I'm all for that. But Node should also do the right thing without a package.json.
Correct. And that would be very little time because it only has to scan one file, plus (and this is the big one): if you use a package.json, Node doesn't need to scan anything, it already knows which mode to run in, so it would do exactly what it's already doing right now, and no one following current necessary practices will be impacted. Only folks who invoke Node without a package.json would run into this performance hit, although I struggle to call it that: it'd be insignificant compared to the rest of Node's initialisation.
While I see a lot of potential for loaders, they are antithetical to why I filed this issue: Node should do the "obvious" right thing out of the box (i.e. if asked to run a file, it should run that file, either in cjs mode, or esm mode, automatically. Not first be configured to do the right thing through runtime flags, or config files, or config code).
Sure, but again: that's not doing the right thing, that's forcing people to use nonstandard extensions as workaround. Node should (and this is a long term "should", not "it must, now!") do the right thing when given a
No it isn't? That's what major versions are for. I have no expectation of this getting changed in the current version, but I do kind of expect this to be seriously considered for Node 17 or 18. The way Node works as of v16, especially as LTS, should stay exactly what it is now, let's definitely not change that. People rely on it to work a specific way. But 17 or 18 are fair game. |
Hum so you'd like to see this behavior only for the entry point 🤔 In this case, I have more questions, how would you define parse goal for dependency of such modules: // entry-point.js
import './submodule.js'; // is this ESM no matter its content?
import './submodule.cjs'; // is this CJS?
import 'some_package'; // can this be CJS? // other-entry-point.js
require('./entry-point.js'); // is this CJS? // another-entry-point.js
// no import, no export, no require, is this ESM or CJS?
console.log(this); // is this undefined?
var obj2 = { get x() { return 17; } };
obj2.x = 5; // should this throw?
Why would the "right" thing to remove dynamic imports for CJS? Are they causing an issue, or is this just for convenience to tell if a file is CJS or MJS?
Note that what is "obvious" to you may not be for everyone. While I understand loaders are not the long term solution you'd like, it can still be useful for you (or someone else) to build a POC. |
this is not a novel topic of discussion and I think the general sentiment is that this would be cool if someone builds something, but no one has built something because it's a very complex problem, and you have to make a lot of subjective calls which people disagree on. |
Good questions. In the absence of a package.json, keep it simple, and keep it naive:
would result in going "that's a normal import on line 1, stop checking and use ESM parsing" and then when it runs the file it tries to import submodule.cjs using ESM rules and it'll throw an error, and that's fine. Use a package.json if you need to mix modalities.
Yes, it is, because the first import mechanism encountered uses
This is Node's default mode. If that's still CJS by v17 or v18, then it's CJS, unless Node's finally ready to switch to ESM parsing by default, then it's ESM.
Who said anything about removing dynamic imports? If the file starts as a normal, modern, plain, JS file (that is it starts with a bunch of This feature request is entirely about what Node does when asked to run "plain files", in the absence of a This really is a matter of "if the entry file starts as a normal JS file with
Hence the quotes, I'm talking about the kind of obvious that exists when looking in from the outside: "I have some JS files, I've written them following current standards, running |
Thanks for the clarification, I think this makes a lot of sense; first we'd need to find or write an algorithm that can tell if a JS file is ESM or CJS, is this something you'd be interested in contributing? Related (vaguely): TC39 proposal
If you allow me to be pedantic, you can have external deps without a |
Something that goes in this direction is being done in #39175 |
Ah, yes that's true, good point.
Nice!
While I wouldn't mind giving it a shot, given the number of projects I'm already involved with any promise to try to get to that soon would pretty much be a lie, I'm basically booked up for projects for at least a year, filing issues and discussing whether there's merit to it is the most I can do for larger projects for the foreseeable future =( |
#39508 didn't get much traction, and there was at least two outstanding objections for not implementing this proposal. |
These arguments seem to apply on the idea that "this makes it easier for beginners". That's an argument I don't buy into. Changing the behaviour as suggested in this issue (making Node use ESM parsing mode if the entry file and only the entry file is ESM) makes Node better as a tool. Tools should have sensible default behaviours, and a way to override any and all of that behaviour by specifying an explicit configuration (either throught runtime flags, or a config file, which for Node is the So if we're talking about sensible default behaviour: if someone writes plain, modern JS, which because of the JS spec can only be ES module based code (because that's what TC39 decided is the only import mechanism that spec-compliant JS can use) then Node should be able to execute that file, because Node's job is to run JS, which means at the very least it should run spec-compliant JS, even if it can also run its own flavour of JS. And if it's asked to run legacy CJS code, it should run that without any complains too, because it's been doing that for about a decade now; that should keep working. And if someone writes code and specifies a the configuration file that Node looks at, then obviously now Node should do whatever it does based on what's in that configuration file. This is basic tool design: if there's a config, that config kicks in. And yes: that means that ESM code that works "on its own" suddenly won't run anymore when you use "What if your ESM code has imports that somewhere down the line switchi to CJS style requires?" "What if someone mixes ESM and CJS in their own (flat file) code?" "What if their ESM imports a Node API, which uses CJS?" |
CC: @nodejs/modules I think we should close this as a duplicate of many older issues and discussions unless something novel comes up. Per:
This is only a heuristic to help debugging, it isn't 100% reliable or performant nor is it expected to become so to my knowledge. |
The hope was that this explanation was the "novel" thing people keep mentioning. Every other issue seems to want way more than what Node, the command line util, should be doing, making this proposal far more reasonable in comparison. |
I don't see any new arguments here to be clear. I think something novel would need to explain how to avoid breaking things rather than stating that breaking things such as existing code without |
If anyone is interested in working on this problem, I can provide the guidance wrt the right contributions to cjs-module-lexer to enable this analysis to be fast. Feel free to DM me. I don't personally have the bandwidth to take such a thing on as much as I would like to. |
Nor do I, given both my day job and other FOSS projects I'm involved in, but closing it feels the wrong solution. Keeping it open with a "contributors welcome" label would seem a better approach here. |
To be clear, even if parsing is very fast I don't think we should land the currently proposed feature in this issue. See previous PR linked above and Issues/PRs before that. I will state in 2014 I was a proponent of dual parsing, but would rather we find an approach without grammar collisions and that is forwards compatible. |
I'd like to counter that you can do both: you have grammar collision right now, moving to a solution that works without solving that, and then moving to a solution that does solve that afterwards, is infinitely better than skipping the step in between and having people wait years more before node "just works" when asked to run a standards-compliant file. (After all, the more time passes, the more the TC39 syntax is the correct grammar that interpreters should always support, even if they also support some other non-standard grammar, and the more cjs becomes "that thing that we needed in the past, but why is it still the thing node insists on is the standard?") |
@Pomax the grammar collisions have increased over time, not decreased. |
@Pomax there are two standard TC39 syntaxes, not one, that can not be unambiguously distinguished by parsing, and at this point that will never change. |
@Pomax the ambiguity that is an issue is a file without any |
Other examples of subtle differences:
Some of these can be detected by parsing - and some cannot. And even for the ones that can be detected by parsing: What if a file contains:
Is this an invalid script or an invalid module? It's an early/parse error. But in which line? You can gloss over these by saying "every module has to have at least on static |
In these cases, I'd honestly strongly advise to go with "whatever lets If the entry file has The goal should not be to make Node do the right thing on every input and every edge case: all it needs to do is "just work" more often than it does right now, in cases where it's obvious which mode it should be running in. On that note, if there are no Node can be made to fail less. |
imo node.js definitely needs to succeed (syntactically) on all possible JS input. It's not just a problem with the momentary snapshot of a file - which may contain ESM syntax, or CJS syntax, or be ambiguous. It's also that when you start to make your file, before you add any The user's responsibility is to use the extension and package.json to tell node what parse goal is intended, and then node can always parse it. |
honestly i still don't understand why people are so adverse to using |
@Pomax I think a more useful line of discussion would be to work out the exact nature of the feature you want rather than trying to state that you want things to act in a way that obviously has contention and a lot of history. If the goal is only to discern file contents as a heuristic (not a standard supported discernment) https://github.com/tc39/proposal-modules-pragma for example covers the ability to statically declare that a source text should be in a specific mode of JS (both Script and Module are first class, neither deprecated) and avoids a grammar collision so that could be a more amicable route. However, this goal is odd since it starts to expand into other things, like should WASM do a similar heuristic, or TS if Node supports that as a first class module syntax for example. The parsing method is just too forwards compatibility problematic and doesn't serve to allow all Modules writing in the JS spec to run under this method. It also means that adding a package.json above the file would further alter how the file works. This is a lot to discover vs a set of small rules even if that means action must be taken by a developer to resolve the error. However, if the goal is just to run the entry point in a special manner as above it means that behavior alters based upon if the module is the entry point. With various runners like PM2 or even things like AWS lambda to some extent, there is a long history of the user application not being the entry point for the node process. A solution likely needs to mitigate this, and likely would have to apply to all files in some way, not just the entry point. |
@bmeck with eg runners, surely they still use spawn or fork, so any flag or top-level behaviour carries over fine to these scenarios? It's the sort of argument that can easily discourage further discussion unless clearly unpacked. |
@guybedford I know that at least FAAS are keeping hot instances by doing something during startup, so they are not using spawn/fork. I am fine having a discussion but not convinced that the idea of parsing simplifies things like I was at one point in time. I think parsing leaves a very precarious edge that we don't have a guarantee of being safe that I am trying to explain. I also think having unique behavior for entry points likewise has an edge. Wrappers around ESM do exist and even in some workflows
|
This was implemented in nodejs/ecmascript-modules#55. That PR forced the user to use a new flag, |
@GeoffreyBooth that sounds amazing, and it would be super useful if its docs specified which values it can take, rather than leaving users guessing (it currently says Still, the fact that it's there means that most of this issue's proposal might be entirely moot. However, I just gave this a try and as far as I can tell, this doesn't seem to do what what this issue is about: Using a
And if I have a dir with a package.json that indicates type module, and I try to run a CJS script that just contains
|
|
Cool: so where's the option too force it for files, too? Because that would solve this entire issue. And if there isn't one: can we add one to solve this issue? A runtime flag is 99.99% as good as "doing the right thing without a runtime flag". |
There is no option, by design. The only way a file's parse goal must ever be determined is by extension + package.json. |
But that's bad design for a commandline utility. The whole point of runtime flags is that people don't need entire configuration files to specify individual primitive (e.g. number or string) values. Forcing this though an external config only is cli design can be made so much better. (Sure, "where does it stop?" because everyone's afraid of the slippery slope, but in this case that's a fairly straight forward answer: it stops when people stop asking for individual runtime flags to mirror individual primitive values that are currently only available through a separate package.json file. Not allowing flags for complex values: perfectly sensible. Not allowing runtime flags for stringsand numbers that change how node runs: that's just ... strange) |
nodr isn’t solely a command line utility - it’s a programming platform. A textual file is already a primitive that encapsulates its parse goal by extension. That you can use package.json to subvert that doesn’t mean that all possible means of subversion should be permitted. |
Yes, but And no, a text file isn't a primitive, it's an external dependency in this case, with additional cognitive load, and a clunky user experience for things that could easily be runtime flags. Heck, to draw parallels to other programming tools: this is akin to passing |
The PR was never merged in. I’m just referencing it since it seems to do what you’re asking for. It would need to be updated to reflect changes to the ESM implementation since it was written, most prominently that a new flag would need to be chosen. Also presumably the version of Acorn in Node core is newer now, so the problem with |
With ESM gaining traction in and beyond node.js worlds, I am now also in the category of users who find both the available options to enable ESM less-than-ideal: Two questions:
|
It is unlikely that a change based upon syntax guessing will make it in; but breaking changes do happen for more minor things that are more forwards compatibility safe after the breakage, yes.
Node could always just ship a 2nd binary that swaps the default. A |
I don't understand why |
Yes, my thinking is also aligned that such validation can/should be done by the code analysis tool rather than the node.js runtime. In this situation, I'd rather have the runtime continue and fail in implicit ways (whatever may happen, happens), rather than guessing anything just to be able to produce a high-level speculative-at-best kind of error message.
If this ES feature "just works" with .js extension like many other ES features do, then there is no confusion and we don't have to explain anything. When it doesn't work OOTB, that kicks things into the situation where we are right now.
No, but you don't ask users to change file extensions to use another feature of same language, just because it conflicts with an existing feature; a command-line option (which we are missing here) and configuration option (which we have in package.json) comes to mind; sometimes accompanied by additional environment variable support. This approach of "for this particular feature only, change extension of your files" is a bad design and should not be promoted for anything, imo. |
@kasperk81 that's not really possible though, because the JS spec created two ambiguous parse goals. Anything that appears to "just work" will also silently fail for nonzero programs. JS kind of contains two languages now - ie, two parse goals. Modules and Scripts aren't the same and it makes perfect sense to me that they require different extensions (or explicit configuration). |
There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment. For more information on how the project manages feature requests, please consult the feature request management document. |
There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment. For more information on how the project manages feature requests, please consult the feature request management document. |
FWIW Node.js 21.1.0 ships a |
that's great to hear, thank you letting us know! |
Is your feature request related to a problem? Please describe.
Node does not require a
package.json
file to run, unless you're writing modernimport/export
code, in which case you suddenly need to define a package.json even if you have no intention of creating a project out of the code you just wrote. You can't even usenode --input-type=module
because it will --for no reason that makes sense for users-- complain that youCannot use import statement outside a module
, the thing we're literally saying that's what we're doing by using that flag.Describe the solution you'd like
Node should not need folks to tell it which parsing mode to use: it should scan the entry point it's being asked to run, and simply check whether or not the reserved JS keyword
import
is found in that file. If it is, done: run in ES module mode without needing folks to create a file that Node should not rely on to do its job.require(...)
during the run is now a perfectly normal "this function is not defined in this scope" error.import
during the run is now a perfectly normal "import is a reserved keyword" error.import(...)
during the run is now a perfectly normal "this function is not defined in this scope" error.And now Node does the correct thing, given perfectly normal code as run target. Will that very first step add to the startup time before code actually runs? Sure, but scanning for the
import
keyword takes on the order of nanoseconds, not milliseconds. We're not scanning the entire dependency tree to see if somewhere down the line we suddenly switch from import to require or vice versa: by default, without a package.json, Node doesn't need to mix and match: as default behaviour Node should run both legacy CJS and modern JS (either/or, not a mix, obviously) without runtime flags or needing files that have nothing to do with Node itself created.Describe alternatives you've considered
There are no alternatives: Node should not rely on package.json just to run normal modern code, it should do the right thing automatically, with runtime flags and package.json only for folks who need it to do something else (and it should probably never need package.json, that file is so that NPM can do proper package management. Node should never need to consult it just to run normal, valid, modern or legacy, JS)
The text was updated successfully, but these errors were encountered: