-
Notifications
You must be signed in to change notification settings - Fork 30.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
module: add releaseLoadedModule
#50618
Conversation
This is a new utility as discussed in nodejs#49442 which provides the ability to release a loaded module. Releasing a module means that it can be garbage collected (if it is abandoned) and reloaded (by importing it again with the same specifier).
Review requested:
|
@GeoffreyBooth as discussed. This also clears the resolve cache, and therefore the parameters must be (specifier, parent) instead of a resolved URL. The name |
Or if you call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also needs documentation.
lib/internal/modules/esm/loader.js
Outdated
/** | ||
* Evict saved result of `resolve` and `load` for the given import parameters. | ||
*/ | ||
evict(specifier, parentURL, importAttributes = {}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
evict(specifier, parentURL, importAttributes = {}) { | |
evict(specifier, parentURL, importAttributes = kEmptyObject) { |
lib/internal/modules/esm/loader.js
Outdated
const requestKey = this.#resolveCache.serializeKey(specifier, importAttributes); | ||
let didEvict = this.#resolveCache.delete(requestKey, parentURL); | ||
if (this.loadCache.delete(resolved.url, importAttributes.type)) { | ||
// nb: Careful with short-circuits here, we want to run both deletes unconditionally. | ||
didEvict = true; | ||
} | ||
return didEvict; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we perhaps evict from the load cache first? I know these calls are sync, but the load hook is not; if evict
is called while a module is being loaded (so after it’s resolved but before load
has finished and it’s been added to the load cache) then a module could end up deleted from the resolve cache but not from the load cache. Whereas if you delete from the load cache first, and only on success of that do you continue on to delete from the resolve cache, we avoid leaving the caches in a “broken” state where the resolution stays cached by the load got uncached.
This raises a broader question: do we need to delete from the resolve cache? What are the use cases for this API, and could they be fulfilled by deleting only from the load cache? As in, when would we need to redo the resolution? Perhaps after new module customization hooks were registered, that would affect resolution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleting the resolution is nice because otherwise updates to package.json cannot be hot loaded and would force an app restart. But deleting the resolve cache in the manner in my diff is definitely a half measure. I think a more reliable feature would need to record all the resolutions to a given loaded module, so that when the loaded module is evicted all resolutions to it are deleted as well. The abstract specification method HostLoadImportedModule
is decomposed into our resolve
and load
stages-- each with their own cache. To actually clear that abstract method requires more machinery which I'm not sure is worth it.
lib/module.js
Outdated
const { SourceMap } = require('internal/source_map/source_map'); | ||
|
||
Module.findSourceMap = findSourceMap; | ||
Module.register = register; | ||
Module.releaseLoadedModule = releaseLoadedModule; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cannot be available to the main thread as its effects are forbidden by ECMAScript spec. The best we can do is to have it available only in the loader thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain? Is this just a way of saying that hooks need to be registered before users are allowed to use this? Why?
How would this work from the other thread since the caches are on the main thread? Cross thread communication?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reference, the relevant specification is https://tc39.es/ecma262/#sec-HostLoadImportedModule:
- If this operation is called multiple times with the same (referrer, specifier) pair and it performs FinishLoadingImportedModule(referrer, specifier, payload, result) where result is a normal completion, then it must perform FinishLoadingImportedModule(referrer, specifier, payload, result) with the same result each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we're aware that this API explicitly violates spec if used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means that the spec gives some guarantees that await import("a")
will always return the same value from the same module (as long as the module resolution has succeeded at least once)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For instance, as an alternative, we could choose an approach where an explicitly opt-in command line flag is required to enable this kind of non-standard ESM behaviors.
node --esm-module-ext
This makes it absolutely clear that the individual running the node.js process is deciding on the non-standard behavior as opposed to some arbitrary library they are depending on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes it absolutely clear that the individual running the node.js process is deciding on the non-standard behavior as opposed to some arbitrary library they are depending on.
I don’t see the need for this. You might as well call the flag --allow-me-to-violate-spec
. Why should a user need to opt into this via flag when they’re already opting into it by using the API, either explicitly or via a dependency that they knowingly added? If I were the author of Vite, I would be annoyed; I would need to choose between the following:
- Leak memory.
- Tell users to always run Vite with
--allow-me-to-violate-spec
with an explanation that doing so prevents a memory leak. (Sounds like a bad spec, doesn’t it?) - Run Node in a child process with this flag set, so that it’s hidden from the user.
The third option seems like the obvious choice, and now we’re back to the user not having explicitly opted into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps that decision should be revisited, then - otherwise an onslaught of security vulnerabilities is likely to be forthcoming where random packages use the loader api, and this one, to do literally anything they want to an application.
The built-in node:*
module namespace is chocked full of security vectors and npm is already a huge hazard. Tinkering with module resolution is hardly a major concern given that node:fs
and node:child_process
exist. The real risk is ecosystem fragmentation, like what we have with CommonJS and ESM. This is comparable to delete require.cache[key]
under CommonJS which wasn't even widely used in that ecosystem.
HMR is already possible; that’s what Vite is doing
Vite's HMR implementation is a client (browser)-side feature. For SSR you have to use await vite.ssrLoadModule('/main.js')
which operates outside of nodejs's module graph as I understand it. I might be wrong here because I personally don't use Vite.
dynohot implements import.meta.hot
on the server-side as a nodejs loader. To my knowledge it's the only implementation that truly operates under nodejs's module graph. I implemented the es262 linking algorithm to specification and use some code transformations to tie it all together. I'd love to see such a feature in nodejs but without support from v8 you need a code transformation, otherwise there's no way to update the linking of an already resolved and imported name. It seems like code transpilation is something that is outside the scope of node's runtime (and thus we have the loader API).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should a user need to opt into this via flag when they’re already opting into it by using the API, either explicitly or via a dependency that they knowingly added?
Based on experience, I would wager very few users know what most of their dependencies are doing.
More importantly, I'm not saying no to this, I'm saying I'd like us to make sure we're we're being careful to not introduce new headaches we might regret later. I guess what I'm missing is a higher level description of the use cases, how these various bits fit in with that, and a example that ties it all together. Has such a thing been written up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think on it, I'm feeling that it's tough to justify the feature.
Anyway I pushed a change for consideration. By walking the resolution graph we can ensure that all resolved references to the file are evicted, though it may be more expensive on larger applications. It's probably not that bad though since it shouldn't be a common case. There may also still be a race condition after a module is resolved but before it's been loaded. There is also another inconsistency since we can't clear the |
I'd like to use this to test a library that branches at load-time based on A CLI flag (e.g. Edit: This could be further limited by putting the API on an object that can only be instantiated once, ensuring that deps can't use it even if the flag is set, provided the user instantiates the unloader before loading the deps beside its control. |
Idea: what if this api is available to loaders only, and alao the loaders guarantee that no one will ever load this url again ( the url post resolve). In this case, we're still somewhat following the spec. We can even have nodejs check this by adding a tombstone to the released module in the cache. I would hazard a guess that mocking loaders and hmr loaders need to anyway change the module url (by adding some query parameter) to enable the functionality, so it would work in most cases. |
If it’s available to hooks (on the hooks thread) then it’s available everywhere, because the That said, I’ve never been concerned about users being able to violate spec if they choose to do so. Hooks can already enable spec violations, and native add-ons certainly can. As long as Node never violates spec by default, I think it’s fine to provide APIs for users to do so if they wish. I don’t know what “adding a tombstone” means, but if there’s some way to make the API less likely to enable spec violations, sure, we might as well do so. |
Creating a tombstone doesn't really fix anything because you are still leaking memory, though much less. In an HMR case you'll be reloading modules over and over again and the list of tombstones will grow forever. @pygy can you better explain what you're trying to do? The only use-case for this feature I can think of is for hot module reloading. It sounds like your use case is better served with a query parameter but I can't say for sure without knowing more. |
@laverdet I want to evict a file and its dependencies. AFAIK using query params only reload the target file, not its deps. |
This PR also only evicts a single module. Unless you want to parse the file to acquire all its imports and evict those too. You're only increasing the surface area of spooky double-imports that way. OR you could write a 4 line resolve loader that appends |
I'm currently evicting all dependencies manually (using the POC you posted in the sibling issue). It is indeed brittle, but the module is small enough for this to be manageable. I'm not familiar with the loader API... What you suggest sounds appealing, I'll look into it, thanks. |
I think what we're seeing in #49442 is an XY Problem. The question users are asking is "how do I delete module cache entries" but the question that they should be asking is "how do I hot reload" or "how do I write unit tests" or "how do I retry a failed module". Actually I think this PR will end up harming the ecosystem because people don't understand the utility and will inevitably misuse it. The only valid use of this utility is to relieve a memory issue in hot reloaders. Furthermore I think the memory issue can be resolved within dynohot with nodejs/loaders#116 |
Sweet, thanks! With this tweaks it also supports let id = 0;
export async function resolve(specifier, context, nextResolve) {
const result = await nextResolve(specifier, context);
if (context.parentURL) {
const url = new URL(result.url);
const parentUrl = new URL(context.parentURL);
const reload = url.searchParams.get("reload") === ""
? String(id++)
: parentUrl.searchParams.get("reload");
if (reload !== null) {
url.searchParams.set("reload", reload);
return {
...result,
url: `${url}`,
};
}
}
return result;
} // index.js
const y = await import("./foo.js?reload")
const z = await import("./foo.js?reload")
const w = await import("./foo.js?reload=34") // foo.js
export * from './dep' // dep.js
console.log(import.meta.url) Output:
This looks sufficiently useful to exist as an npm module, if you don't mind I'll publish it tomorrow (tagging you as the author). |
This is a new utility as discussed in #49442 which provides the ability to release a loaded module. Releasing a module means that it can be garbage collected (if it is abandoned) and reloaded (by importing it again with the same specifier).