Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESM module reloading and module graph #51

Open
boneskull opened this issue Dec 6, 2019 · 23 comments
Open

ESM module reloading and module graph #51

boneskull opened this issue Dec 6, 2019 · 23 comments

Comments

@boneskull
Copy link
Collaborator

boneskull commented Dec 6, 2019

In a nutshell, we need to be able to "unload" an ESM module, which is not supported. Tooling needs this for reloading files (e.g. in a tool's "watch" mode) or mocking modules (using proxyquire, rewiremock, etc.)

Ref: nodejs/node#49442

@coreyfarrell
Copy link
Member

To refine what I was suggesting in the meeting maybe we could create a standard but flagged way to do this (TC39 proposal?). For example import.unload('specifier') would be unavailable by default but running node.js or the browser with a flag could expose the unload API. The assumption being that webdriver and/or puppeteer could use the flag when initializing the browser for test purposes.

@theKashey
Copy link

I should comment here months ago, but it's never too late. HMR is supported by all major bundlers, a bit defferently in every case, so there is no real "standard".

So which parts are needed for HotModuleReplacements:

  • a module got updated (not a nodejs responsibility)
  • all module parents shall be "notified" about it
    • there is no "parents", only "parent" (💡 webpack provides "parents", but not "parent")
    • the only way to find all your parents is to traverse tree from top to bottom, constructing child->parent relationship from parent->child ones. That's is O(n) operation, but a whole tree has to be analysed to find all parents for a single module (example).
  • some parent should eventually "accept" (aka module.accept) the update, and stop update propagation. This moment is a cornerstone in HMR - who could accept the update, and who cannot(or shall not).
  • once the update is accepted there are two possibilities:
    • "burn" all modules in between of updated module and the point of acceptance. In other words - burn from the bottom, till some point
  • "burn" all modules below the point of acceptance. In other words - burn down from some point
  • module cache is cleared and ready to import new modules.
  • open question on how "the point of acceptance" might retrofit update
    • webpack (<4, or commonjs) requires manually requery dependency
let someModule = require('./module');
module.hot.accept('....', () =>  {
  // cache has been cleared, just re-require module
  someModule = require('./module');
});
  • webpack (4+, ESM) wraps every import with getter, so it could update reference to a new module exports behind the scenes. No sure node could afford this pattern.

As a result, the main issue to resolve is actually how to "accept" and "reconcile" the update.

  • explicit require is not possible with ESM
  • magic getters could not be established for every import, just in case they might be "hot replaced".

I am thinking - it is possible to have an explicit hot import - import stuff from 'hot:./module'

However, this is not letting to implement some custom update logic, which might be needed, and supported by bundlers... and which I never seen used.

@boneskull
Copy link
Collaborator Author

to be clear, I think what we're talking about is not the browser, or how a bundler must work around the limitations, but rather what Node.js can add (beyond the ES spec) to support what tools running in Node.js need and expect from require. "what we need" is what I'm trying to hammer down.

if Node.js declines to support these use cases--I don't think it will, but if it did--it would place a significant barrier to using non-transpiled ESM in Node.js code. unfortunately, it seems like it's not a barrier that devs see until they've already started driving forward using native ESM in Node apps

@boneskull boneskull changed the title tracking "hot module reloading" for ESM in core ESM module reloading and module graph May 15, 2020
@boneskull
Copy link
Collaborator Author

Per our discussion, this is what's needed:

  1. A module needs to know which module imported it.
    • If module a imports module b, we should be able to query an API with module b and see that module a is its "parent".
    • Module b's "parent" should not be "the module that imported module b first", which is essentially how Node.js' CJS system works.
  2. We need to be able to prune (unload, remove from cache) an entire subtree of modules, with an optional target (a module "in the direction" of another module).
    • For example, module a imports module b, which imports module c and d. Module c also imports module d. We should be able, starting with c, to unload d (and its children), but not unload the d that was required by module b.
    • @theKashey I'm not sure I explained this right.

@ljharb
Copy link
Member

ljharb commented May 15, 2020

Modules (CJS or ESM) don't have parents in that way, because they're evaluated only once. In other words, the entire concept of "parent" is wrong and flawed, CJS never should have had it, and ESM shouldn't get it. What are the use cases that think they need it?

@boneskull
Copy link
Collaborator Author

I'll let @theKashey explain.

@boneskull
Copy link
Collaborator Author

(if he doesn't want to: in a nutshell, I think it has to do with being able to mock one "use" of a module, but not another)

@theKashey
Copy link

Frankly speaking, parent is needed only for speed optimization - as a parent is aware of it's children - children should know all parents.

The child->parent information could be build from the existing parent->child one, but it takes time, and needed for almost every graph operation.
Like if you want to evict a module - you should delete a reference to from the module it uses, as long as it would or lead to 1) memory leak, or to 2) two different module graphs coexisting in one time.

So parent information is needed to maintain module graph integrity in a constant time, right now it requires a full traverse of existing graph to reconstruct child->parent connection, and still let's you create an isolated "areas", where a child has a reference to a deleted parent, or a parent has a reference to a deleted child.

@theKashey
Copy link

I think it has to do with being able to mock one "use" of a module, but not another

This is also a good case. The golden rule of "mock only what you own", or direct dependencies. However, it does not require access to the module cache, as it should occur on, let call it, "import time". In cjs terms - after require, but before Module._load; and the result should not be stored in the module cache at all.

@ljharb
Copy link
Member

ljharb commented May 15, 2020

Right, but you'd do that with import maps or something like it - not at runtime in the module that's being imported.

I certainly see a use case for listing all parents of a module - ie, all importers - but that information would need to be statically determinable prior to the module's evaluation, and would thus not include dynamic importers, only static ones. I'm still unclear about the usefulness of that at runtime (as opposed to the imo more typical approach that would generate a dependency graph in advance, and then use it at runtime).

@theKashey
Copy link

In "runtime" it this information might be not so useful, as long as HMR is not expected to be an "often" event, so it's ok to use any other "slow" path.
However, in tests it's better to be faster.

that information would need to be statically determinable

But not always.

const module = import(`./dynamic-module-${letter}`)

From my point of views importers might be statically determined on a build time, but not all those "detected" importers would actually "import" the module at some point of time. So their list is non-static by design. (and thus is it possible to skip other "static" requirements?)

@ljharb
Copy link
Member

ljharb commented May 16, 2020

@theKashey and if that module was previously loaded, it would never be evaluated again, so any "parent" it was able to see would have been the first importer, not the import() callsite you're interested in.

It simply makes no sense for a module to evaluate with any knowledge of its own consumers.

@theKashey
Copy link

Firstly we are not talking about evaluation, but something after it - already established module graph. 👉Module reloading can happen only after the loading.
Secondary in a few rare cases (it's more about old code patterns) you still need to know who is your parent - for example proxyquire uses it to resolve module paths relatively to the parent - aka Module._resolveFilename(file, ->parent<-).
This is also a reason why proxyquire uses a self-eviction - it should work a bit different for the every invocation.

@ljharb
Copy link
Member

ljharb commented May 16, 2020

If it's from an already established module graph, then it makes total sense to expose the graph in some way! However, I don't think it should be called a "parent" since module graphs are not parent-child relationships.

@theKashey
Copy link

theKashey commented May 16, 2020

We probably just went to deep and talking about details we don't own, and so don't need. There is no need to give us an access to some standard module system - it at least require throughful development of that structure, and will concrete some realisation.
In normal development we have a clear separation between public API and internal realisation, and actually the absence of public API for module graph - is the real problem.

All we need is:

  • unload a module - not delete require.cache)
  • unload a module and garbage collect all orphaned modules (wipe modules "behind it") - not traverse module.children
  • unload a module and do something with it's consumers (parents), including removing those consumers as well (propagate "update event till someone accepts the change - not traverse module.parent(s)

A week ago I've found that the majority of cache-delete operations(99.9%) actually leading to the memory leak. Ie deleting a module leaves reference to it in a parent, or a children; keeping references to children in the module as well. And the root cause - actually undocumented, and not expected to be "hacked", module graph structure.

@Ayfri
Copy link

Ayfri commented Jun 6, 2020

Nothing has change with Node.js 14 ?

@bcoe
Copy link

bcoe commented Jan 22, 2021

@searls 👋 we've had this issue open for quite some time, that we'd like to make sure that it's possible to reload ESM modules in Node (to provide features like proxying.)

I'd noticed that quibble supports ESM modules. Are you finding that the platform already provides all the functionality that you need?

@searls
Copy link

searls commented Jan 23, 2021

@searls 👋 we've had this issue open for quite some time, that we'd like to make sure that it's possible to reload ESM modules in Node (to provide features like proxying.)

I'd noticed that quibble supports ESM modules. Are you finding that the platform already provides all the functionality that you need?

Thanks for reaching out! @giltayar is our resident ESM expert—what say you?

@giltayar
Copy link

You can find all the sordid technical details here: https://dev.to/giltayar/mock-all-you-want-supporting-es-modules-in-the-testdouble-js-mocking-library-3gh1

The TL;DR is this: while you cannot unload a module, you can use loaders to ensure that each time you want, a new version of the module will be loaded on import. For all practical purposes, this is the same as deleting the module from the cache, except for the fact that memory for the old version is not removed. Given that this is for development purposes, I don't believe this to be a major concern.

Note that using a loader means adding --loader <loader> to the command line. Which is fine, but if you want two loaders (e.g. testdouble and mocha-watch together), then you're out of luck, as multiple loaders has not yet been implemented, let alone defined. So mocking and watching together won't work.

@Ayfri
Copy link

Ayfri commented Jan 24, 2021

Also you can use the same method as Deno, adding a #number that indicates that that is a new iteration of the module ?

@giltayar
Copy link

@Ayfri I use a query parameter, but it's the same method.

Where does demo use this?

@Ayfri
Copy link

Ayfri commented Jan 24, 2021

denoland/deno#6946

@hpx7
Copy link

hpx7 commented Jan 29, 2021

I'm still trying to figure out how to reload a module anytime it or any of its dependencies are modified.

The closest I've been able to get is this:

let mod = await import("./mod.js");
fs.watch(require.resolve("./mod.js"), async () => {
  mod = await import(`./mod.js#${Math.random()}`);
  console.log("reloaded", mod);
});

While that correctly reloads the module when it is directly modified, it doesn't reload when a dependency of the module is modified. For example, if mod.js imports from ./foo.js, I want to reload mod.js if foo.js is modified. Is there some way to accomplish this with the current tooling?

EDIT: I'm already using the ts-node/esm loader so I can't use another one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants