ESM Loader Hooks (20.8) should support a more testable, less stateful design #50042

ggoodman · 2023-10-04T19:12:31Z

What is the problem this feature will solve?

The current ESM Loader Hooks are defined as modules having the following named exports: initialize(), resolve() and load(). The initialize() function is passed data specified when calling register(). The intent is that the other two hooks can thus also leverage this data in extensible, future-proof ways.

The problem that this introduces is that the ESM Load Hook module is now a stateful singleton because it encourages designs wherein initialization state is shared at the module level. This means that any tests looking to exercise the loader module needs to deal with ESM cache invalidation. It also makes it slightly less ergonomic (and type-safe!) to set and access any needed shared state.

What is the feature you are proposing to solve the problem?

I propose a slight change (or addition) to the proposed design. The change I'd like to propose is that the initialize() function should be able to produce the object implementing the other two components of the interface: resolve() and load().

In this way:

Tests looking to test the loader's implementation can easily produce multiple instances of the loader for testing different inputs and scenarios in isolation.
State can be passed in a simple, immutable and type-safe manner between the initialize() call and the resolve() and load() hooks.
A Loader could then be implemented as a class that is instantiated and returned by the initialize() function. The class can hold shared state in a private field or instance properties. Alternatively, the initialize() closure could also be used to hold this state in a way that the two other hooks can easily access.

export async function initialize(data) {
  return new MyLoader(data);
}

class MyLoader {
  #port;
  
  constructor(data) {
    this.#port = data.port;
  }
  
  resolve(specifier, context, nextResolve) {
    this.#port.postMessage(['resolve', specifier, context]);
    // No-op, defer to built-in
    return nextResolve(specifier, context);
  }
  
  load(url, context, nextLoad) {
    this.#port.postMessage(['load', url, context]);
    // No-op, defer to built-in
    return nextLoad(url, context);
  } 
}

What alternatives have you considered?

No response

The text was updated successfully, but these errors were encountered:

This commit allows the `initialize()` hook to optionally return an object having the `resolve()` and `load()` hooks as properties. This allows state passed into `initialize()` to be shared with the `resolve()` and `load()` hooks either via closure or class instance. In addition to developer ergonomics, supporting this model will make it easier to write tests against a loader module. The existing design forces state to be shared at the module level which puts the burden of invalidating the ESM module cache on anyone hoping to write isolated tests against a loader module. Fixes: nodejs#50042

GeoffreyBooth · 2023-10-04T19:55:23Z

@nodejs/loaders

This commit allows the `initialize()` hook to optionally return an object having the `resolve()` and `load()` hooks as properties. This allows state passed into `initialize()` to be shared with the `resolve()` and `load()` hooks either via closure or class instance. In addition to developer ergonomics, supporting this model will make it easier to write tests against a loader module. The existing design forces state to be shared at the module level which puts the burden of invalidating the ESM module cache on anyone hoping to write isolated tests against a loader module. Fixes: nodejs#50042

ggoodman · 2023-10-04T20:22:52Z

BTW, I threw #50044 together to explore what a back-compatible approach would look like for this.

This commit allows the `initialize()` hook to optionally return an object having the `resolve()` and `load()` hooks as properties. This allows state passed into `initialize()` to be shared with the `resolve()` and `load()` hooks either via closure or class instance. In addition to developer ergonomics, supporting this model will make it easier to write tests against a loader module. The existing design forces state to be shared at the module level which puts the burden of invalidating the ESM module cache on anyone hoping to write isolated tests against a loader module. Fixes: nodejs#50042

ggoodman · 2023-10-19T20:24:58Z

@GeoffreyBooth, here are some higher-level observations about the current loader API:

The load hook doesn't have access to a hooked resolve function. This means a load hook can't, for example, attempt to resolve a tsconfig.json or other config file using the loader's own machinery. It must use a parallel approach.
The resolve hook doesn't have access to a hooked load function. Similar to the above, the resolve hook might want to load the content of a config file that would influence module resolution. Using the loader's own machinery would be advantageous over having a parallel construction.
The initialize, load and resolve hooks are top-level functions whose only option for extensibility is through new arguments. By extensibility, I mean unforeseen use-cases or data that needs to be passed into the hook. This is not strictly bad but feels like a missed opportunity.
The use of a module instance whose exports form the loader interface means:
a. Sharing state between the initialize and resolve / load hooks must rely on module-level state that is hard to evict. This makes testing and de-registering loaders unnecessarily complicated.
b. It is difficult to cleanly add type-safety to a loader since the interface is the module. We don't have facilities for asserting that an entire modules's exports adhere to a desired interface.
Subjectively, the shortCircuit vs nextResolve / nextLoad interaction is very confusing and clunky. Why do we need shortCircuit?

I've started speccing out what I think an alternative design might look like and as I'm attempting to port the CoffeScript loader from the docs over, I'm seeing some of the motivation for the nextResolve and nextLoad functions. I have run out of time and brain-power trying to make that work today. Hope to pick it up tomorrow.

GeoffreyBooth · 2023-10-20T01:27:44Z

@ggoodman I’m warming to the idea of initialize returning something whose resolve and load properties define those hooks. That can be an elegant way to provide a loader “interface” as you’re describing, and if we still permit the top-level exports then it’s not breaking. This API has been around for a few years now and we’ve gotten some strong pushback the previous times we’ve changed it, so I think at this point all changes should be additive unless there’s a strong motivation otherwise. “We could’ve thought of something better” isn’t quite strong enough in my opinion. But we can support both styles.

As for load having access to resolve and vice versa, they do have access in the sense that either hook’s body can call import() which would kick off a new module resolution and loading pipeline. They can’t access the resolve portion or the load portion of that pipeline individually, but I’m not sure how it would make sense to do so; how would that work with chained hooks?

And for your #5, chaining is the reason we need those things. We want the ability to register multiple sets of hooks, like for example just as people today often do --require tsconfig-paths/register --require ts-node/register we want to support the equivalent via --import/register().

isaacs · 2023-10-20T03:40:02Z

@ggoodman

a. Sharing state between the initialize and resolve / load hooks must rely on module-level state that is hard to evict. This makes testing and de-registering loaders unnecessarily complicated.
b. It is difficult to cleanly add type-safety to a loader since the interface is the module. We don't have facilities for asserting that an entire modules's exports adhere to a desired interface.

You might find node-tap's loader interfaces informative here.

I've been putting these hook methods in a hooks.mts file (so that I can export one set with initialize/resolve/load, and another with globalPreload/resolve/load), and testing with tap's t.mockImport(). This makes it pretty much the same difficulty to test as it would be to test a regular function, since each invocation has a new unique context and even its own mocked dependencies.

The exported types in @types/node cover the loader hooks interface pretty comprehensively, and if you have type checking on in your tests, I'm not sure how much more type-tested it could really be. Maybe in a perfect world, node would have a way to verify the types at run time, but that's not how ts works. 🤷‍♂️

giltayar · 2023-10-20T06:15:21Z

If it was just to help with testing, I would not have been for this change, as there are other ways to deal with tests here.

But!

We still have an open issue regarding communication with hook (e.g. an api that changes thing in the hook), when there are multiple worker threads.

If we call initialize for each worker thread, and each worker thread has their own port and resolve and initialize, then this is a problem solved as each worker thread will have its own port, and the hook will have their own state per worker thread.

So not am I only not for this idea, I think it's a must have to solve an open issue in loaders.

ggoodman · 2023-10-20T12:58:36Z

and testing with tap's t.mockImport(). This makes it pretty much the same difficulty to test as it would be to test a regular function

@isaacs That sounds quite awesome TBH because it's much higher fidelity than the sorts of tests I'm proposing. At the same time, it relies on using a smaller subset of test frameworks that support this sort of isolation. I don't think Node's own test runner even has this sort of advanced functionality yet.

ggoodman · 2023-10-20T13:01:28Z

We still have an open issue regarding communication with hook (e.g. an api that changes thing in the hook), when there are multiple worker threads.

If we call initialize for each worker thread, and each worker thread has their own port and resolve and initialize, then this is a problem solved as each worker thread will have its own port, and the hook will have their own state per worker thread.

Agreed @giltayar, I don't think the current design has a good story for instantiation. In other words, the forced reliance on module-level state means that the same loader can't be instantiated multiple times with different configs.

Of course, there are probably some tricky ways to get around this but the current API seems unnecessarily difficult for such use-cases.

ggoodman · 2023-10-20T14:17:58Z

Happy to rebase #50044 and spike out the docs changes if there seems to be momentum in favour of the change.

GeoffreyBooth · 2023-10-20T21:05:35Z

If we call initialize for each worker thread, and each worker thread has their own port and resolve and initialize, then this is a problem solved as each worker thread will have its own port, and the hook will have their own state per worker thread.

Can the PR add some tests for this? This would be good to have coverage around.

We might want to add an example in the docs for worker threads if supporting them is something that hook authors will need to explicitly enable.

As for the initialize return value, I would just like to be able to continue adding to it in the future. So for now maybe it supports only { resolve: Function; load: Function } but if we discover new needs for it in the future, we can add more properties.

Happy to rebase #50044 and spike out the docs changes if there seems to be momentum in favour of the change.

Yes please!

This commit allows the `initialize()` hook to optionally return an object having the `resolve()` and `load()` hooks as properties. This allows state passed into `initialize()` to be shared with the `resolve()` and `load()` hooks either via closure or class instance. In addition to developer ergonomics, supporting this model will make it easier to write tests against a loader module. The existing design forces state to be shared at the module level which puts the burden of invalidating the ESM module cache on anyone hoping to write isolated tests against a loader module. Fixes: nodejs#50042

github-actions · 2024-04-18T01:22:50Z

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the never-stale Mark issue so that it is never considered stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment.
For more information on how the project manages feature requests, please consult the feature request management document.

github-actions · 2024-05-18T01:25:09Z

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

ggoodman added the feature request Issues that request new features to be added to Node.js. label Oct 4, 2023

github-project-automation bot added this to Node.js feature requests Oct 4, 2023

github-project-automation bot moved this to Pending Triage in Node.js feature requests Oct 4, 2023

ggoodman mentioned this issue Oct 4, 2023

module: initialize hook returns load, resolve #50044

Open

2 tasks

GeoffreyBooth mentioned this issue Oct 21, 2023

Node.js Loaders Team Meeting 2023-10-24 nodejs/loaders#167

Closed

github-actions bot added the stale label Apr 18, 2024

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 18, 2024

avivkeller removed this from Node.js feature requests Jun 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ESM Loader Hooks (20.8) should support a more testable, less stateful design #50042

ESM Loader Hooks (20.8) should support a more testable, less stateful design #50042

ggoodman commented Oct 4, 2023

GeoffreyBooth commented Oct 4, 2023

ggoodman commented Oct 4, 2023

ggoodman commented Oct 19, 2023 •

edited

Loading

GeoffreyBooth commented Oct 20, 2023

isaacs commented Oct 20, 2023

giltayar commented Oct 20, 2023

ggoodman commented Oct 20, 2023

ggoodman commented Oct 20, 2023

ggoodman commented Oct 20, 2023

GeoffreyBooth commented Oct 20, 2023

github-actions bot commented Apr 18, 2024

github-actions bot commented May 18, 2024

ESM Loader Hooks (20.8) should support a more testable, less stateful design #50042

ESM Loader Hooks (20.8) should support a more testable, less stateful design #50042

Comments

ggoodman commented Oct 4, 2023

What is the problem this feature will solve?

What is the feature you are proposing to solve the problem?

What alternatives have you considered?

GeoffreyBooth commented Oct 4, 2023

ggoodman commented Oct 4, 2023

ggoodman commented Oct 19, 2023 • edited Loading

GeoffreyBooth commented Oct 20, 2023

isaacs commented Oct 20, 2023

giltayar commented Oct 20, 2023

ggoodman commented Oct 20, 2023

ggoodman commented Oct 20, 2023

ggoodman commented Oct 20, 2023

GeoffreyBooth commented Oct 20, 2023

github-actions bot commented Apr 18, 2024

github-actions bot commented May 18, 2024

ggoodman commented Oct 19, 2023 •

edited

Loading