Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esm: implement the getFileSystem hook #41076

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

arcanis
Copy link
Contributor

@arcanis arcanis commented Dec 3, 2021

This is the implementation of nodejs/loaders#44, as discussed at the last Loaders meeting (cc @nodejs/loaders). I leave it as draft because I expect to get feedback that could change the design, but it's already in a working state and I'd appreciate reviews. This PR includes:

  • A new getFileSystem hook lets user interact with the Node resolution by letting them override the code I/O dependencies of the resolution algorithms. Four functions have been ported, in both async and sync versions (look below for a more detailed sync/async discussion).

    • Unlike my initial plans for a readFile hook, which would have implied to eventually create other hooks for statFile etc, this implementation takes a different approach by defining a single hook which must return an interface of utilities. This approach works better in my opinion:

      • It's more scalable; more filesystem utilities can be added without having to make them full-blown hooks, which would imo needlessly complexify the design (if only in terms of documentation; with getFileSystem, the filesystem utilities can be documented in a different place than the proper hooks).

      • Filesystem utilities are very connected to each other. As a result, it stands to reason some hook author would want to return one set of utilities or another based on various conditions (for example we could imagine a loader that would allow disabling its features by returning the default filesystem utilities if a certain environment variable is set). If each utility was its own hook, this would be convoluted to achieve.

      • It also addresses the concerns @GeoffreyBooth had by avoiding to entangle the hook calls: the getFileSystem hook is only called when Node boots, not from within other hooks (the filesystem utilities themselves will be called during the hooks execution, of course, but it seems to be this shouldn't be a problem as those functions should be assumed to have no visible side effect).

    • The hook already implements the four utilities currently required for the Node resolution. I remember we discussed doing only readFile as a demo before adding the other ones, but I realized when implementing it that some things would appear over-engineered if I didn't show an example that used multiple utilities (for example, having getFileSystem that returns an object would make no sense if we only needed readFile - but we don't, which makes a design like getFileSystem more valuable).

    • I made the utilities have the exact same return values as the Node core functions they replaced (in particular for internalModuleStat and InternalModuleReadJSON, which have unusual return types). I'm open to ideas as to how the utility signatures should be for the final version.

    • I however normalised all utilities' inputs so that they all take a URL instance as parameter. In many case it just meant reusing url instances, and only in a few places I needed to instantiate new ones.

  • The demo/ folder is of course temporary and just meant to discuss how the hook looks like in practice. For this example, I implemented a loader that, given a file foo.mjs (or any other name), simulates the existence of a foo-sha512.mjs file that exports the hash of the original file). I picked this example because it's easy to conceptualise, and cannot be achieved by the other hooks.

    • I considered writing a "load modules from http" loader, but it's a little more difficult due to the Node algorithms currently relying a lot on synchronous operations, even in the esm pipeline. It would still have been possible with workers and atomics, but I felt like this would be better avoided for a simple demo use case.

    • I didn't write tests or documentation yet; those will be written once I un-draft the PR. For now, the demo/ folder can be seen as the reference on how to use and experiment with the PR.

Async vs sync

I initially implemented all the filesystem utilities as async functions, but quickly noticed that the Node resolution (even in the esm pipeline) was relying on them being sync. After some thoughts, I decided to implement both versions, even if some of them aren't in use right now. There are a few reasons for that:

  • I didn't want to rework the entirety of the Node resolution pipeline. Changing sync functions to become async was a no-goal for me, especially considering the other points that follow.

  • The loader implementation cost of having to write two different versions of a same hook can be offset by Node automatically reusing the sync version of an utility if the same loader declares it (check my demo loader for an example).

  • There's a decent case that the cjs resolution pipeline could benefit from the getFileSystem hooks as well, which would allow us to close internalModuleStat & internalModuleReadJSON prevent resolution override #33423. This is something I'd like to tackle once the current task is done, so I don't think it'd hurt to document both sync and async utilities and suggest implementers implement both.

@nodejs-github-bot nodejs-github-bot added lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Dec 3, 2021
Comment on lines +46 to +47
exports.getFileSystem = () =>
esmLoader.fileSystem;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was required because there seems to be some dependency cycle between esm_loader and the modules used during the resolution, causing destructuring to fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that might be because there are 2 instances of ESMLoader:

  1. used "internally" during startup (ex to load loaders etc)
  2. used for everything else after

See lines 71–86 below

Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good start... for those of us who aren't involved in the meetings it would be good to have an overview of the design decisions and plans wrt CJS integration.


const keys = ObjectKeys(implementation);
for (let t = 0; t < keys.length; ++t)
copy[keys[t]] = FunctionPrototypeBind(implementation[keys[t]], null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps make the implementation a class so this is just instantiation?

One benefit of that would be treating the realpathcache as an instance property so that would give it a proper lifetime.

That does mean we have to think a bit more carefully about the lifetimes of the hook though too - this function is only called once though currently right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at that, I think an object should be preferred over a class: first because we don't really use class features (in particular we can't use this since it'd break destructuring), but more importantly because it'd prevent the default fs functions from being observable via ObjectKeys. Given that it's currently the source of truth for the loader code to know what are the supported fs functions, using a class would require further special casing which I think makes the object form preferable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see pros and cons to both, but re-creating native functionality sounds slightly more on the con-side.

You could create an instance (the default instance / singleton) and get the keys from it:

class FSLoaderUtils { /* … */ }

const defaultFSLoaderUtils = new FSLoaderUtils();
const defaultFSLoaderUtilNames = ObjectKeys(defaultFSLoaderUtils);

function defaultGetFSLoaderUtils() {
  return {
    defaultFSLoaderUtils,
    defaultFSLoaderUtilNames,
  };
}

@@ -45,6 +49,9 @@ const {
initializeImportMeta
} = require('internal/modules/esm/initialize_import_meta');
const { defaultLoad } = require('internal/modules/esm/load');
const {
defaultGetFileSystem
} = require('internal/modules/esm/get_file_system');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the plan to only use this in the esm/loader.js and not in the CJS loader? That seems like it might limit the applicability of these hooks, especially when CJS interfaces with eg exports resolutions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment the other hooks are only used in the esm code path so I didn't want to change that since it's not required strictly speaking. A followup PR however will be to consider getFileSystem for cjs.


readJsonSync(p) {
return packageJsonReader.read(fileURLToPath(p));
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we can't just have readJson as a well-defined abstraction on top of readFile? Package virtualization can still work by returning serialized JSON - which is likely needed anyway with threading workflows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't make perf analysis on internalReadJson, but I'd intuitively agree that a JS abstraction over readFile (that would just apply a regex to get the containsKey value) would fast enough without requiring a dedicated native function. I can experiment with that. Or do you have something else in mind?

return fs.realpathSync(p, {
[internalFS.realpathCacheKey]: realpathCache
});
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory realpath can be implemented on top just an lstat and stat implementation, although that would be a slightly more complex refactoring of the realpath algorithm. It might well simplify the model though in just having three base-level FS primitive hooks to virtualize.

Copy link
Contributor Author

@arcanis arcanis Dec 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little worried about that - it'd likely be much slower, and the semantic loss could bring unforeseen issues. I'd not feel confident dropping it in this PR 🤔


// If the loader specifies a sync hook but omits the async one we
// leverage the sync version by default, so that most hook authors
// don't have to write their implementations twice.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make sure to throw though if a specific loader implements an async hook but not a sync hook for the same method?

@arcanis
Copy link
Contributor Author

arcanis commented Dec 3, 2021

Will look at the other comments but to address this one quickly:

for those of us who aren't involved in the meetings it would be good to have an overview of the design decisions and plans wrt CJS integration.

There is no plan at the moment (I don't think I even mentioned it in a meeting yet), except that I plan to plan something 😄

My motivation would be to fix #33423 one way or another, which I think could be made easy with the getFileSystem hook in this PR. When/if this PR lands, I'll start investigating this topic and open a thread with my findinds in the Loaders repo.

Copy link
Member

@JakobJingleheimer JakobJingleheimer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble seeing how this will work once chaining is added—I expect it to have the same problem as one of the other related proposals: Without some kind of map, one loader could/likely will pull the rug out from under others.

  • loaderA has getFileSystem, resolve, and load
  • loaderB has getFileSystem, resolve, and load
  • loaderC has resolve and load, and sometimes getFileSystem

loaderA would likely expect its getFileSystem, loaderB would likely expect its getFileSystem, and so on. And they (rightfully) have no idea what any other getFileSystem does.

However, they ALL get the last-wins product of all getFileSystems.

Also, how are these exposed to custom loaders? (They're currently not passed via esmLoader.resolve or esmLoader.load, just Node's default*s)

Or have I missed something?


On a more nitty-gritty note, I think the "build" and "get" naming needs some smithing: nothing is being built in the same sense that people will think, nor is anything retrieved. Perhaps "compose"?

// my-loader.mjs
export function composeFileSystemUtilities(existingUtilities) {}
class ESMLoader {
  composeFSUtils() {}
}

demo/loader-sha512.mjs Show resolved Hide resolved

const realpathCache = new SafeMap();

const internalFS = require('internal/fs/utils');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think fsUtils might be a better name for this. Otherwise it sounds like internal vs external/public versions, but I think the important distinction is that these are utilities.

const { fileURLToPath } = require('url');
const { internalModuleStat } = internalBinding('fs');

const implementation = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this whole thing is the implementation

Suggested change
const implementation = {
const defaultFileSystem = {

But I think this might be better as a class (like Guy suggests)

Comment on lines +28 to +30
async readJson(p) {
return packageJsonReader.read(fileURLToPath(p));
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should this be used to read json files that are not a package.json? If not, I think the name should be more explicit:

Suggested change
async readJson(p) {
return packageJsonReader.read(fileURLToPath(p));
},
async readPackageJson(p) {
return packageJsonReader.read(fileURLToPath(p));
},

@@ -81,6 +88,14 @@ class ESMLoader {
defaultResolve,
];

/**
* @private
* @property {Function[]} resolvers First-in-first-out list of resolver hooks
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @property {Function[]} resolvers First-in-first-out list of resolver hooks
* @property {Function[]} fileSystemBuilders First-in-first-out list of file system utilities compositors

function defaultGetFileSystem(defaultGetFileSystem) {
const copy = ObjectCreate(null);

const keys = ObjectKeys(implementation);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can't/don't change, and they're also used in ESMLoader, so I think this should be moved outside defaultGetFileSystem() and be exported to avoid needlessly re-compiling this list.

buildFileSystem() {
// Note: makes assumptions as to how chaining will work to demonstrate
// the capability; subject to change once chaining's API is finalized.
const fileSystemFactories = [...this.#fileSystemBuilders];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why shallow-clone this list?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't clear which guarantees are made as to whether addCustomLoaders can/may ever be called from userland code, so I prefer to make the code handle that in a consistent way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong, but I believe it is not possible to access esmLoader directly from user-land without something very naughty like --expose-internals. But also, esmLoader#fileSystemBuilders is private, so it's inaccessible anyway.


for (let i = 1; i < fileSystemFactories.length; ++i) {
const currentFileSystem = finalFileSystem;
const fileSystem = fileSystemFactories[i](() => currentFileSystem);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why make this a function that returns the current FS utils rather than just passing the current utils? (Ex this doesn't protect against mutating currentFileSystem)

Copy link
Contributor Author

@arcanis arcanis Dec 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly to match the signature of other hooks, which provide the next hook function rather than their results (granted, they accept inputs whereas this one doesn't, so it could be possible, but by consistency I feel more comfortable keeping the same pattern).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventually this should do a copy operation rather than direct reference to avoid people purposefully doing mutation becoming breaking change worries.

// If the loader specifies a sync hook but omits the async one we
// leverage the sync version by default, so that most hook authors
// don't have to write their implementations twice.
for (let j = 0; j < asyncKeys.length; ++j) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I woulda chosen k for "key" 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a key, though, but the index of a key. I looked at the existing codebase to pick the iterator name and saw many references to i,j

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merely a side comment—I wasn't suggesting a change. Sorry for the confusion. I use k as an index in a set of keys

for (const key of keys) // key = 'foo'

vs

for (let k = 0; k < keys.length; k++) // key = keys[k] = 'foo'

Comment on lines +46 to +47
exports.getFileSystem = () =>
esmLoader.fileSystem;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think that might be because there are 2 instances of ESMLoader:

  1. used "internally" during startup (ex to load loaders etc)
  2. used for everything else after

See lines 71–86 below

@arcanis
Copy link
Contributor Author

arcanis commented Dec 8, 2021

loaderA would likely expect its getFileSystem, loaderB would likely expect its getFileSystem, and so on. And they (rightfully) have no idea what any other getFileSystem does.

They shouldn't expect their own getFileSystem since it's a hook. Sure, it could be a footgun if a loader was implemented with something, say, like this:

export function getFileSystem() {
  // ...
}

export function load() {
  const hookFs = getFileSystem();
  // ...
}

But it'd be a bug easily fixed, and the audience of people using getFileSystem will hopefully be expert enough to understand and address the issue once spotted. In a way, it's the same problem as if they were to call their own resolve hook from within load: nothing prevents it, but should/can it really be prevented?

Also, how are these exposed to custom loaders? (They're currently not passed via esmLoader.resolve or esmLoader.load, just Node's default*s)

I didn't want to pass them directly in this PR because I could see two options:

  • Either they are a parameter of the resolve/load hooks, as you suggest:
    export async function load({hookFs}) {
      const content = await hookFs.readFile('...');
    }
  • Or they are leveraged by "resolution helpers", similar to how the Node resolution functions currently use them:
    import resolution from 'node:resolution';
    
    export async function load() {
      const content = await resolution.readFile('...');
    }

Given that this second option seems to rely on the outcome of discussion around a standard set of resolution helpers, I felt it could be beneficial to defer this capability to a follow-up (so for now, only the Node resolution pipeline leverages those hooks).

@bmeck
Copy link
Member

bmeck commented Dec 8, 2021

I once again think this is a good case to stop shoe horning things into parameters of hooks. These are ambient APIs to call and using them as parameters is just confusing. I much prefer the node:resolution approach to expose all hooks, not just resolution or provide something when a loader is initialized (global or via some initialization hook similar to preload code).

@DerekNonGeneric DerekNonGeneric added the wip Issues and PRs that are still a work in progress. label Dec 11, 2021
@bmeck
Copy link
Member

bmeck commented Dec 13, 2021

Going to read this, but it might get messy when we move the ESM loader off thread. Sorry it might take a while just to think on the topic.


const DATA_URL_PATTERN = /^[^/]+\/[^,;]+(?:[^,]*?)(;base64)?,([\s\S]*)$/;

async function defaultGetSource(url, { format } = {}, defaultGetSource) {
const parsed = new URL(url);
let source;
if (parsed.protocol === 'file:') {
source = await readFileAsync(parsed);
source = await esmLoader.getFileSystem().readFile(parsed);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be preferable to have this passed in as a param instead of grabbing off of the application context esmLoader (there can be multiple Loaders).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case does that happen? It could be a blocker for the node:resolution accessors I mentioned earlier, as it relies on loader utils being singletons.

If that's the case of the "loader loader" that @JakobJingleheimer mentions there, since from what I understand they aren't used at the same time, perhaps the global access would be fine?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I THINK their sequence is fully serial: the node-land esmLoaderNodeland finishes its work and disappears into the void, and then esmLoaderUserland steps in and starts its work.

Aside from Bradley's concern of bloating params, I think it would be better for this to be passed (and also makes it easier to test). One of our Loader team's side/wish-list projects is to make ESMLoader pure, and this goes against that. Not necessarily a show-stopper, but something to mention: If we do decide to make it global and preclude fully pure, we should do it consciously.

@JakobJingleheimer
Copy link
Member

JakobJingleheimer commented Dec 19, 2021

loaderA would likely expect its getFileSystem, loaderB would likely expect its getFileSystem, and so on. And they (rightfully) have no idea what any other getFileSystem does.

They shouldn't expect their own getFileSystem since it's a hook.

[…]

But it'd be a bug easily fixed, and the audience of people using getFileSystem will hopefully be expert enough to understand and address the issue once spotted.

Hmm, if you don't mind humouring me, let's discuss at Tuesday's meeting (cc @GeoffreyBooth): I still don't understand how it would be easily fixed or how it could result in anything but rug-pulling:

For example, let's say:

  • loaderA::getFileSystem replaces some one of the utilities with an implementation that makes a fetch to some remote FS (where, let's say a package.json is found).
  • loaderB::getFileSystem replaces that same utility with an implementation that makes a fetch to a different remote FS.

loaderA is depending on the remote FS's package.json for determining format. That is now broken and it has no idea (it could actually "work" but return an entirely wrong format and break in even more baffling ways). I can't see any way to fix that without loaderA and loaderB knowing about each other or wrapping them in loaderAB that does both (effectively mapping them).

Copy link
Contributor

@aduh95 aduh95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the demo/ to test/? This would also need some doc update I think.

Comment on lines +35 to +36
ERR_UNKNOWN_MODULE_FORMAT,
ERR_LOADER_MISSING_SYNC_FS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: ASCII order

Suggested change
ERR_UNKNOWN_MODULE_FORMAT,
ERR_LOADER_MISSING_SYNC_FS
ERR_LOADER_MISSING_SYNC_FS,
ERR_UNKNOWN_MODULE_FORMAT,

Comment on lines +12 to +13
const fs = require('fs');
const fsPromises = require('internal/fs/promises').exports;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to use destructuring here I think, otherwise monkey-patching of fs module would impact this module.

Comment on lines +267 to +269
fileSystem[asyncKey] = async (...args) => {
return fileSystem[syncKey](...args);
};
Copy link
Contributor

@aduh95 aduh95 Dec 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not make our user expect a promise here, if/when we end up getting rid of the sync methods in this hook, loader hook authors should expect the previous hook may have passed a synchronous function:

Suggested change
fileSystem[asyncKey] = async (...args) => {
return fileSystem[syncKey](...args);
};
fileSystem[asyncKey] = fileSystem[syncKey];

and if we want it to return a promise, we still need to avoid the spread operator on arrays (which relies on globally mutable Array.prototype[Symbol.iterator]):

Suggested change
fileSystem[asyncKey] = async (...args) => {
return fileSystem[syncKey](...args);
};
fileSystem[asyncKey] =
async (...args) => ReflectApply(fileSystem[syncKey], fileSystem, args);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo it'd be best to attempt to keep a consistent API - otherwise loader authors would be effectively forced to do:

const content = await Promise.resolve(fs.readFile(path));

The extra Promise.resolve wrapper wouldn't look very idiomatic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another solution is perhaps to just drop this "automatically bind sync functions into async slots", since it comes with its own drawbacks (it can be seen as a footgun, since it makes all async filesystem operations silently turn sync).

I'm not that attached to it, so dropping it would be fine by me.

Copy link
Contributor

@aduh95 aduh95 Dec 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise loader authors would be effectively forced to do:

I don't think that this is correct, you can await non-Promise objects:

console.log({}); // {}
console.log(await {}); // {}
console.log(await Promise.resolve({})); // {}

I don't feel strongly about this, but imho it wouldn't be too bad as an API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes: being able to await a non-promise was explicitly for this scenario (can find the citation if needed).

Comment on lines +249 to +251
typeof url === 'string' ?
pathToFileURL(url) :
url
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conversion path -> URL -> path here is unfortunate :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed - I figured consistency would be important, but we could specify that the hook paths are either URLs or regular strings. I'm however a bit worried it'd just push to userland the responsibility of calling pathToFileURL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're A LOT of path → URL → path in ESMLoader/hooks.

I was thinking to propose anywhere a url currently is (within ESMLoader/hooks), it be replaced with a URL instance.

@GeoffreyBooth GeoffreyBooth added loaders Issues and PRs related to ES module loaders loaders-agenda Issues and PRs to discuss during the meetings of the Loaders team labels Dec 21, 2021
@GeoffreyBooth
Copy link
Member

Thank you for doing this, and sorry for not reviewing this sooner. This is very impressive work.

I have questions about use cases and when to use these hooks versus the load hook etc., but I think nodejs/loaders#44 is probably the better place for that discussion, to keep this thread focused on this PR. But in broad strokes we need to figure out how these new hooks (in my mind, each filesystem method replaced by getFileSystem is essentially its own hook) interact with the existing hooks and with the chaining design. So before we dig deeper in finer details of this PR's implementation, we should update https://github.com/nodejs/loaders/blob/main/doc/design/overview.md to add these hooks and explain how they fit in, and how they can be chained. A PR against that file (or against the loaders repo generally, to add a new design doc file) would be one way to hash all that out.

I didn’t write tests or documentation yet

I understand the desire to dive into the code and get something working, and present a proof-of-concept. You’ve certainly achieved that. I found that for many of the earlier loaders PRs, the most fruitful discussion took place against the proposed docs of a new PR; so much so that I’d suggest any new loaders-related PR start with the docs for whatever’s being added, as a way of discussing what the proposed API will be; and then once that’s nailed down, the PR just implements it. Of course the API will change somewhat when the code is finally written, but you’ll be working from a place where you’ve already achieved consensus that we do want roughly these changes and to solve whatever use cases are in the examples written in the new docs. In this case the ambitions are so great that I think we should start a step further back, with design docs in the loaders repo, and then we can proceed to a PR with docs as intended in any final PR. It may very well look just like what this PR contains; I’m not trying to suggest otherwise. But I think it would make this easier to review and would help ensure that parallel efforts fit together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lib / src Issues and PRs related to general changes in the lib or src directory. loaders Issues and PRs related to ES module loaders needs-ci PRs that need a full CI run. wip Issues and PRs that are still a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

internalModuleStat & internalModuleReadJSON prevent resolution override
9 participants