Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

esm: do not lazy instantiate loaders to avoid infinite loop #47599

Closed
wants to merge 4 commits into from

Conversation

aduh95
Copy link
Contributor

@aduh95 aduh95 commented Apr 17, 2023

There are still failing tests to address.

Fixes: #47566

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/loaders
  • @nodejs/modules

@nodejs-github-bot nodejs-github-bot added esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. process Issues and PRs related to the process subsystem. labels Apr 17, 2023
@GeoffreyBooth GeoffreyBooth changed the title esm: do not lazy instanciate loaders to avoid infinite loop esm: do not lazy instantiate loaders to avoid infinite loop Apr 17, 2023
@aduh95 aduh95 force-pushed the esm-infinite-loop branch 2 times, most recently from ce48cc0 to efe037e Compare April 18, 2023 16:42
@aduh95 aduh95 marked this pull request as ready for review April 18, 2023 16:42
Comment on lines +96 to +100
'NativeModule internal/modules/esm/loader',
'NativeModule internal/process/esm_loader',
'NativeModule internal/modules/esm/assert',
'NativeModule internal/modules/esm/module_map',
'NativeModule internal/modules/esm/translators',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we OK with this change? /cc @joyeecheung

Copy link
Member

@joyeecheung joyeecheung Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the description in the issue:

I think this is happening because the loader is going to try to spawn a worker thread from within the worker thread, which is also going to do the same thing, and that's how we end up with Mr Meeseeks situation until the system is out of memory or the process terminates. I'm working on a fix.

I am not very convinced that this is the right fix. It seems some refactoring or synchronization should be done to avoid this instead. The ESM loader is full of circular dependencies and weird TDZs so I am not quite surprised that this happens, as that has already resulted in a deadlock in that off-thread PR before. I don't think eager loading actually make those problematic dependencies go away, and I suspect some other internal code changes that alter the dependency in internal/modules/esm could re-introduce this again even if we eager-initialize them in the outmost main scripts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that off-thread has landed I would be happy to combine some of the ESM files to make them snapshotable. I assume the goal is to add as much of the ESM loader as possible to the snapshot? I think it’s safe to say that ESM code is common enough at this point that most projects are likely to have at least some ESM code loaded between the main app and/or its dependencies.

Not that we should do this refactor in this PR, but I’m assuming it’s something we’d like to do?

Copy link
Member

@joyeecheung joyeecheung Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that might just happen to be an alternative fix to the bug. Regardless of whether that's the right fix, I still don't think what this PR does the right fix - there are several comments in the loader assuming that there can only be only worker being spawned. If the code in the loader implementation is not enough to guard against infinite worker spawning, then it's a bug in the loader implementation (which I suspect is a result of circular dependency + TDZ invalidating the guards). The loader had been lazy-loadable without infinite worker spawning, the fact that it now does is a regression of the loader implementation. Eager-loading the loader only sweeps the bug even further deeper under the rug.

To quote myself from 4 months ago: #45828 (comment)

This refactoring is just reduces the technical debt (circular dependencies, TDZs, synchronous side-effect/runtime state queries at module loading time etc.) in the code case. The sooner it lands, the less likely that future code have to pile more technical debt onto the codebase (unless they repeat the refactoring themselves). #43772 is fine in that regard, as it is not adding any more of those, so it would not be difficult for that PR to rebase onto this or the other way around. But the current code in #44710 is still doing those things, I'd argue that code that does those stuff really should not land in the code base in the first place. We should not count on future refactoring to pay off the technical debt, or we still end up with the minefield we have now in lib/internal/modules/esm.

And I suspect that we are getting hit by the technical debt again and we are trying to run away from actually cleaning up the minefield again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what fix do you recommend @joyeecheung? If I understand you correctly, you agree that adding more of the ESM loader into the snapshot is a fix and is also something we want to do anyway—but you think that this particular bug should be fixed first before taking that step, to avoid sweeping this bug under the rug?

I think what @aduh95 is aiming for here is to try to be explicit about when we spawn the worker thread. Maybe we get that right first and then deal with the snapshotting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you agree that adding more of the ESM loader into the snapshot is a fix

That wasn't what I was saying. I was suggesting that this could be caused by circular dependencies in the ESM loader, which is also a blocker of the snapshotability of the ESM (there are other blockers).

I think what @aduh95 is aiming for here is to try to be explicit about when we spawn the worker thread. Maybe we get that right first and then deal with the snapshotting?

My point is that I am not convinced getting this right requires eager-loading the ESM loader. The assumption that fixing this requires eager-loading the ESM loader might be as incorrect as the assumption that supporting import() in CJS requires eager-loading the ESM loader. The code here only moves the timing of which the entire ESM loader (custom and/or default) is created, unconditional to wether the graph involves ESM or not. I don't think that's explicit about when the worker thread should be spawned, the worker spawning code is still buried within the hooks creation code which is buried in the loader code, so it's still implicit when the worker is actually created.

Eager-loading the ESM loader leads to an unnecessary penalty to startup performance of CJS-only graphs and I don't think we should do that until the ESM loader can be snapshotted (to reduce the performance penalty). So if we have to go down the eager-loading ESM loader path, I would say including them into the snapshot is a blocker - but then I don't think that's the only path. As I said before, this more likely requires synchronization within the loader code itself to avoid the infinite spawning. What this PR currently does does not seem to introduce synchronization to the loader code, it only works around the bug by creating the loader before anything else, which isn't robust.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point taken, I still think the current implementation that lazy instantiate the loader is a mistake, I'll try to see if I can find another way without introducing non-snapshotable module in the bootstrap path. Help welcome if someone has another implementation candidate.

Copy link
Member

@joyeecheung joyeecheung Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I can tell by just looking at the code (without being able to reproduce this locally), the suspicious code is:

  1. In the internal loader worker, the custom loaders are handled in initializeHooks(), called by the internal worker script
    const customLoaderPaths = getOptionValue('--experimental-loader');
  2. Then again in the internal worker, some code could attempt to invoke createModuleLoader() again if they access require('internal/process/esm_loader').esmLoader (e.g. for handling dynamic import)
    const userLoaderPaths = getOptionValue('--experimental-loader');
    , which has no idea that in this worker, the custom loaders are already handled, and it just creates an unnecessary CustomizedModuleLoader which creates more workers.

I think the proper synchronization should be done in createModuleLoader instead to make sure that when it's invoked within a internal loader worker - it can know that by just looking at some variable set by the worker main script, or the hooks proxy can use a different internal argument (instead of reusing --experimental-loader) to pass the loaders to the worker - it does not create that worker by constructing a CustomizedModuleLoader again. From what I can tell it probably should just return the default loader in the internal loader worker. I am not quite sure what this PR is currently doing but it looks like it's trying to achieve the same thing in a rather convoluted way - by introducing an intricate initialization order that somehow ends up leading to a mocked version of the loader in the esmLoader ??= statement when it's accessed within the internal worker. Which is...not very robust, and I doubt if the mock actually works for dynamic imports initiated in the custom loaders. Also from what I can tell I am not convinced this fix really needs eager-loading ESM loader, that looks like a by-product of the introduction of the intricate initialization order, which doesn't seem to be the right fix in the first place.

Copy link
Member

@joyeecheung joyeecheung Apr 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have an alternate fix here: #47620 with a test that locally reproduces the bug for me. I am not sure if I am supposed to invest energy to push it forward soon (still recovering from COVID), I think I should keep myself away from the computer for the day :3 feel free to take over if I disappear from GitHub.

@aduh95 aduh95 force-pushed the esm-infinite-loop branch from efe037e to 758fd93 Compare April 18, 2023 19:45
@aduh95 aduh95 added the request-ci Add this label to start a Jenkins CI on a PR. label Apr 18, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Apr 18, 2023
@nodejs-github-bot
Copy link
Collaborator

}, importRequests: [] };
},
initIfNeeded() {
// TODO: we could try to avoid loading ESM loader on CJS-only codebase
Copy link
Member

@joyeecheung joyeecheung Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should initialize the ESM loader in CJS-only code base. At the very least the ESM internals must be snapshotable before we consider this (I suspect if we somehow make that happen, e.g. #45828, which requires resolving some of the the problematic circular dependencies, the bug here already might go away..)

Copy link
Contributor Author

@aduh95 aduh95 Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to, because dynamic import() uses ESM loader. As I said in the TODO comment, we could instantiate the ESM loader only on the first dynamic import.

Copy link
Member

@joyeecheung joyeecheung Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could instantiate the ESM loader only on the first dynamic import.

We have already been doing it in the main branch:

const cascadedLoader = getCascadedLoader();
this would only be a TODO if we first regress by eager loading the ESM loader.

Copy link
Member

@JakobJingleheimer JakobJingleheimer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At one point we had talking about handling all loading via ESMLoader; might that be the solution here?


let esmLoader;
let init = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let init = false;
let initialized = false;

return module.exports.init();
},
init(loader = undefined) {
assert(!init);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't this throw? (I would think if (init) return)

Comment on lines +30 to +34
initIfNeeded() {
// TODO: we could try to avoid loading ESM loader on CJS-only codebase
return module.exports.init();
},
init(loader = undefined) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit convoluted, and not clear when which pieces should be called. Could it be baked into the esmLoader getter?


module.exports = {
get esmLoader() {
return esmLoader ??= createModuleLoader(true);
return esmLoader ??= { __proto__: null, cjsCache: new SafeMap(), import() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cjsCache? What is this? Is this trying to track the callers that led to the creation of this ESM loader?

@@ -195,4 +195,43 @@ describe('Loader hooks', { concurrency: true }, () => {
assert.strictEqual(code, 0);
assert.strictEqual(signal, null);
});

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s add a test for the simplest case, ./node --loader ./test/fixtures/empty.js.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests that deal with REPL are simply not convenient and hard to not have flaky. Let's not do that?

Comment on lines +96 to +100
'NativeModule internal/modules/esm/loader',
'NativeModule internal/process/esm_loader',
'NativeModule internal/modules/esm/assert',
'NativeModule internal/modules/esm/module_map',
'NativeModule internal/modules/esm/translators',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that off-thread has landed I would be happy to combine some of the ESM files to make them snapshotable. I assume the goal is to add as much of the ESM loader as possible to the snapshot? I think it’s safe to say that ESM code is common enough at this point that most projects are likely to have at least some ESM code loaded between the main app and/or its dependencies.

Not that we should do this refactor in this PR, but I’m assuming it’s something we’d like to do?

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 20, 2023

Superseded by #47620.

@aduh95 aduh95 closed this Apr 20, 2023
@aduh95 aduh95 deleted the esm-infinite-loop branch April 20, 2023 07:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
esm Issues and PRs related to the ECMAScript Modules implementation. needs-ci PRs that need a full CI run. process Issues and PRs related to the process subsystem.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using custom loaders results in a load loop / OOM
5 participants