import hook incompatible with tsx #12011

Closed
3 tasks done
nwalters512 opened this issue May 13, 2024 · 28 comments · Fixed by #12388
Labels
Package: node Issues related to the Sentry Node SDK Type: Bug

Comments

@nwalters512
Contributor

Is there an existing issue for this?

How do you use Sentry?

Sentry Saas (sentry.io)

Which SDK are you using?

@sentry/node

SDK Version

8.0.0

Framework Version

No response

Link to Sentry event

No response

SDK Setup

Sentry.init({
  dsn: 'https://examplePublicKey@o0.ingest.sentry.io/0',
});

Steps to Reproduce

I've created a minimal reproduction here: https://github.com/nwalters512/sentry-v8-tsx-error-repro

  1. Clone the repository
  2. Install dependencies with yarn
  3. Run yarn tsx src/index.ts
  4. Observe that the process fails with the error TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file

Several observations that are hopefully helpful to y'all:

  • This only occurs when "allowJs": true is set in tsconfig.json. As Sentry isn't concerned with this, that made me think this problem might actually be in tsx. However...

  • When adding instrumentation directly with the @opentelemetry/* packages, everything works fine. This is reproducible by making the following change to src/index.ts:

    -import './instrument-sentry.js';
    +import './instrument-opentelemetry.js';

    Given this, I'm strongly inclined to believe that this is an issue with the way in which Sentry is using OpenTelemetry.

  • This only breaks for core Node modules like util, fs, etc. Importing other modules works fine. For instance, the following change to src/index.ts makes it work without erroring:

    -await import('util');
    +await import('zod');
  • This only breaks for dynamic imports. For instance, the following change to src/index.ts makes it work without erroring:

    -await import('util');
    +import 'util';

Expected Result

I would expect the process to complete successfully and log util imported!.

Actual Result

The process errors out while importing util and does not print util imported!.

@timfish
Collaborator

timfish commented May 14, 2024

The full stack trace is:

TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
    at new NodeError (node:internal/errors:406:5)
    at fileURLToPath (node:internal/url:1393:11)
    at finalizeResolution (node:internal/modules/esm/resolve:234:42)
    at moduleResolve (node:internal/modules/esm/resolve:845:10)
    at defaultResolve (node:internal/modules/esm/resolve:1043:11)
    at nextResolve (node:internal/modules/esm/hooks:833:28)
    at y (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:2079)
    at j (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:3198)
    at nextResolve (node:internal/modules/esm/hooks:833:28)
    at resolve (/Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/@opentelemetry/instrumentation/node_modules/import-in-the-middle/hook.js:238:23) {
  code: 'ERR_INVALID_URL_SCHEME'
}

And here is the bottom of the call stack.

Maybe the loader hook added by import-in-the-middle is somehow clashing with the loader hooks added by tsx?

@timfish
Collaborator

timfish commented May 14, 2024

Looking at the stack trace, although import-in-the-middle is at the bottom, further up we see frames from tsx:

    at y (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:2079)
    at j (file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/tsx/dist/esm/index.mjs?1715684968701:2:3198)

I guess it's not handling the query params that iitm adds to the URL?

So I've opened an issue there:
privatenumber/tsx#554

@timfish changed the title from "Sentry breaks dynamic import of core Node modules when running in tsx" to "import hook incompatible with tsx" on May 14, 2024
@nwalters512
Contributor Author

@timfish thanks for investigating this; your small reproduction that used just import-in-the-middle was very helpful. I'm relatively confident that the issue here is in fact with import-in-the-middle and the fact that it appends query parameters to the URL it returns from its resolvers. That is, consider this line:

https://github.com/DataDog/import-in-the-middle/blob/00b01fff1f5b69dd25e307593ec54d1d8abb4844/hook.js#L259

Changing that to just url: url.url makes your reproduction complete without error (though I have no idea what effect that has on import-in-the-middle actually working).

Specifically, it looks like the issue is that this URL with ?iitm=true ends up as the parentURL in a subsequent resolution, which you can see by adding console.log(specifier, context) immediately after https://github.com/privatenumber/tsx/blob/e2afc60bbcc299e09a5bf0e0c8909b6766b633a2/src/esm/hook/resolve.ts#L39:

file:///Users/nathan/git/tsx/node_modules/.pnpm/import-in-the-middle@1.7.4/node_modules/import-in-the-middle/lib/register.ts {
  conditions: [ 'node', 'import', 'node-addons' ],
  importAttributes: {},
  parentURL: 'node:util?iitm=true'
}
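
For anyone following along, the marker mechanism can be pictured roughly as follows. This is a minimal sketch, not import-in-the-middle's actual code: `tagUrl` and `untagUrl` are hypothetical names, and the only detail taken from this thread is that an `iitm=true` query parameter ends up on the URL returned by the resolver.

```javascript
// Hypothetical helpers (names are assumptions, not from import-in-the-middle).
// They add and strip a marker query parameter using the global WHATWG URL class.
function tagUrl(url, marker = 'iitm') {
  const parsed = new URL(url);
  parsed.searchParams.set(marker, 'true');
  return parsed.href;
}

function untagUrl(url, marker = 'iitm') {
  const parsed = new URL(url);
  parsed.searchParams.delete(marker);
  return parsed.href;
}

// Works for file:// URLs and for builtin specifiers such as node:util.
console.log(tagUrl('node:util'));
console.log(untagUrl(tagUrl('file:///tmp/example.js')));
```

The tagged value is still parseable as a URL, which is why the tag alone does not explain the failure; what matters is where that tagged URL later shows up.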

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 May 14, 2024
@timfish
Collaborator

timfish commented May 14, 2024

Tracing how the OpenTelemetry hook resolves modules, it looks like ?iitm=true is used so that iitm knows a resolution is coming from its own hook rather than from the original location.

Adding the query to the parent URL should be fine as this is still a valid URL.

For example, the following code runs fine under both node and tsx:

import { register } from "node:module";

register(
  new URL(
    `data:application/javascript;base64,${Buffer.from(
      `
    export async function resolve(specifier, context, nextResolve) {
      if (context.parentURL) {
        context.parentURL += "?some=query";
      }
      console.log("resolve", specifier, context);
      return nextResolve(specifier, context);
    }
`
    ).toString("base64")}`
  ),
  import.meta.url
);

await import("node:util");

and outputs:

resolve node:util {
  conditions: [ 'node', 'import', 'node-addons' ],
  importAssertions: {},
  parentURL: 'file:///Users/tim/Documents/repro/test.ts?some=query'
}

node:util?iitm=true is also a valid URL so I'm not fully convinced that's causing the issue yet!

@timfish
Collaborator

timfish commented May 14, 2024

Changing that to just url: url.url makes your reproduction complete without error (though I have no idea what effect that has on import-in-the-middle actually working).

Ah, I just re-read this and it makes sense that this is the cause.

@nwalters512
Contributor Author

I think you may have just realized this, but I'll finish this comment just so we're both on the same page. Your example is subtly different from what's happening in the failure case. In your example, parentURL is indeed still a valid file:// URL. However, in the failure case, it is not. This was easy to prove to myself:

const { fileURLToPath } = require('url');
fileURLToPath('node:util?foo=bar');

The above errors out exactly the same as the failure case with import-in-the-middle:

Uncaught TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
    at fileURLToPath (node:internal/url:1463:11) {
  code: 'ERR_INVALID_URL_SCHEME'
}

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 May 14, 2024
@timfish
Collaborator

timfish commented May 14, 2024

You're right, but fileURLToPath('node:util') also throws with the same error!

I think the issue isn't the query string. It's more that import-in-the-middle results in the parentURL being a Node built-in, which I guess should never normally be possible because a built-in is not a file:// URL.
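
A quick way to check this side by side is the sketch below. `fileUrlErrorCode` is a hypothetical helper written for this comment, and the file:// paths assume a POSIX system; the only real API used is `fileURLToPath` from `node:url`.

```javascript
const { fileURLToPath } = require('node:url');

// Hypothetical helper: returns the error code thrown by fileURLToPath,
// or null when the conversion succeeds.
function fileUrlErrorCode(url) {
  try {
    fileURLToPath(url);
    return null;
  } catch (err) {
    return err.code;
  }
}

// Example inputs (the file:// paths are illustrative and assume POSIX).
const candidates = [
  'file:///tmp/test.ts',
  'file:///tmp/test.ts?some=query',
  'node:util',
  'node:util?iitm=true',
];

for (const url of candidates) {
  console.log(url, '->', fileUrlErrorCode(url) ?? 'ok');
}
```

The query string makes no difference in either direction: what decides the outcome is the scheme of the URL.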

@timfish
Collaborator

timfish commented May 14, 2024

For example, when the minimal reproduction is run through plain old Node:

import { register } from "node:module";

register("import-in-the-middle/hook.mjs", import.meta.url);
await import("node:util");

The iitm resolve function with this added:

    console.log('resolve', [specifier, context.parentURL])

Results in:

resolve [
  'node:util',
  'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/test.js'
]
resolve [
  'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/import-in-the-middle/lib/register.js',
  'node:util?iitm=true'
]
resolve [ 
  'node:util?iitm=true', 
  'node:util?iitm=true' 
]

@nwalters512
Contributor Author

All great observations! I managed to make a very isolated reproduction: https://github.com/nwalters512/register-hook-playground

It uses neither tsx nor import-in-the-middle; it copies the bare minimum amount of behavior from import-in-the-middle to be able to reproduce:

node:internal/modules/run_main:129
    triggerUncaughtException(
    ^
TypeError [ERR_INVALID_URL_SCHEME]: The URL must be of scheme file
    at fileURLToPath (node:internal/url:1463:11)
    at finalizeResolution (node:internal/modules/esm/resolve:266:42)
    at moduleResolve (node:internal/modules/esm/resolve:933:10)
    at defaultResolve (node:internal/modules/esm/resolve:1157:11)
    at nextResolve (node:internal/modules/esm/hooks:866:28)
    at resolve (file:///Users/nathan/git/register-hook-playground/hook2.mjs:4:21)
    at nextResolve (node:internal/modules/esm/hooks:866:28)
    at Hooks.resolve (node:internal/modules/esm/hooks:304:30)
    at MessagePort.handleMessage (node:internal/modules/esm/worker:196:24)
    at [nodejs.internal.kHybridDispatch] (node:internal/event_target:825:20) {
  code: 'ERR_INVALID_URL_SCHEME'
}

It turns out that the query params were a red herring of sorts. Their presence actually triggers another behavior in import-in-the-middle that produces synthetic source code for the module being loaded (in this case, node:util?iitm=true):

https://github.com/DataDog/import-in-the-middle/blob/00b01fff1f5b69dd25e307593ec54d1d8abb4844/hook.js#L266

This puts us in a situation that Node would normally never expect to encounter: parentURL refers to a builtin (node:util). You can reproduce this even more simply with the following hook code (in my repro as hook-simpler.mjs):

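export async function resolve(specifier, context, parentResolve) {
  // Force the failing combination: a file:// specifier that does not exist,
  // resolved from a parent that is a Node builtin.
  specifier = 'file:///does-not-exist.mjs';
  context.parentURL = 'node:util';
  // On the Node versions discussed here, this call throws ERR_INVALID_URL_SCHEME.
  return parentResolve(specifier, context);
}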

This is obviously very contrived, but it reproduces the problem very simply: Node's parentResolve chokes when specifier is a file:// URL and context.parentURL is not a file:// URL. This wasn't reproducible in your above example because Node seems to short-circuit if specifier is a core module; it never even tries to use context.parentURL in that case.
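
To make that ordering concrete, here is a toy model. It is emphatically not Node's resolver: `toyResolve` is a hypothetical function, and the assumption that the parent is only converted to a path when the target file is missing comes from the `import-in-the-middle` commit message quoted further down this thread. Paths assume a POSIX system.

```javascript
const fs = require('node:fs');
const { isBuiltin } = require('node:module');
const { fileURLToPath } = require('node:url');

// Toy model only (an assumption-laden sketch, NOT Node's actual resolver).
function toyResolve(specifier, parentURL) {
  // Builtins return immediately, so parentURL is never inspected.
  if (isBuiltin(specifier)) return { url: specifier };

  // Existing files also resolve without touching parentURL.
  const targetPath = fileURLToPath(specifier);
  if (fs.existsSync(targetPath)) return { url: specifier };

  // Building the "not found" message converts the parent to a path,
  // which throws ERR_INVALID_URL_SCHEME when the parent is a builtin.
  const parentPath = fileURLToPath(parentURL);
  throw new Error(`Cannot find module '${targetPath}' imported from ${parentPath}`);
}

// A builtin specifier resolves even though the parent is not a file:// URL.
console.log(toyResolve('node:util', 'node:util?iitm=true'));
```

In this model, a missing file combined with a builtin parent surfaces the scheme error instead of the expected "not found" error, which matches what the contrived hook triggers.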

I'm feeling pretty confident that this is a bug in import-in-the-middle, as that's the one that ends up generating synthetic code for node:util that ends up trying to import some file:// thing. Do you want me to move this issue there?

@timfish
Collaborator

timfish commented May 14, 2024

What I don't yet fully understand is why iitm works with Node as it's using the same parentURLs.

For example these are the logs from resolve in import-in-the-middle calling down to the default resolver:

defaultResolve [
  'node:util',
  {
    conditions: [ 'node', 'import', 'node-addons' ],
    importAttributes: {},
    parentURL: 'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/test.js'
  }
]
result [Object: null prototype] { url: 'node:util' }
defaultResolve [
  'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/import-in-the-middle/lib/register.js',
  {
    conditions: [ 'node', 'import', 'node-addons' ],
    importAttributes: {},
    parentURL: 'node:util?iitm=true'
  }
]
result [Object: null prototype] {
  url: 'file:///Users/tim/Documents/Repositories/sentry-v8-tsx-error-repro/node_modules/import-in-the-middle/lib/register.js',
  format: 'commonjs'
}
defaultResolve [
  'node:util?iitm=true',
  {
    conditions: [ 'node', 'import', 'node-addons' ],
    importAttributes: {},
    parentURL: 'node:util?iitm=true'
  }
]
result [Object: null prototype] { url: 'node:util' }

It appears to handle the node: URLs without error!

From the stack trace, we can see that when tsx is involved, it's called after iitm, i.e. iitm > tsx > default.

However, my console logs from the tsx resolve hook aren't being printed before the exception, so it's really hard to see if/how the parameters are being modified before they hit the default resolver.

My best guess is that they're not getting flushed from the loader hooks thread and there's no obvious way to debug it otherwise.

@timfish
Collaborator

timfish commented May 21, 2024

Maybe it would be a good idea for Sentry to fork iitm and launch it as their own package

The plan is for it to move to the Node org:
nodejs/admin#858

@lilouartz

Maybe it would be a good idea for Sentry to fork iitm and launch it as their own package

The plan is for it to move to the Node org: nodejs/admin#858

Node org is equally slow to make releases.

If this is going to be a bottleneck to Sentry v8 adoption, you want to have more control over it.

It is a small enough package that it makes little difference even if multiple versions of it are maintained, i.e. a copy can be incubated under Sentry and if Node.js wants to merge upstream, let them do it.

@getsantry getsantry bot moved this to Waiting for: Product Owner in GitHub Issues with 👀 3 May 21, 2024
@mydea
Member

mydea commented May 21, 2024

Maybe it would be a good idea for Sentry to fork iitm and launch it as their own package

The plan is for it to move to the Node org: nodejs/admin#858

Node org is equally slow to make releases.

If this is going to be a bottleneck to Sentry v8 adoption, you want to have more control over it.

It is a small enough package that it makes little difference even if multiple versions of it are maintained, i.e. a copy can be incubated under Sentry and if Node.js wants to merge upstream, let them do it.

Normally, I'd agree, it would be nice if we could fork it. But sadly, in this case we can't do this because import-in-the-middle is a dependency of all the other opentelemetry instrumentation, which would not use our fork :( So we need to make fixes upstream there and live with the slower cadence 😬

@timfish

This comment was marked as outdated.

@nwalters512
Contributor Author

@timfish what ultimately fixed things for me was actually #12043; we don't use performance instrumentation, so the error went away when Sentry stopped trying to unnecessarily instrument modules. This is of course not a general solution, and yours might very well be a reasonable bandaid for anyone who is using both tsx and Sentry tracing while we wait for import-in-the-middle maintainers to do their thing!

@timfish
Collaborator

timfish commented May 24, 2024

There are multiple outstanding PRs for import-in-the-middle (including the one from @nwalters512 🙏) that combined hopefully fix a wide range of issues.

Until they make it to a release, I've combined these into a patch that can be used with patch-package:
#12242 (comment)

If anyone can confirm that this fixes their issues that would be great!

@AbhiPrasad AbhiPrasad added this to the v8 Instrumentation Bugs milestone May 27, 2024
bengl pushed a commit to nodejs/import-in-the-middle that referenced this issue May 31, 2024
This PR fixes the issue that's been described at length in the following
issues:

- getsentry/sentry-javascript#12011
- nodejs/node#52987

To summarize, `import-in-the-middle` can cause problems in the following
situation:

- A Node core module (`node:*`) is imported
- Another loader is added before `import-in-the-middle`
- That loader tries to resolve files that don't exist on disk

In practice, this occurs when using `import-in-the-middle` with code
that's being executed with
[`tsx`](https://github.com/privatenumber/tsx). `tsx` will try to resolve
`.js` files by also looking for the same file path but with a `.ts`
extension. In many cases, including the synthetic code that
`import-in-the-middle` generates to import `lib/register.js`, such a
corresponding `.ts` file does not exist.

The actual error arises from Node, which assumes that `parentURL` will
always be a `file://` URL when constructing an `ERR_MODULE_NOT_FOUND`
error; see the linked issue above. In the above scenario, the `.ts` file
that is being resolved does not exist, so such an error is created, and
`parentURL === 'node:*'`, so the failing case is triggered. It seems
like Node is receptive to changing that behavior, but in the meantime, I
was hoping to land this patch so that one doesn't have to wait for and
upgrade to a new version of Node.

The fix works by short-circuiting the resolution of `lib/register.js` so
that the other loader (that tries to resolve non-existent paths) is
never tried.
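
The short-circuiting idea can be sketched as below. This is an illustration under stated assumptions and not the actual patch: `REGISTER_URL` is a hypothetical, hard-coded location, and the `commonjs` format is taken from the resolver logs earlier in this thread. The `shortCircuit` flag is part of Node's documented hook return value and signals that skipping `nextResolve` is intentional.

```javascript
// Hypothetical location of the helper module that the synthetic code imports.
const REGISTER_URL =
  'file:///app/node_modules/import-in-the-middle/lib/register.js';

// Sketch of a resolve hook. In a real hook module this function would be
// exported so that Node can pick it up after register() is called.
async function resolve(specifier, context, nextResolve) {
  if (specifier === REGISTER_URL) {
    // Answer directly so that loaders further down the chain never see
    // this specifier (and never probe for a matching .ts file).
    return { url: REGISTER_URL, format: 'commonjs', shortCircuit: true };
  }
  // Everything else goes through the rest of the chain unchanged.
  return nextResolve(specifier, context);
}

module.exports = { resolve };
```

Because the hook answers before delegating, the combination of a builtin `parentURL` and a non-existent candidate path never reaches the default resolver for this one specifier.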
@lilouartz

Does the fact that this PR is closed mean that Sentry is now compatible with tsx?

@AbhiPrasad
Member

Issue #12357 is still open, but this specific issue is fixed.

Just needs to be released - that will come out tomorrow most likely.

@mydea
Member

mydea commented Jun 7, 2024

Hello, we've just released v8.8.0, which should hopefully resolve this ESM problem. Let us know if you updated and are still running into problems!

@timfish
Collaborator

timfish commented Jun 7, 2024

This issue might be fixed, but unfortunately tsx always adds its loaders before others passed via --import, so it's still not compatible:
#12357
privatenumber/tsx#571

import-in-the-middle parses the loaded code to find the imports and with the current order of the loaders, import-in-the-middle is passed TypeScript code which it cannot parse.

billyvg pushed a commit that referenced this issue Jun 10, 2024
resolves #12242
(although there are still some follow ups)

https://github.com/open-telemetry/opentelemetry-js/releases/tag/v1.25.0

I think this lockfile looks correct, but lmk if this feels off.

resolves #12011
resolves #12059
resolves #12154
resolves #12237
resolves nodejs/import-in-the-middle#77

cc @mohd-akram