Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

repl is loaded unconditionally, preventing uncaught exception capture #2024

Closed
isaacs opened this issue May 22, 2023 · 7 comments · Fixed by #2025
Closed

repl is loaded unconditionally, preventing uncaught exception capture #2024

isaacs opened this issue May 22, 2023 · 7 comments · Fixed by #2025

Comments

@isaacs
Copy link
Contributor

isaacs commented May 22, 2023

Search Terms

  • repl
  • domain
  • node:domain
  • ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE
  • setUncaughtExceptionCaptureCallback

Expected Behavior

Should be able to use the process.setUncaughtExceptionCaptureCallback() method in programs loaded using the ts-node/esm loader.

Actual Behavior

Use of process.setUncaughtExceptionCaptureCallback() will trigger the following error in ts-node programs if an uncaught exception occurs, making it impossible to use node's officially sanctioned global error handling mechanism. Rather than reporting the error that was thrown, you get a much less useful crash report like this:

Error [ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE]: The `domain` module is in use, which is mutually exclusive with calling process.setUncaughtExceptionCaptureCallback()
    at __node_internal_captureLargerStackTrace (node:internal/errors:490:5)
    at new NodeError (node:internal/errors:399:5)
    at process.setUncaughtExceptionCaptureCallback (node:domain:134:15)
    at process.domainErrorHandler (file:///Users/isaacs/dev/tapjs/tapjs/node_modules/async-hook-domain/src/index.ts:140:12)
    at process.emit (node:events:513:28)
    at process.emit (node:domain:489:12)
    at process.emit.sharedData.processEmitHook.installedValue (/Users/isaacs/dev/tapjs/tapjs/node_modules/@cspotcode/source-map-support/source-map-support.js:745:40)
    at SignalExit.#processEmit (/Users/isaacs/dev/tapjs/tapjs/node_modules/signal-exit/src/index.ts:286:17)
    at process.#process.emit (/Users/isaacs/dev/tapjs/tapjs/node_modules/signal-exit/src/index.ts:224:31)
    at SignalExit.#processEmit (file:///Users/isaacs/dev/tapjs/tapjs/node_modules/signal-exit/src/index.ts:286:17)
    at process.#process.emit (file:///Users/isaacs/dev/tapjs/tapjs/node_modules/signal-exit/src/index.ts:224:31)
    at process._fatalException (node:internal/process/execution:146:13)
----------------------------------------
Error: require(`domain`) at this point
    at node:domain:130:28
    at BuiltinModule.compileForInternalLoader (node:internal/bootstrap/loaders:334:7)
    at requireBuiltin (node:internal/bootstrap/loaders:365:14)
    at node:repl:136:16
    at BuiltinModule.compileForInternalLoader (node:internal/bootstrap/loaders:334:7)
    at BuiltinModule.compileForPublicLoader (node:internal/bootstrap/loaders:270:10)
    at loadBuiltinModule (node:internal/modules/cjs/helpers:56:9)
    at Module._load (node:internal/modules/cjs/loader:934:15)
    at Module.require (node:internal/modules/cjs/loader:1141:19)
    at require (node:internal/modules/cjs/helpers:110:18) {
  code: 'ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE'
}

Steps to reproduce the problem

// ts-node-uncaught-exception-handler-error.ts
setTimeout(() => {
  process.setUncaughtExceptionCaptureCallback(er => {
    console.error('caught an error', er)
    process.setUncaughtExceptionCaptureCallback(null)
  })
})

setTimeout(() => {
  throw new Error('oops')
}, 100)

Expected behavior:

$ node ts-node-uncaught-exception-handler-error.ts
caught an error Error: oops
    at Timeout._onTimeout (/Users/isaacs/dev/tapjs/async-hook-domain/ts-node-uncaught-exception-handler-error.ts:9:9)
    at listOnTimeout (node:internal/timers:569:17)
    at process.processTimers (node:internal/timers:512:7)

Actual behavior:

$ node --loader=ts-node/esm ts-node-uncaught-exception-handler-error.ts
(node:16706) ExperimentalWarning: Custom ESM Loaders is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
Error [ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE]: The `domain` module is in use, which is mutually exclusive with calling process.setUncaughtExceptionCaptureCallback()
    at new NodeError (node:internal/errors:399:5)
    at process.setUncaughtExceptionCaptureCallback (node:domain:134:15)
    at Timeout._onTimeout (/Users/isaacs/dev/tapjs/async-hook-domain/ts-node-uncaught-exception-handler-error.ts:2:11)
    at listOnTimeout (node:internal/timers:569:17)
    at processTimers (node:internal/timers:512:7)
----------------------------------------
Error: require(`domain`) at this point
    at node:domain:130:28
    at BuiltinModule.compileForInternalLoader (node:internal/bootstrap/loaders:334:7)
    at requireBuiltin (node:internal/bootstrap/loaders:365:14)
    at node:repl:136:16
    at BuiltinModule.compileForInternalLoader (node:internal/bootstrap/loaders:334:7)
    at BuiltinModule.compileForPublicLoader (node:internal/bootstrap/loaders:270:10)
    at loadBuiltinModule (node:internal/modules/cjs/helpers:56:9)
    at Module._load (node:internal/modules/cjs/loader:934:15)
    at Module.require (node:internal/modules/cjs/loader:1141:19)
    at require (node:internal/modules/cjs/helpers:110:18) {
  code: 'ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE'
}

(Same behavior with ts-node ts-node-uncaught-exception-handler-error.ts.)

Minimal reproduction

see above

The path to requiring node:domain is:

  • esm.mjs, const esm = require('./dist/esm');
  • dist/esm.js (from src/esm.ts) import { register, RegisterOptions, Service } from './index';
  • src/index.ts line 38 export { createRepl, CreateReplOptions, ReplService, EvalAwarePartialHost } from './repl';
  • src/repl.ts line 4 `import { Recoverable, ReplOptions, REPLServer, start as nodeReplStart } from 'repl';
  • node:repl line 136 const domain = require('domain');

Suggestion: require ./repl lazily, since it is rarely needed, and has this rather unfortunate side effect.

Specifications

  • ts-node v10.9.1
  • node v18.16.0 and 20.2.0
  • compiler v5.0.4
{
  "include": ["src/*.ts"],
  "compilerOptions": {
    "declaration": true,
    "declarationMap": true,
    "inlineSources": true,
    "esModuleInterop": true,
    "forceConsistentCasingInFileNames": true,
    "moduleResolution": "node16",
    "resolveJsonModule": true,
    "sourceMap": true,
    "strict": true,
    "target": "es2022"
    "module": "esnext",
    "outDir": "dist-tmp/mjs"
  }
}
  • package.json:
{}
  • Operating system and version: Darwin moxy.lan 22.4.0 Darwin Kernel Version 22.4.0: Mon Mar 6 20:59:28 PST 2023; root:xnu-8796.101.5~3/RELEASE_ARM64_T6000 arm64

  • If Windows, are you using WSL or WSL2?: N/A

@isaacs
Copy link
Contributor Author

isaacs commented May 22, 2023

As far as I can tell, everything that uses the repl loads it from repl.js, so there seems to be little need to have it exported from the index. However, the tests are built expecting the repl functionality to be exposed there on the tsNodeUnderTest context, so it's not immediately clear what the best way is to fix this.

@cspotcode
Copy link
Collaborator

Is this a node bug? Is require('repl') triggering side-effects?

Reminds me of #1587 (comment)

@isaacs
Copy link
Contributor Author

isaacs commented May 23, 2023

Well, yes, it is, but it's more that it uses domains, and domains trigger side effects. Refactoring repl to not use domains is on the todo list, I believe, but not super high priority. If there's a way to not require('repl') when ts-node isn't actually creating a repl, that would likely be a much more expedient solution.

@isaacs
Copy link
Contributor Author

isaacs commented May 23, 2023

Also, if node 16 (or even 18) support is intended, then there's really no way around repl being domain based. There's no way that'll be back ported. Even fixing in node v20 is pretty unlikely.

So, if this can't be addressed in ts-node, I probably need to find (or more likely, write) an alternative --loader to use for node-tap, and forking ts-node is a big bullet I'd rather not bite. There's a lot going on there.

@cspotcode
Copy link
Collaborator

cspotcode commented May 23, 2023

Fair enough, we do want to support node 16 and 18 until they're EOLed. I'm still confused why require('domain'); triggers the side-effect instead of actually using domains, but that's beside the point.

I think the only things we use from require('repl') are Recoverable and start, and neither is called within hot loops, so probably fine to lazy-load them in the 2x places they're used? That might be the easiest way to preserve the API require('ts-node').createRepl() https://typestrong.org/ts-node/api/index.html#createRepl

It looks like we lazy-load source-map-support so might be as simple as copy-pasting that pattern, but really any pattern sounds fine.

import type * as _sourceMapSupport from '@cspotcode/source-map-support';

const sourceMapSupport = require('@cspotcode/source-map-support') as typeof _sourceMapSupport;

@isaacs
Copy link
Contributor Author

isaacs commented May 23, 2023

I'm still confused why require('domain'); triggers the side-effect instead of actually using domains, but that's beside the point.

brief history of domains

In 2011, before a spec for Promise semantics were universally accepted, let alone a first class part of the language, we in Node were really feeling the shortcomings of callback- and eventemitter-based programming for larger scale systems. There was process.on('uncaughtException'), but it was a clumsy tool; it didn't capture much about the thread of continuations that led to a given error, and prevented getting a clean core dump when exceptions were unhandled.

Ryan Dahl had a very nebulous idea for a way to group a family of event emitters, async requests, handles, and timers, whereby anything spawned from something in a given group would be a part of that group, and the user could arbitrarily start a new group as a child of the current one. He spent the better part of a year talking, writing, and discussing this idea, and generally getting people pretty excited about how it would solve what was then one of the most pressing issues in node.

Then, he quit the project, and left Node to become a professional hipster for a while, and tossed the keys to me on the way out.

We on the core team (and I personally) felt the need to deliver something that would make good on these promises, and it seemed like a good and useful idea. The problem was that there wasn't really a clear specification, and everyone had very different ideas about exactly what domains were supposed to do or how they were supposed to behave.

After a few months of whittling this down to something more cohesive, we were left with a choice. Do it right, which would be a lot of threading through libuv's timers and the V8 engine and so forth, or just throw something together that would check all the boxes. And the spec that emerged was something that could clearly be done in userland with the tools we had, just by adding some hooks in the JS layer EventEmitter class and process object. Since it was just an experiment, and we figured it'd end up being scrapped and built properly soon anyway, we went that direction, and shipped it.

The problem with the design immediately became obvious. EventEmitter is used everywhere in node, and adding tracking behavior to every 'error' event made performance noticeably worse. Also, EE is synchronous, and more like a remote method call, whereas timers and handles are async, and fundamentally different in their semantics, so having one thing tracking both of them in the same category was super weird. Also, having it mix the concepts of "error handling" and "continuation tracking" turned out to be a suboptimal mix of concerns. (And, some peoples' expectations of what domains would do was closer to v8 Isolates, full separation of the JS VM space, which it never could have done.)

We couldn't just rip it out though, mostly because New Relic, then one of Node's biggest and most vocal proponents, built their whole APM node client on top of it, and there wasn't really any alternative. (There were a few others as well, experimental stuff got relied on a lot more casually back then.) That alternative came (for APM at least) with async_hooks, which was zero cost when not used. To address the performance impacts of even having to check if process or EE might be tracked by domains, we only applied those changes if domain was loaded, and deprecated the module as soon as async_hooks was released, with the expectation that some better error handling mechanism could be built on it for the repl (which used domains to catch errors and report them without crashing the process, in a somewhat cleaner way than it had been, using on('uncaughtException'), and is probably the only actually valuable use of domains).

Isolates (and the advent of async/await, and node's consistent painstaking use of primordials) addressed most of the other issues that domains were intended to alleviate, and afaik, repl's reliance on domain is one of the last reasons why it's still around.

@isaacs
Copy link
Contributor Author

isaacs commented May 23, 2023

It looks like we lazy-load source-map-support so might be as simple as copy-pasting that pattern, but really any pattern sounds fine.

PR coming shortly!

isaacs added a commit to isaacs/ts-node that referenced this issue May 23, 2023
Actually starting the repl will still put the process into domain-mode,
but this at least allows programs to use `ts-node` or
`--loader=ts-node/esm` without losing the ability to use
process.setUncaughtExceptionCaptureCallback().

The problem should ideally be fixed (or mitigated) in node core, but
this is still worthwhile for the benefit of supporting current node
versions.

Re: nodejs/node#48131
Fix: TypeStrong#2024
isaacs added a commit to isaacs/ts-node that referenced this issue May 24, 2023
Actually starting the repl will still put the process into domain-mode,
but this at least allows programs to use `ts-node` or
`--loader=ts-node/esm` without losing the ability to use
process.setUncaughtExceptionCaptureCallback().

The problem should ideally be fixed (or mitigated) in node core, but
this is still worthwhile for the benefit of supporting current node
versions.

Re: nodejs/node#48131
Fix: TypeStrong#2024
cspotcode added a commit that referenced this issue May 30, 2023
* fix: lazy-load repl to avoid domain side effects

Actually starting the repl will still put the process into domain-mode,
but this at least allows programs to use `ts-node` or
`--loader=ts-node/esm` without losing the ability to use
process.setUncaughtExceptionCaptureCallback().

The problem should ideally be fixed (or mitigated) in node core, but
this is still worthwhile for the benefit of supporting current node
versions.

Re: nodejs/node#48131
Fix: #2024

* Update src/repl.ts

---------

Co-authored-by: Andrew Bradley <cspotcode@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants