Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistently perform bootstrap and encode Brotli config for improved caching and reduced code complexity #1836

Conversation

devversion
Copy link
Contributor

@devversion devversion commented Jul 13, 2022

Note to self: consider closing #1177 when this is merged, if it does indeed apply to worker threads -- @cspotcode


  1. Cleans up the BootstrapState for the new approach, removing all the detection flags of whether we are in the CLI.
  2. Introduces a new type for the bootstrap state that can be encoded into the Brotli config.
    • This is a subset of the bootstrap state with explicit fields that can be cached
    • This makes it obvious which stuff can be cached/encoded, and also encourages thinking before new stuff is passed through.
    • Also it reduces the bytes encoded in the payload
  3. getEntryPointInfo (coming from the previous ESM fix PR) no longer takes the full bootstrap state. It just fine-grained takes what it needs. This is needed because only the necessary properties/state is cached, so it cannot expect the full parsed arguments (see 2)
  4. As we chatted, we now always encode the Brotli state (regardless of ESM or with ESM) into process.execArgv. This ensures that phase1-3 do not need to be repeated when forked (and also makes the mental model easy and more expected).
    • A shared function is used for the child process script and execArgv computation. This logic can be used for setting execArgv and for getting the args for going directly from bin.js into the child process when ESM is needed.
  5. The existing index.spec.ts forking recursive execArgv test had to be updated:
    • So that is now knows that forked processes always go to the child entry-point with Brotli config (see 4)
    • So that we also test the forking execArgv for ESM (aside from the worker actual fork tests we have)
  6. Some of the forking tests I added with my first PR have been renamed and adjusted to not rely on --cwd but just use the working directory when spawning ts-node (like cd dir && ts-node).

@devversion devversion force-pushed the always-encode-brotli-and-fix-forking branch from 7c0678f to 3812d56 Compare July 13, 2022 18:46
@devversion
Copy link
Contributor Author

cc. @cspotcode. I'm super open to not always go to the child entry-point (when e.g. ESM is not enabled), but I believe this is worth considering as it just makes all of the boostrapping more readable / obvious (while we talk about feature creep and code smell)

@cspotcode
Copy link
Collaborator

cspotcode commented Jul 13, 2022

Thanks, I'll have to give this some thought.

It's a breaking change to use a child process for all users. It affects PIDs and signal handling and whatnot. And it's a performance regression for users who only need CJS.

Some of us in the ecosystem believe that the child process will eventually be unnecessary, once node adds the appropriate features.
nodejs/node#33903
sindresorhus/import-from#9
gulpjs/rechoir#43 (comment)
A heated discussion cropped up a few hours ago:
nodejs/node#43818

...so I kinda want to treat the child process as a hack that's only done when absolutely necessary.

To get some of the code quality benefits you describe, I suppose we could pass the brotli payload in-process even when we don't spawn a child. So like, create the payload, and pass it in-memory to the next phase of bootstrapping.

That gives us some of the safety and code readability guarantees you describe without imposing the breaking change / perf hit on CJS consumers.

@devversion
Copy link
Contributor Author

@cspotcode Yeah, it's trivial to just jump into the final phase in-memory (from the same process). Good points you raised there. I will make this change and rebase the PR. Should be ready for your review/thoughts tomorrow.

@devversion devversion force-pushed the always-encode-brotli-and-fix-forking branch 4 times, most recently from a7c46a4 to cbdf2c2 Compare July 14, 2022 11:50
@codecov
Copy link

codecov bot commented Jul 14, 2022

Codecov Report

Merging #1836 (9c0a5a9) into main (97f9afd) will decrease coverage by 0.00%.
The diff coverage is 93.58%.

Impacted Files Coverage Δ
src/child/spawn-child-with-esm.ts 82.35% <82.35%> (ø)
src/bin.ts 88.29% <95.74%> (-0.17%) ⬇️
src/child/child-entrypoint.ts 87.50% <100.00%> (-5.36%) ⬇️
src/child/child-exec-args.ts 100.00% <100.00%> (ø)
src/child/argv-payload.ts 83.33% <0.00%> (-16.67%) ⬇️

@devversion devversion force-pushed the always-encode-brotli-and-fix-forking branch 3 times, most recently from a7e6486 to 6f0ee70 Compare July 14, 2022 13:15
@cspotcode
Copy link
Collaborator

Only skimming the code, so I may have misinterpreted. But FWIW it's ok to make the breaking change to stop resolving entrypoint relative to our --cwd. Not sure if that simplifies things.

@cspotcode
Copy link
Collaborator

Also feel free to push a long commit history, no need to force-push if you don't want to. We will squash merge it to main anyway.

@devversion
Copy link
Contributor Author

@cspotcode Thanks for looking. Sounds good. I will not squash the fixup commits (just a habit of other repos). Regarding --cwd. It's low-effort to keep it working as part of this PR, but it will be easy to remove this in a follow-up. thx

@cspotcode
Copy link
Collaborator

Sounds good. FWIW I'm going to start merging breaking changes to main, so we're fully in breakage mode.

@devversion
Copy link
Contributor Author

Nice! just for your information: This PR is now green and ready for review when you find some time. No rush, thanks!

@cspotcode
Copy link
Collaborator

Thanks, can I get a quick bulleted list of the changes and their motivation, to assist review? In another comment, or by updating the issue description?

ts-node --esm avoids loading the typescript compiler in the parent process, which is good for performance. Loading TS compiler is slow. Does this PR preserve that optimization?

@devversion
Copy link
Contributor Author

devversion commented Jul 14, 2022

@cspotcode This PR:

  1. Cleans up the BootstrapState for the new approach, removing all the detection flags of whether we are in the CLI.
  2. Introduces a new type for the bootstrap state that can be encoded into the Brotli config.
    • This is a subset of the bootstrap state with explicit fields that can be cached
    • This makes it obvious which stuff can be cached/encoded, and also encourages thinking before new stuff is passed through.
    • Also it reduces the bytes encoded in the payload
  3. getEntryPointInfo (coming from the previous ESM fix PR) no longer takes the full bootstrap state. It just fine-grained takes what it needs. This is needed because only the necessary properties/state is cached, so it cannot expect the full parsed arguments (see 2)
  4. As we chatted, we now always encode the Brotli state (regardless of ESM or with ESM) into process.execArgv. This ensures that phase1-3 do not need to be repeated when forked (and also makes the mental model easy and more expected).
    • A shared function is used for the child process script and execArgv computation. This logic can be used for setting execArgv and for getting the args for going directly from bin.js into the child process when ESM is needed.
  5. The existing index.spec.ts forking recursive execArgv test had to be updated:
    • So that is now knows that forked processes always go to the child entry-point with Brotli config (see 4)
    • So that we also test the forking execArgv for ESM (aside from the worker actual fork tests we have)
  6. Some of the forking tests I added with my first PR have been renamed and adjusted to not rely on --cwd but just use the working directory when spawning ts-node (like cd dir && ts-node).

is this sufficient information?

ts-node --esm avoids loading the typescript compiler in the parent process, which is good for performance. Loading TS compiler is slow. Does this PR preserve that optimization?

This is something I'm not sure if worth doing. It would introduce some complexity again, for something that you also acknowledged to be just a workaround in the long-term. I do realize that loading TS is usually a little slow, but is it that big of a deal, people noticing in the meanwhile? Also because it already loads twice if people enable esm in the tsconfig. The complexity will be that we now would also need to pass all parseArgv result into the initial child entry-point (needed for phase3), and so on..

…caching/reduced complexity

Additionally, this PR streamlines the boostrap mechanism to always
call into the child script, resulting in reduced complexity, and also
improved caching for user-initiated forked processes. i.e. the tsconfig
resolution is not repeated multiple-times because forked processes
are expected to preserve the existing ts-node project. More details
can be found here TypeStrong#1831.

Fixes TypeStrong#1812.
…proved caching/reduced complexity

Do not use `Nodenext` as `esnext` is sufficent and makes it work with
all TS versions we have in the matrix.
@devversion devversion force-pushed the always-encode-brotli-and-fix-forking branch from 44ab40b to fbc7754 Compare July 14, 2022 18:49
…proved caching/reduced complexity

Re-add fast-path for ESM when we detect it before phase3. And simplify
code.
@devversion devversion force-pushed the always-encode-brotli-and-fix-forking branch from fbc7754 to 9c0a5a9 Compare July 14, 2022 18:53
@devversion
Copy link
Contributor Author

devversion commented Jul 14, 2022

@cspotcode Please find the bullet-point list you requested in the above comment. I also added the fast-path to avoid loading the TS compiler when we detect --esm before loading the tsconfig. I was able to actually re-add this without making the code more complicated as I reworked some of the types/structs I created with the initial commits. See the latest fixup.

I also rebased on top of your breaking changes in main

@devversion
Copy link
Contributor Author

Hey @cspotcode, just wanted to check if there is any way I can help move this forward. No rush, just a friendly check-in

@devversion
Copy link
Contributor Author

devversion commented Aug 26, 2022

Friendly ping on this again. Happy to make any changes. also happy to discuss this if this doesn't look reasonable.

@devversion
Copy link
Contributor Author

It's been a while. I'm ready to rebase this, or close this, but would like to get it off my plate. Please give an update.

@devversion
Copy link
Contributor Author

@cspotcode It's unfortunate that I need to close this— due to no response at all. It would have been totally fine to say that it's not worth the review effort given lack of time etc..

In general- I think the current code in main could definitely benefit from some improved type-safety and avoiding unnecessary work when working. At least we were able to merge in #1814 to fix forking separately.

@devversion devversion closed this Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants