NodeJS + Linux - IPC via webpack thread-loader is incorrectly received #243374
Comments
Note: just tried commit a05f2b1 (libuv v1.44.2) and it works fine.
Note: just tried commit 9d2e951 (libuv v1.45.0) and it crashed.
Note: just tried commit f53e153 (the parent of 9d2e951) and it worked successfully - which suggests that libuv 1.45.0 is broken in nix. cc libuv maintainer @cstrahan
libuv 1.45.0 changelog: https://github.com/libuv/libuv/blob/v1.45.0/ChangeLog#L1-L291
2023.05.19, Version 1.45.0 (Stable) Changes since version 1.44.2:
Investigating my local install of node 18:
So that is probably why the non-nix version works - because it's using 1.44.2, whereas the nix version for 18.16.1 uses libuv 1.45.0. It was only a minor version bump - so nothing should have broken?
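For anyone wanting to repeat this comparison, one way to see which libuv version a given node binary reports is to read the built-in process.versions object. This is only a minimal sketch, to be run once with the nvm-installed node and once with the nix-built one:

```javascript
// process.versions is a built-in Node.js object; its `uv` field holds the
// libuv version string reported by this binary.
const nodeVersion = process.versions.node;
const libuvVersion = process.versions.uv;

console.log(`node ${nodeVersion} reports libuv ${libuvVersion}`);
```

If the two binaries print the same node version but different libuv versions, that points at the packaging rather than at node itself.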
Investigating further - it appears that each nodejs version is built against a specific bundled version of libuv, eg https://github.com/nodejs/node/blob/v18.16.1/deps/uv/include/uv/version.h#L33-L35 It looks like Node v20.3.0 is the first version to be built against libuv v1.45.0.
I tried Node v20.3.1 (libuv 1.45.0) from nvm and it worked fine. I also tried 87478fd (node v20.4.0 w/ libuv 1.46.0) and it crashed. So I'm back to it being a nix build issue rather than a libuv problem.
Thanks a lot for reporting and investigating this.
--- a/pkgs/development/web/nodejs/v18.nix
+++ b/pkgs/development/web/nodejs/v18.nix
@@ -1,8 +1,8 @@
-{ callPackage, openssl, python3, enableNpm ? true }:
+{ callPackage, quictls, python3, enableNpm ? true }:
 let
   buildNodejs = callPackage ./nodejs.nix {
-    inherit openssl;
+    openssl = quictls;
     python = python3;
   };
Apparently it's due to io_uring support in libuv: nodejs/node#49911.
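If io_uring really is the culprit, one way to test the hypothesis would be to rerun the failing build with io_uring switched off in libuv. The sketch below assumes that the libuv version in use honours the UV_USE_IO_URING environment variable (with "0" meaning disabled). The child command is only a placeholder that echoes the variable back, to be replaced by the real webpack build command:

```javascript
const { spawnSync } = require("node:child_process");

// Assumption: libuv checks UV_USE_IO_URING and "0" disables io_uring.
const env = { ...process.env, UV_USE_IO_URING: "0" };

// Placeholder child: prints the variable it received. Replace the argument
// list with the real build entry point to run the actual experiment.
const result = spawnSync(
  process.execPath,
  ["-p", "process.env.UV_USE_IO_URING"],
  { env, encoding: "utf8" }
);

const childSawFlag = result.stdout.trim();
console.log(`child ran with UV_USE_IO_URING=${childSawFlag}`);
```

If the build passes with the variable set and crashes without it, that would support the io_uring explanation.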
Ah, that would make sense - and 18.16.1 breaks with nix because nix was building it with libuv v1.45.0 instead of 1.44.2 like the original binary. So I guess that node v18.18.1, which includes the fix, should work fine as well. I can't test this any more because we actually went the route of rewriting
This should be fixed by #265974
Describe the bug
This bug is a bit complicated. A few weeks ago I upgraded my company's codebase to NodeJS 18 (bumping to this commit 9c70578). Everything went fine during the upgrade - except we've recently found a very weird crash in a very specific case.
We bundle our large monorepo with webpack and thread-loader. After upgrading to the above commit - the build started to crash out with the following error:
We tried reproducing this bug locally on our laptops and we couldn't.
When we SSH'd into one of the linux CI boxes and ran the same commands - we could reproduce it.
This told us it was something OS-specific.
We dug into the problem, and in a nutshell it is as follows:
During a build, webpack (via thread-loader) spawns two child workers via node's child_process. It uses node's builtin message passing streams to communicate via JSON blobs between the processes. Because communication is buffered, thread-loader precedes the blob with a 32-bit int dictating the length of the following message - which allows the receiver to know when to keep reading and when to stop.
When serialised as bytes, a small enough big-endian 32-bit integer starts with zero bytes (eg 53148 is encoded as the hex bytes 00 00 cf 9c).
So we figured out that what's happening is that the first message is being interrupted by the next message - so the payload looks like: <int>{..partial json..<int>{..json..} - or put another way: <int>{..partial json..\0\0\xcf\x9c{..json..}.
Null bytes are disallowed by JSON parsers - leading the JSON parser to crash when it reaches the start of the 2nd integer.
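To make the failure mode concrete, here is a minimal sketch of length-prefixed framing and of what happens when one frame lands in the middle of another. The encodeFrame and decodeFrames helpers are hypothetical names written for this illustration, not thread-loader's actual code:

```javascript
// Hypothetical helpers for illustration; not thread-loader's implementation.

// Prefix a JSON payload with its byte length as a big-endian 32-bit integer.
function encodeFrame(message) {
  const body = Buffer.from(JSON.stringify(message), "utf8");
  const header = Buffer.alloc(4);
  header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Split a buffer of back-to-back frames into parsed messages.
function decodeFrames(buffer) {
  const messages = [];
  let offset = 0;
  while (offset + 4 <= buffer.length) {
    const length = buffer.readUInt32BE(offset);
    const end = offset + 4 + length;
    if (end > buffer.length) break; // incomplete frame: wait for more data
    messages.push(JSON.parse(buffer.toString("utf8", offset + 4, end)));
    offset = end;
  }
  return messages;
}

// The length header for 53148 is the byte sequence 00 00 cf 9c.
const header = Buffer.alloc(4);
header.writeUInt32BE(53148, 0);
const headerHex = header.toString("hex");

// Two intact frames decode cleanly.
const first = encodeFrame({ id: 1, data: "a".repeat(32) });
const second = encodeFrame({ id: 2 });
const decoded = decodeFrames(Buffer.concat([first, second]));

// Simulate the corruption: the second frame is spliced into the middle of
// the first, so the first body now contains the second frame's header bytes.
const cut = 12;
const corrupted = Buffer.concat([
  first.subarray(0, cut),
  second,
  first.subarray(cut),
]);

let parseError = null;
try {
  decodeFrames(corrupted);
} catch (err) {
  parseError = err; // JSON.parse rejects the null bytes of the spliced header
}
console.log(headerHex, decoded.length, parseError && parseError.name);
```

The receiver trusts the first length header, reads that many bytes, and hands JSON.parse a body containing the raw header of the second frame, which matches the crash described above.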
Originally we thought it was an out-of-date package or something weird with our webpack customisations - but after updating packages and cutting back as much as we could, the issue still persisted. We investigated thread-loader and found that whilst it is complicated, it doesn't use any native npm packages - just node's APIs.
Next we thought that it might be a breakage specifically in Node18 - so we updated the nix config to Node16 and it still crashed in exactly the same way. I went to try Node14 on the commit but Node14 isn't a cached build. Because I didn't want to sit through the ~3h local build I installed v14 via nvm and ran the build outside nix and it worked.
As a sanity check I then went to bisect to narrow down which major version broke things, and to my surprise the build passed successfully on all node versions outside of nix - I tried 14, 15, 16, 17 and 18 (installed from nvm) and they all worked.
This narrowed the problem space down to it specifically being the nix build of NodeJS on Linux. Next we thought that the nix commit might be a broken build - so we tried updating to a newer commit (fc4810b) and it crashed in the same way.
Which further narrowed it down - it must be a dependency of NodeJS within nix that's broken. To validate that it wasn't our overlay that broke something I isolated NodeJS and it was still broken.
I'm currently at a bit of a loss as to what the problem might be. I'm going to try a few old commits to see if I can get a bisect of sorts - but it's going to take a while given how long NodeJS takes to build.
I was hoping that with my description one of the maintainers or experts might have an idea that points towards a fix.
Steps To Reproduce
Sadly I don't have an exact reproduction for you - I haven't been able to properly isolate it. AFAICT it needs quite a large codebase (>50k files) running through a webpack build.
I'll see if I can create one whilst I wait for builds to complete.
Expected behavior
NodeJS + webpack + thread loader works on linux
Screenshots
N/A
Additional context
N/A
Notify maintainers
@marsam @cko @gilligan @cillianderoiste
Metadata
Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.