Node.js can become unkillable even with kill -9 on Linux 6.1.113 or 6.6.57 #355567

Closed
UlyssesZh opened this issue Nov 13, 2024 · 3 comments
Labels
0.kind: bug Something is broken 2.status: duplicate This is a duplicate of another issue or PR 6.topic: kernel The Linux kernel 6.topic: nodejs


@UlyssesZh
Member

Describe the bug

A Node.js program can get stuck after running for a while and then cannot be killed, even with kill -9.

Only reproducible on Linux kernel 6.1.113 or 6.6.57 (or later versions of 6.1 or 6.6).
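
The affected range can also be checked programmatically. The following is a minimal sketch, assuming the predicate given in the reproduction steps (6.1.113 <= version < 6.2 || 6.6.57 <= version < 6.7). The function names `parseKernelRelease` and `isInReportedRange` are hypothetical and not part of any library.

```javascript
// Parse a kernel release such as "6.6.59" or "6.6.60-xanmod1"
// into [major, minor, patch]. A missing patch level counts as 0.
function parseKernelRelease(release) {
  const match = /^(\d+)\.(\d+)(?:\.(\d+))?/.exec(release);
  if (!match) {
    throw new Error(`unrecognized kernel release: ${release}`);
  }
  return [Number(match[1]), Number(match[2]), Number(match[3] ?? 0)];
}

// Assumption: this mirrors the range stated in this report only,
// i.e. 6.1.113 <= version < 6.2 || 6.6.57 <= version < 6.7.
function isInReportedRange(release) {
  const [major, minor, patch] = parseKernelRelease(release);
  if (major !== 6) return false;
  if (minor === 1) return patch >= 113;
  if (minor === 6) return patch >= 57;
  return false;
}

console.log(isInReportedRange("6.6.59")); // true
console.log(isInReportedRange("6.6.56")); // false
```

On a live system the input could come from `uname --kernel-release`, or from `os.release()` in Node.js. A later comment in this thread reports the problem fixed in 6.6.60, so the real upper bound for the 6.6 series is likely tighter than this predicate.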

Steps To Reproduce

Warning

If you reproduce the bug, your computer cannot be shut down normally because the unkillable process prevents it.

In configuration.nix, set the kernel version to 6.1 or 6.6:

# boot.kernelPackages = pkgs.linuxPackages_6_1;
boot.kernelPackages = pkgs.linuxPackages_6_6;

Also, install the node package by adding it to either users.users.xxx.packages or environment.systemPackages.

Set the root nixpkgs to a rev no earlier than f142a76 (if using Linux 6.6) or 0dffe83 (if using Linux 6.1) (they are the exact first bad commits that I found by bisecting nixpkgs). You can do this by running

sudo nix-channel --add https://github.com/NixOS/nixpkgs/archive/0dffe83747699260fe128a8ccd3d58ed279acd59.tar.gz nixos
sudo nixos-rebuild boot --upgrade

(or by simply upgrading to the latest nixos-24.05).

Reboot. Run uname --kernel-release and check that the kernel version satisfies 6.1.113 <= version < 6.2 || 6.6.57 <= version < 6.7.

Run the following command (the Node.js version is 20.15.1, but I think it should be reproducible with any Node.js 20.x):

node --input-type=module <<'JAVASCRIPT'
import { readdir, readFile } from "fs/promises";
let i = 0;
setInterval(async () => {
  console.log(i++);
  (await readdir("/proc")).forEach(pid => +pid > 0 && readFile(`/proc/${pid}/cmdline`, "utf8"));
}, 5000)
JAVASCRIPT

Wait for a while until it outputs around 100 numbers or more.

Hit Ctrl+C to try to stop it. If it cannot be killed this way, you have reproduced the bug. The process remains unkillable even with kill -9.
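
To see what state the stuck process is in, one option is to read the state field from `/proc/<pid>/stat`. This report does not say which state the process ends up in. That it would show `D` (uninterruptible sleep) is only an assumption, based on how processes that do not respond to SIGKILL usually appear. The helper name `parseProcStatState` and the sample lines are hypothetical.

```javascript
// Extract the single-letter state from a /proc/<pid>/stat line.
// The format is "pid (comm) state ...". Because comm may itself contain
// spaces or parentheses, the state is taken after the LAST ")".
function parseProcStatState(statLine) {
  const close = statLine.lastIndexOf(")");
  if (close === -1) {
    throw new Error("unexpected /proc/<pid>/stat format");
  }
  const fields = statLine.slice(close + 1).trim().split(/\s+/);
  return fields[0];
}

// Hypothetical sample lines, not taken from an affected machine.
console.log(parseProcStatState("1234 (node) D 1 1234 1234 0 -1 4194304"));
console.log(parseProcStatState("42 (my (odd) name) S 1 42 42 0 -1 4194304"));
```

On a real system the input would be the contents of `/proc/<pid>/stat` for the stuck Node.js process, read for example with `fs.readFileSync`.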

Expected behavior

The process gets killed by Ctrl+C.

Additional context

I first talked about this symptom in #351774 (comment).

I initially thought this was an upstream bug in arRPC, so I opened OpenAsar/arrpc#120. Several people commented on that issue saying that they experience the same problem and that they all use NixOS. Therefore, I suspected that the bug might only affect NixOS.

I tried to reproduce the bug in an Arch Linux VM (installing the kernel via the package linux-lts), but could not reproduce it there.

I then read the source code of arRPC and looked for a short program that reproduces the bug. The node command pasted above is the reproducer that I found.
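
For readers unfamiliar with the `+pid > 0` idiom in the reproducer, the filtering step can be unpacked as below. This is only an illustrative sketch of which paths the reproducer reads. The helper names `pidEntries` and `cmdlinePaths` are hypothetical, and the sketch performs no file I/O, so it does not trigger the bug by itself.

```javascript
// Keep only /proc entries that look like process IDs (e.g. "1", "4242").
// Unary plus turns names such as "self" or "cpuinfo" into NaN,
// and NaN > 0 is false, so those entries are skipped.
function pidEntries(names) {
  return names.filter((name) => +name > 0);
}

// Build the cmdline path the reproducer reads for each PID.
function cmdlinePaths(names) {
  return pidEntries(names).map((pid) => `/proc/${pid}/cmdline`);
}

console.log(cmdlinePaths(["1", "self", "cpuinfo", "4242"]));
// [ '/proc/1/cmdline', '/proc/4242/cmdline' ]
```

Note that the reproducer does not await the `readFile` promises, so every 5 seconds it starts one read per running process and all of those reads are in flight at the same time.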

Notify maintainers

Metadata

  • system: "x86_64-linux"
  • host os: Linux 6.6.59, NixOS, 24.05 (Uakari), 24.05.6258.dba414932936
  • multi-user?: yes
  • sandbox: yes
  • version: nix-env (Nix) 2.18.8
  • channels(root): "nixos-24.05, nixos-hardware, nixos-unstable"
  • nixpkgs: /nix/var/nix/profiles/per-user/root/channels/nixos


@UlyssesZh UlyssesZh added 0.kind: bug Something is broken 6.topic: kernel The Linux kernel 6.topic: nodejs labels Nov 13, 2024
@Shawn8901
Contributor

Shawn8901 commented Nov 13, 2024

Could you double-check whether this is an instance of #353709? As I understood from that issue, the triggering code is in Node.js and exposes a bug in the kernel (at least in LTS kernels before 6.6.60). I am not sure about kernels older than 6.6, but it may be worth checking whether the problem goes away on the most recent 6.6 releases, to confirm that it is the same root cause.

@Shawn8901
Contributor

I tested the provided example on xanmod 6.6.60 with Node.js v20.18.0, and it exits fine on my machine. Can you double-check with 6.6.60?

@UlyssesZh
Member Author

Yes, 6.6.60 fixes it.

@FliegendeWurst FliegendeWurst added the 2.status: duplicate This is a duplicate of another issue or PR label Dec 5, 2024