Node 18.13 HTTP/2 live lock #46234

Closed
ronag opened this issue Jan 17, 2023 · 10 comments
Labels
http2 Issues or PRs related to the http2 subsystem.

Comments

@ronag
Member

ronag commented Jan 17, 2023

This is a tricky one, but we have started noticing processes getting stuck at 100% CPU (live lock) since updating to Node 18.13.

Breaking into the process with a debugger shows that a lot of time is spent in:

(gdb) bt
#0  0x0000000000b9a6e4 in node::http2::Http2Session::OnStreamClose(nghttp2_session*, int, unsigned int, void*) ()
#1  0x0000000000af2e3c in node::Environment::RunAndClearNativeImmediates(bool) ()
#2  0x0000000000af3446 in node::Environment::CheckImmediate(uv_check_s*) ()
#3  0x000000000165abb9 in uv__run_check (loop=loop@entry=0x5278e80 <default_loop_struct>) at ../deps/uv/src/unix/loop-watcher.c:67
#4  0x00000000016532f0 in uv_run (loop=0x5278e80 <default_loop_struct>, mode=UV_RUN_DEFAULT) at ../deps/uv/src/unix/core.c:420
#5  0x0000000000aafa2d in node::SpinEventLoop(node::Environment*) ()
#6  0x0000000000bb11f4 in node::NodeMainInstance::Run() ()
#7  0x0000000000b26c44 in node::LoadSnapshotDataAndRun(node::SnapshotData const**, node::InitializationResult const*) ()
#8  0x0000000000b2a83f in node::Start(int, char**) ()
#9  0x00007f87ed3abd0a in __libc_start_main (main=0xaa5910 <main>, argc=2, argv=0x7ffd7edcaee8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd7edcaed8) at ../csu/libc-start.c:308
#10 0x0000000000aad7ee in _start ()

We suspect https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V18.md#18.13.0 but it's difficult to know for sure.

We've rolled back to 18.12.1 and will be observing if it keeps happening.

@udnes99

udnes99 commented Jan 17, 2023

We also experienced issues after upgrading to 18.13.0, specifically when running in a container deployed to Google Kubernetes Engine. We had actually started to see a 100% CPU freeze on Node 16. However, we never figured out which version of Node 16 caused the trouble, and our fix was, ironically, to upgrade to Node 18 (which worked until recently).

@ronag
Member Author

ronag commented Jan 17, 2023

@nodejs/http @nodejs/http2

@ronag ronag changed the title Node18.13 HTTP2 live lock Node 18.13 HTTP/2 live lock Jan 17, 2023
@santigimeno
Member

It looks like it could be dee882e94f, though it'd be great to have a way to reproduce it.

@mcollina
Member

I concur.

@udnes99

udnes99 commented Jan 18, 2023

I went through the changelogs and discovered that the same commit was introduced in Node 16.19.0, which is when we started noticing issues but were unable to reproduce them. Great job with debugging!
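
For anyone who wants a startup warning while a fix is pending, here is a minimal sketch that checks a version string against the releases named in this thread (18.13.0, 16.19.0, and 16.19.1, which is reported further down). The helper name `isReportedAffected` and the version list are assumptions for illustration only; the list is not an authoritative range, and other releases carrying commit dee882e may behave the same way.

```typescript
// Versions named in this thread as showing the live lock.
// Assumption: this list only mirrors the reports here and is not exhaustive.
const REPORTED_AFFECTED: string[] = [
  "18.13.0", // original report
  "16.19.0", // release where the same commit landed on the 16.x line
  "16.19.1", // reported later in this thread
];

// Hypothetical helper: accepts either "v18.13.0" or "18.13.0".
function isReportedAffected(nodeVersion: string): boolean {
  const normalized = nodeVersion.replace(/^v/, "");
  return REPORTED_AFFECTED.indexOf(normalized) !== -1;
}
```

In a Node process you could pass `process.versions.node` to the helper and log a warning, or pin to 18.12.1 as described above.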

@ronag
Member Author

ronag commented Jan 18, 2023

@santigimeno Could you make a revert PR?

santigimeno added a commit to santigimeno/node that referenced this issue Jan 18, 2023
This reverts commit dee882e.

As it's causing: nodejs#46234.
nodejs#42713 to be reopened.
@mcollina
Member

I suggest we have a regression test for this before backporting the revert across all lines.

santigimeno added a commit to santigimeno/node that referenced this issue Jan 18, 2023
This reverts commit dee882e.
Moved the test that demonstrated what this commit was fixing to the
`known_issues` folder.

Fixes: nodejs#46234
@santigimeno
Member

The revert PR is here: #46249. I also moved the test to the known_issues folder.

@VoltrexKeyva VoltrexKeyva added the http2 Issues or PRs related to the http2 subsystem. label Jan 18, 2023
@alexpusch

We encountered this ugly issue as well: our Node process hangs sporadically after the Node 18.13 upgrade. It took me a while to trace it here.
Our HTTP/2 stream use probably comes from the Google Pub/Sub gRPC client.

Trott pushed a commit to Trott/io.js that referenced this issue Feb 18, 2023
This reverts commit dee882e.
Moved the test that demonstrated what this commit was fixing to the
`known_issues` folder.

Fixes: nodejs#46234
@killagu
Contributor

killagu commented Feb 21, 2023

2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0
2023-02-21 15:48:41 [INFO] Http2Session client (858) stream 7 closed with code: 0

Node v16.19.1 has the same problem. The session close never completes.
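
To spot this symptom in a captured log, a rough sketch like the following counts how often each stream is reported closed. It assumes the line format shown above and assumes that a stream should be reported closed only once per session. The function names are hypothetical, and the parenthesized number is treated as an opaque identifier.

```typescript
// Matches lines shaped like:
//   "... Http2Session client (858) stream 7 closed with code: 0"
// Assumption: this format comes from the log excerpt above.
const CLOSE_LINE = /Http2Session (\w+) \((\d+)\) stream (\d+) closed with code: \d+/;

// Count close reports per "<type> (<id>) stream <n>" key.
function countStreamCloses(log: string): { [key: string]: number } {
  const counts: { [key: string]: number } = {};
  for (const line of log.split("\n")) {
    const m = CLOSE_LINE.exec(line);
    if (m === null) continue; // not a stream-close line
    const key = m[1] + " (" + m[2] + ") stream " + m[3];
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Return the keys of streams that were reported closed more than once.
function repeatedCloses(log: string): string[] {
  const counts = countStreamCloses(log);
  return Object.keys(counts).filter((key) => counts[key] > 1);
}
```

A stream that shows up in the result of `repeatedCloses` is only a hint that the close handler is being re-run; it does not by itself confirm the live lock.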

juanarbol pushed a commit that referenced this issue Mar 5, 2023
This reverts commit dee882e.
Moved the test that demonstrated what this commit was fixing to the
`known_issues` folder.

Fixes: #46234
PR-URL: #46721
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
targos pushed a commit that referenced this issue Mar 13, 2023
This reverts commit dee882e.
Moved the test that demonstrated what this commit was fixing to the
`known_issues` folder.

Fixes: #46234
PR-URL: #46721
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
BethGriggs pushed a commit that referenced this issue Mar 27, 2023
This reverts commit dee882e.
Moved the test that demonstrated what this commit was fixing to the
`known_issues` folder.

Fixes: #46234
PR-URL: #46721
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Debadree Chatterjee <debadree333@gmail.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
7 participants