Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault in v8::internal::compiler::(anonymous namespace)::MayAlias (10.15.2 from Debian Buster) #31484

Closed
TimWolla opened this issue Jan 23, 2020 · 26 comments
Labels
v8 engine Issues and PRs related to the V8 dependency.

Comments

@TimWolla
Copy link

  • Version:
root@*snip*:~# node -v
v10.15.2
root@*snip*:~# dpkg -l nodejs
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version        Architecture Description
+++-==============-==============-============-==================================================
ii  nodejs         10.15.2~dfsg-2 amd64        evented I/O for V8 javascript - runtime executable
  • Platform:

Debian Buster on amd64.

root@*snip*:~# uname -a
Linux *snip* 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux
  • Subsystem:

I'm seeing a more or less regular crash of a non-trivial application within libnode.so.64. It's running on a low-traffic staging system. Find below an excerpt from the syslog of the affected machine. Times are in UTC.

Jan 23 16:45:12 *snip* kernel: [502165.519232] node[21284]: segfault at 8 ip 00007feca7142b6b sp 00007feca1bcb110 error 4 in libnode.so.64[7feca6d02000+abe000]
Jan 23 16:45:12 *snip* kernel: [502165.531802] Code: 48 89 e5 41 54 53 48 83 ec 10 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 b8 01 00 00 00 48 39 f7 74 5c 48 8b 47 08 48 89 f3 <48> 8b 76 08 49 89 fc 48 8d 7d e0 48 89 45 e0 e8 31 a9 bc ff 84 c0
Jan 23 17:13:28 *snip* kernel: [503861.549372] node[21318]: segfault at 8 ip 00007f1bd3ed8b6b sp 00007f1bcb7fd110 error 4 in libnode.so.64[7f1bd3a98000+abe000]
Jan 23 17:13:28 *snip* kernel: [503861.565373] Code: 48 89 e5 41 54 53 48 83 ec 10 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 b8 01 00 00 00 48 39 f7 74 5c 48 8b 47 08 48 89 f3 <48> 8b 76 08 49 89 fc 48 8d 7d e0 48 89 45 e0 e8 31 a9 bc ff 84 c0
Jan 23 18:09:58 *snip* kernel: [507250.954363] node[21363]: segfault at 8 ip 00007f9b01d95b6b sp 00007f9afd820110 error 4 in libnode.so.64[7f9b01955000+abe000]
Jan 23 18:09:58 *snip* kernel: [507250.962099] Code: 48 89 e5 41 54 53 48 83 ec 10 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 b8 01 00 00 00 48 39 f7 74 5c 48 8b 47 08 48 89 f3 <48> 8b 76 08 49 89 fc 48 8d 7d e0 48 89 45 e0 e8 31 a9 bc ff 84 c0
Jan 23 18:38:34 *snip* kernel: [508967.449433] node[21430]: segfault at 8 ip 00007fc0ee66cb6b sp 00007fc0e90f5110 error 4 in libnode.so.64[7fc0ee22c000+abe000]
Jan 23 18:38:34 *snip* kernel: [508967.461002] Code: 48 89 e5 41 54 53 48 83 ec 10 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 b8 01 00 00 00 48 39 f7 74 5c 48 8b 47 08 48 89 f3 <48> 8b 76 08 49 89 fc 48 8d 7d e0 48 89 45 e0 e8 31 a9 bc ff 84 c0
Jan 23 19:07:10 *snip* kernel: [510683.485209] node[21480]: segfault at 8 ip 00007f7527a1db6b sp 00007f7522ca7110 error 4 in libnode.so.64[7f75275dd000+abe000]
Jan 23 19:07:10 *snip* kernel: [510683.493478] Code: 48 89 e5 41 54 53 48 83 ec 10 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 b8 01 00 00 00 48 39 f7 74 5c 48 8b 47 08 48 89 f3 <48> 8b 76 08 49 89 fc 48 8d 7d e0 48 89 45 e0 e8 31 a9 bc ff 84 c0
Jan 23 20:04:52 *snip* kernel: [514145.480506] node[26613]: segfault at 8 ip 00007fe9488dcb6b sp 00007fe93fffe110 error 4 in libnode.so.64[7fe94849c000+abe000]
Jan 23 20:04:52 *snip* kernel: [514145.501037] Code: 48 89 e5 41 54 53 48 83 ec 10 64 48 8b 04 25 28 00 00 00 48 89 45 e8 31 c0 b8 01 00 00 00 48 39 f7 74 5c 48 8b 47 08 48 89 f3 <48> 8b 76 08 49 89 fc 48 8d 7d e0 48 89 45 e0 e8 31 a9 bc ff 84 c0

After becoming aware of the issue I made sure that the process could dump core and I also installed the relevant debug symbols. My understanding is that node crashes within v8's JIT (?) compiler:

Core was generated by `/usr/bin/node dist/server.js'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  v8::internal::compiler::(anonymous namespace)::MayAlias (a=0x221fc90, b=0x0) at ../deps/v8/src/compiler/node.h:267
267	  Type type() const { return type_; }
[Current thread is 1 (Thread 0x7fe93ffff700 (LWP 26613))]
(gdb) list
262	
263	  // Only NodeProperties should manipulate the op.
264	  void set_op(const Operator* op) { op_ = op; }
265	
266	  // Only NodeProperties should manipulate the type.
267	  Type type() const { return type_; }
268	  void set_type(Type type) { type_ = type; }
269	
270	  // Only NodeMarkers should manipulate the marks on nodes.
271	  Mark mark() const { return mark_; }
(gdb) bt
#0  v8::internal::compiler::(anonymous namespace)::MayAlias (a=0x221fc90, b=0x0) at ../deps/v8/src/compiler/node.h:267
#1  0x00007fe9488dcc3c in v8::internal::compiler::(anonymous namespace)::MayAlias (b=0x1f2e570, a=0x221fc90) at ../deps/v8/src/utils.h:355
#2  v8::internal::compiler::(anonymous namespace)::MayAlias (a=0x221fc90, b=0x1f2e570) at ../deps/v8/src/compiler/load-elimination.cc:37
#3  0x00007fe9488dde0d in v8::internal::compiler::LoadElimination::AliasStateInfo::MayAlias (this=0x7fe93fffe230, other=0x1f2e570) at ../deps/v8/src/compiler/load-elimination.cc:679
#4  0x00007fe9488dea2d in v8::internal::compiler::LoadElimination::AbstractField::Kill (this=this@entry=0x7fe9340792e0, alias_info=..., name=name@entry=..., zone=zone@entry=0x1ea7490)
    at ../deps/v8/src/compiler/load-elimination.cc:372
#5  0x00007fe9488deb7c in v8::internal::compiler::LoadElimination::AbstractState::KillFields (this=this@entry=0x7fe93407a250, object=object@entry=0x221fc90, name=..., zone=0x1ea7490)
    at ../deps/v8/src/compiler/load-elimination.cc:653
#6  0x00007fe9488e04f6 in v8::internal::compiler::LoadElimination::ReduceStoreField (this=0x7fe93fffea20, node=0x219eea0) at ../deps/v8/src/zone/zone-allocator.h:37
#7  0x00007fe94891e72b in v8::internal::compiler::(anonymous namespace)::SourcePositionWrapper::Reduce (this=0x229ba18, node=0x219eea0) at ../deps/v8/src/compiler/pipeline.cc:651
#8  0x00007fe94885adc8 in v8::internal::compiler::GraphReducer::Reduce (this=this@entry=0x7fe93fffe920, node=node@entry=0x219eea0) at ../deps/v8/src/compiler/graph-reducer.cc:85
#9  0x00007fe94885b195 in v8::internal::compiler::GraphReducer::ReduceTop (this=this@entry=0x7fe93fffe920) at ../deps/v8/src/compiler/graph-reducer.cc:152
#10 0x00007fe94885b388 in v8::internal::compiler::GraphReducer::ReduceNode (this=this@entry=0x7fe93fffe920, node=<optimized out>) at ../deps/v8/src/compiler/graph-reducer.cc:56
#11 0x00007fe94885b48d in v8::internal::compiler::GraphReducer::ReduceGraph (this=this@entry=0x7fe93fffe920) at ../deps/v8/src/compiler/graph.h:79
#12 0x00007fe94892b19e in v8::internal::compiler::LoadEliminationPhase::Run (this=this@entry=0x7fe93fffeca0, data=0x1fd3f98, temp_zone=temp_zone@entry=0x1ea7490) at ../deps/v8/src/compiler/pipeline.cc:1457
#13 0x00007fe94892c6b3 in v8::internal::compiler::PipelineImpl::Run<v8::internal::compiler::LoadEliminationPhase> (this=0x1fd4118) at ../deps/v8/src/compiler/zone-stats.h:29
#14 v8::internal::compiler::PipelineImpl::OptimizeGraph (this=this@entry=0x1fd4118, linkage=0x220c510) at ../deps/v8/src/compiler/pipeline.cc:1912
#15 0x00007fe94892c710 in v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl (this=0x1fd3de0) at ../deps/v8/src/compiler/pipeline.cc:847
#16 0x00007fe9487fcf91 in v8::internal::OptimizedCompilationJob::ExecuteJob (this=this@entry=0x1fd3de0) at ../deps/v8/src/compiler.cc:223
#17 0x00007fe9487f7fab in v8::internal::OptimizingCompileDispatcher::CompileNext (this=this@entry=0x1e8ae70, job=0x1fd3de0) at ../deps/v8/src/compiler-dispatcher/optimizing-compile-dispatcher.cc:118
#18 0x00007fe9487f826d in v8::internal::OptimizingCompileDispatcher::CompileTask::RunInternal (this=0x1eb8a90) at ../deps/v8/src/compiler-dispatcher/optimizing-compile-dispatcher.cc:69
#19 0x00007fe9485b1162 in node::BackgroundRunner (data=0x1e1f480) at ../src/node_platform.cc:42
#20 0x00007fe945563fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#21 0x00007fe9454944cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

I have the core dump on record and can provide additional information from the core file on request.

The application is launched using systemd. Find below the unit file for your reference:

Unit File (click to expand)
root@*snip*:~# systemctl cat *snip*
# /etc/systemd/system/*snip*.service
#
# Ansible managed
#

[Unit]
Description=*snip*
Requires=mysql.service
After=mysql.service
Wants=*snip*

[Service]
Restart=always
User=*snip*
WorkingDirectory=/srv/*snip*/
ExecStart=/usr/bin/node dist/server.js

Environment=*snip*
Environment=*snip*
Environment=*snip*
Environment=*snip*
Environment=*snip*
EnvironmentFile=/etc/*snip*/environment

StateDirectory=*snip*
StateDirectoryMode=0700
CacheDirectory=*snip*
CacheDirectoryMode=0700
ReadWritePaths=*snip*
ReadWritePaths=/tmp/
# PrivateTmp=true

ProtectSystem=strict
ProtectHome=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
SystemCallFilter=~@cpu-emulation @keyring @module @obsolete @raw-io
NoNewPrivileges=true
LimitCORE=infinity

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/*snip*.service.d/override.conf
[Service]
Environment=DEBUG=*snip*:*
ReadWritePaths=/srv/*snip*/storage/

Find below a list of open files from the process after it was restarted automatically by systemd:

lsof -p $pid (click to expand)
root@*snip*:~# lsof -p 9946
COMMAND  PID    USER   FD      TYPE             DEVICE SIZE/OFF    NODE NAME
node    9946 *snip*  cwd       DIR                8,1     4096  149468 /usr/share/*snip*
node    9946 *snip*  rtd       DIR                8,1     4096       2 /
node    9946 *snip*  txt       REG                8,1    14264   22337 /usr/bin/node
node    9946 *snip*  mem       REG                8,1    93000     955 /usr/lib/x86_64-linux-gnu/libresolv-2.28.so
node    9946 *snip*  mem       REG                8,1    26960     946 /usr/lib/x86_64-linux-gnu/libnss_dns-2.28.so
node    9946 *snip*  mem       REG                8,1    55792     947 /usr/lib/x86_64-linux-gnu/libnss_files-2.28.so
node    9946 *snip*  mem       REG                8,1    89016     944 /usr/lib/x86_64-linux-gnu/libnsl-2.28.so
node    9946 *snip*  mem       REG                8,1  1824496     937 /usr/lib/x86_64-linux-gnu/libc-2.28.so
node    9946 *snip*  mem       REG                8,1   146968     954 /usr/lib/x86_64-linux-gnu/libpthread-2.28.so
node    9946 *snip*  mem       REG                8,1   100712     912 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1
node    9946 *snip*  mem       REG                8,1  1579448     941 /usr/lib/x86_64-linux-gnu/libm-2.28.so
node    9946 *snip*  mem       REG                8,1  1570256    3681 /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
node    9946 *snip*  mem       REG                8,1    35808     956 /usr/lib/x86_64-linux-gnu/librt-2.28.so
node    9946 *snip*  mem       REG                8,1    14592     940 /usr/lib/x86_64-linux-gnu/libdl-2.28.so
node    9946 *snip*  mem       REG                8,1 27190832   21975 /usr/lib/x86_64-linux-gnu/libicudata.so.63.1
node    9946 *snip*  mem       REG                8,1  1886216   21980 /usr/lib/x86_64-linux-gnu/libicuuc.so.63.1
node    9946 *snip*  mem       REG                8,1  2984248   21976 /usr/lib/x86_64-linux-gnu/libicui18n.so.63.1
node    9946 *snip*  mem       REG                8,1   593696    4776 /usr/lib/x86_64-linux-gnu/libssl.so.1.1
node    9946 *snip*  mem       REG                8,1  3031904    4767 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
node    9946 *snip*  mem       REG                8,1   158128   14137 /usr/lib/x86_64-linux-gnu/libnghttp2.so.14.17.1
node    9946 *snip*  mem       REG                8,1    71544   22287 /usr/lib/x86_64-linux-gnu/libcares.so.2.2.0
node    9946 *snip*  mem       REG                8,1   178096   22309 /usr/lib/x86_64-linux-gnu/libuv.so.1.0.0
node    9946 *snip*  mem       REG                8,1   121280    2762 /usr/lib/x86_64-linux-gnu/libz.so.1.2.11
node    9946 *snip*  mem       REG                8,1 22266144   22322 /usr/lib/x86_64-linux-gnu/libnode.so.64
node    9946 *snip*  mem       REG                8,1   165632     932 /usr/lib/x86_64-linux-gnu/ld-2.28.so
node    9946 *snip*    0r      CHR                1,3      0t0    7292 /dev/null
node    9946 *snip*    1u     unix 0x00000000687bd8d8      0t0 1745496 type=STREAM
node    9946 *snip*    2u     unix 0x00000000687bd8d8      0t0 1745496 type=STREAM
node    9946 *snip*    3u  a_inode               0,13        0    7223 [eventpoll]
node    9946 *snip*    4r     FIFO               0,12      0t0 1745504 pipe
node    9946 *snip*    5w     FIFO               0,12      0t0 1745504 pipe
node    9946 *snip*    6r     FIFO               0,12      0t0 1745505 pipe
node    9946 *snip*    7w     FIFO               0,12      0t0 1745505 pipe
node    9946 *snip*    8u  a_inode               0,13        0    7223 [eventfd]
node    9946 *snip*    9u  a_inode               0,13        0    7223 [eventpoll]
node    9946 *snip*   10r     FIFO               0,12      0t0 1745506 pipe
node    9946 *snip*   11w     FIFO               0,12      0t0 1745506 pipe
node    9946 *snip*   12u  a_inode               0,13        0    7223 [eventfd]
node    9946 *snip*   13u  a_inode               0,13        0    7223 [eventpoll]
node    9946 *snip*   14r     FIFO               0,12      0t0 1745507 pipe
node    9946 *snip*   15w     FIFO               0,12      0t0 1745507 pipe
node    9946 *snip*   16u  a_inode               0,13        0    7223 [eventfd]
node    9946 *snip*   17r      CHR                1,3      0t0    7292 /dev/null
node    9946 *snip*   18u     IPv4            1745532      0t0     TCP localhost:55926->localhost:mysql (ESTABLISHED)
node    9946 *snip*   19u     IPv6            1745544      0t0     TCP *:*snip* (LISTEN)
node    9946 *snip*   20u     IPv4            1745536      0t0     TCP localhost:47794->localhost:*snip* (ESTABLISHED)
node    9946 *snip*   21u     IPv6            1745545      0t0     UDP *:*snip* 
node    9946 *snip*   22u     IPv4            1745554      0t0     TCP localhost:55934->localhost:mysql (ESTABLISHED)
node    9946 *snip*   23u     IPv4            1745556      0t0     TCP localhost:55936->localhost:mysql (ESTABLISHED)
node    9946 *snip*   24u     IPv4            1745558      0t0     TCP localhost:55938->localhost:mysql (ESTABLISHED)
node    9946 *snip*   25u     IPv4            1745560      0t0     TCP localhost:55940->localhost:mysql (ESTABLISHED)
node    9946 *snip*   26u     IPv4            1745562      0t0     TCP localhost:55942->localhost:mysql (ESTABLISHED)
node    9946 *snip*   27u     IPv4            1745564      0t0     TCP localhost:55944->localhost:mysql (ESTABLISHED)
node    9946 *snip*   28u     IPv4            1746253      0t0     TCP localhost:55952->localhost:mysql (ESTABLISHED)
node    9946 *snip*   29u     IPv4            1746255      0t0     TCP localhost:55954->localhost:mysql (ESTABLISHED)
node    9946 *snip*   30u     IPv4            1746776      0t0     TCP localhost:56042->localhost:mysql (ESTABLISHED)
@addaleax addaleax added the v8 engine Issues and PRs related to the V8 dependency. label Jan 23, 2020
@addaleax
Copy link
Member

Is there any chance you could share a reproduction? Can you try other Node.js versions and compare the result (including, ideally, verifying that the 10.15.2 binary from https://nodejs.org/en/ or e.g. nvm also exhibits the same issue)?

@TimWolla
Copy link
Author

Is there any chance you could share a reproduction?

No. As I said it's a fairly complex application, containing a bunch of internal business logic. I started trying to rip out JavaScript code that I assume not to be executed within the staging system to reduce the footprint of the application, but so far without any real success. The biggest issue that I am not yet able to reliably trigger the segmentation fault. With an unmodified application it takes roughly 30 minutes to 1 hour to crash with the current traffic on that system.

I can offer to poke around within the core file. I might be able to provide the core file to a nodejs core developer in private. I would need to check with my manager and to clean up the configuration to not leak any private information, such as TLS certificates, though.

Can you try other Node.js versions and compare the result (including, ideally, verifying that the 10.15.2 binary from https://nodejs.org/en/ or e.g. nvm also exhibits the same issue)?

Yes, I can try that. I just downloaded https://nodejs.org/download/release/v10.15.2/node-v10.15.2-linux-x64.tar.gz onto the machine, adjusted the unit file and restarted the service. For my record: As of 23:53:34 UTC the new binary is running with an unmodified application.

@TimWolla
Copy link
Author

TimWolla commented Jan 24, 2020

For my record: As of 23:53:34 UTC the new binary is running with an unmodified application.

That was quick. I started a traffic generator in parallel and the application crashed at 23:59:14 UTC.

Find below the stack trace with the node binary downloaded from nodejs.org:

Core was generated by `/original-node/node-v10.15.2-linux-x64/bin/node dist/server.js'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000cfe112 in v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*) [clone .part.54] ()
[Current thread is 1 (Thread 0x7f269affd700 (LWP 24775))]
(gdb) bt
#0  0x0000000000cfe112 in v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*) [clone .part.54] ()
#1  0x0000000000d0188c in v8::internal::compiler::LoadElimination::AbstractField::Kill(v8::internal::compiler::LoadElimination::AliasStateInfo const&, v8::internal::MaybeHandle<v8::internal::Name>, v8::internal::Zone*) const ()
#2  0x0000000000d03e84 in v8::internal::compiler::LoadElimination::ReduceStoreField(v8::internal::compiler::Node*) ()
#3  0x0000000000d08ba5 in v8::internal::compiler::LoadElimination::Reduce(v8::internal::compiler::Node*) ()
#4  0x0000000000d4a9c8 in v8::internal::compiler::(anonymous namespace)::SourcePositionWrapper::Reduce(v8::internal::compiler::Node*) ()
#5  0x0000000000c7ac5e in v8::internal::compiler::GraphReducer::Reduce(v8::internal::compiler::Node*) ()
#6  0x0000000000c7b051 in v8::internal::compiler::GraphReducer::ReduceTop() ()
#7  0x0000000000c7b4e1 in v8::internal::compiler::GraphReducer::ReduceNode(v8::internal::compiler::Node*) ()
#8  0x0000000000d56bb6 in v8::internal::compiler::LoadEliminationPhase::Run(v8::internal::compiler::PipelineData*, v8::internal::Zone*) ()
#9  0x0000000000d57b10 in v8::internal::compiler::PipelineImpl::OptimizeGraph(v8::internal::compiler::Linkage*) ()
#10 0x0000000000d57b50 in v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl() ()
#11 0x0000000000c109b1 in v8::internal::OptimizedCompilationJob::ExecuteJob() ()
#12 0x0000000000c0b3f0 in v8::internal::OptimizingCompileDispatcher::CompileTask::RunInternal() ()
#13 0x0000000000bc65f6 in v8::internal::CancelableTask::Run() ()
#14 0x0000000000962bb9 in node::BackgroundRunner(void*) ()
#15 0x00007f26a1292fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#16 0x00007f26a11c14cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Edit: After saving the core file and restarting the traffic generator the application crashed at 00:03:13 UTC.

@bnoordhuis
Copy link
Member

Can you open the core dump with gdb and post the output of info registers and disassemble? Thanks.

@TimWolla
Copy link
Author

Sure. Disassembly cut at some location after the current instruction that still fit on my terminal. I can give more disassembly if it is helpful.

From the Debian binary

(gdb) info registers
rax            0x4670001           73859073
rbx            0x0                 0
rcx            0x0                 0
rdx            0x27                39
rsi            0x0                 0
rdi            0x221fc90           35781776
rbp            0x7fe93fffe130      0x7fe93fffe130
rsp            0x7fe93fffe110      0x7fe93fffe110
r8             0x4                 4
r9             0x7fe93407a250      140639577023056
r10            0x7fe93407eea8      140639577042600
r11            0x1                 1
r12            0x221fc90           35781776
r13            0x1f2e570           32695664
r14            0x213d3f8           34853880
r15            0x7fe934079698      140639577020056
rip            0x7fe9488dcb6b      0x7fe9488dcb6b <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+43>
eflags         0x10206             [ PF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
k0             0x0                 0
k1             0x0                 0
k2             0x0                 0
k3             0x0                 0
k4             0x0                 0
k5             0x0                 0
k6             0x0                 0
k7             0x0                 0
(gdb) disassemble
Dump of assembler code for function v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*):
   0x00007fe9488dcb40 <+0>:	push   %rbp
   0x00007fe9488dcb41 <+1>:	mov    %rsp,%rbp
   0x00007fe9488dcb44 <+4>:	push   %r12
   0x00007fe9488dcb46 <+6>:	push   %rbx
   0x00007fe9488dcb47 <+7>:	sub    $0x10,%rsp
   0x00007fe9488dcb4b <+11>:	mov    %fs:0x28,%rax
   0x00007fe9488dcb54 <+20>:	mov    %rax,-0x18(%rbp)
   0x00007fe9488dcb58 <+24>:	xor    %eax,%eax
   0x00007fe9488dcb5a <+26>:	mov    $0x1,%eax
   0x00007fe9488dcb5f <+31>:	cmp    %rsi,%rdi
   0x00007fe9488dcb62 <+34>:	je     0x7fe9488dcbc0 <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+128>
   0x00007fe9488dcb64 <+36>:	mov    0x8(%rdi),%rax
   0x00007fe9488dcb68 <+40>:	mov    %rsi,%rbx
=> 0x00007fe9488dcb6b <+43>:	mov    0x8(%rsi),%rsi
   0x00007fe9488dcb6f <+47>:	mov    %rdi,%r12
   0x00007fe9488dcb72 <+50>:	lea    -0x20(%rbp),%rdi
   0x00007fe9488dcb76 <+54>:	mov    %rax,-0x20(%rbp)
   0x00007fe9488dcb7a <+58>:	callq  0x7fe9484a74b0 <_ZNK2v88internal8compiler4Type5MaybeES2_@plt>
   0x00007fe9488dcb7f <+63>:	test   %al,%al
   0x00007fe9488dcb81 <+65>:	je     0x7fe9488dcbc0 <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+128>
   0x00007fe9488dcb83 <+67>:	mov    (%rbx),%rdx
   0x00007fe9488dcb86 <+70>:	movzwl 0x10(%rdx),%edx
   0x00007fe9488dcb8a <+74>:	cmp    $0x3a,%dx
   0x00007fe9488dcb8e <+78>:	je     0x7fe9488dcc20 <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+224>
   0x00007fe9488dcb94 <+84>:	cmp    $0xca,%dx
   0x00007fe9488dcb99 <+89>:	je     0x7fe9488dcc40 <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+256>
   0x00007fe9488dcb9f <+95>:	cmp    $0x27,%dx
   0x00007fe9488dcba3 <+99>:	je     0x7fe9488dcc20 <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+224>

from the nodejs.org binary

(gdb) info registers
rax            0x4670001           73859073
rbx            0x331d740           53598016
rcx            0x331d760           53598048
rdx            0x4670001           73859073
rsi            0x3125940           51534144
rdi            0x331d740           53598016
rbp            0x7f269affc300      0x7f269affc300
rsp            0x7f269affc2e0      0x7f269affc2e0
r8             0x340cfc0           54579136
r9             0x340ce98           54578840
r10            0x0                 0
r11            0x1                 1
r12            0x0                 0
r13            0x3125940           51534144
r14            0x331d740           53598016
r15            0x7f269407f458      139803669034072
rip            0xcfe112            0xcfe112 <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*) [clone .part.54]+130>
eflags         0x10202             [ IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
k0             0x0                 0
k1             0x0                 0
k2             0x0                 0
k3             0x0                 0
k4             0x0                 0
k5             0x0                 0
k6             0x0                 0
k7             0x0                 0
(gdb) disassemble
Dump of assembler code for function _ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54:
   0x0000000000cfe090 <+0>:	push   %rbp
   0x0000000000cfe091 <+1>:	mov    %rsp,%rbp
   0x0000000000cfe094 <+4>:	push   %r12
   0x0000000000cfe096 <+6>:	push   %rbx
   0x0000000000cfe097 <+7>:	mov    %rdi,%rbx
   0x0000000000cfe09a <+10>:	sub    $0x10,%rsp
   0x0000000000cfe09e <+14>:	mov    (%rsi),%rax
   0x0000000000cfe0a1 <+17>:	movzwl 0x10(%rax),%eax
   0x0000000000cfe0a5 <+21>:	cmp    $0x3a,%ax
   0x0000000000cfe0a9 <+25>:	je     0xcfe0f0 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+96>
   0x0000000000cfe0ab <+27>:	cmp    $0xca,%ax
   0x0000000000cfe0af <+31>:	je     0xcfe198 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+264>
   0x0000000000cfe0b5 <+37>:	cmp    $0x27,%ax
   0x0000000000cfe0b9 <+41>:	je     0xcfe0f0 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+96>
   0x0000000000cfe0bb <+43>:	mov    (%rdi),%rdx
   0x0000000000cfe0be <+46>:	movzwl 0x10(%rdx),%edx
   0x0000000000cfe0c2 <+50>:	cmp    $0x3a,%dx
   0x0000000000cfe0c6 <+54>:	je     0xcfe1e0 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+336>
   0x0000000000cfe0cc <+60>:	cmp    $0xca,%dx
   0x0000000000cfe0d1 <+65>:	je     0xcfe1c8 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+312>
   0x0000000000cfe0d7 <+71>:	cmp    $0x27,%dx
   0x0000000000cfe0db <+75>:	je     0xcfe1e0 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+336>
   0x0000000000cfe0e1 <+81>:	mov    $0x1,%eax
   0x0000000000cfe0e6 <+86>:	add    $0x10,%rsp
   0x0000000000cfe0ea <+90>:	pop    %rbx
   0x0000000000cfe0eb <+91>:	pop    %r12
   0x0000000000cfe0ed <+93>:	pop    %rbp
   0x0000000000cfe0ee <+94>:	retq   
   0x0000000000cfe0ef <+95>:	nop
   0x0000000000cfe0f0 <+96>:	movzbl 0x17(%rsi),%eax
   0x0000000000cfe0f4 <+100>:	and    $0xf,%eax
   0x0000000000cfe0f7 <+103>:	cmp    $0xf,%eax
   0x0000000000cfe0fa <+106>:	je     0xcfe200 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+368>
   0x0000000000cfe100 <+112>:	mov    0x20(%rsi),%r12
   0x0000000000cfe104 <+116>:	cmp    %r12,%rbx
   0x0000000000cfe107 <+119>:	mov    $0x1,%eax
   0x0000000000cfe10c <+124>:	je     0xcfe0e6 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+86>
   0x0000000000cfe10e <+126>:	mov    0x8(%rbx),%rax
=> 0x0000000000cfe112 <+130>:	mov    0x8(%r12),%rsi
   0x0000000000cfe117 <+135>:	lea    -0x20(%rbp),%rdi
   0x0000000000cfe11b <+139>:	mov    %rax,-0x20(%rbp)
   0x0000000000cfe11f <+143>:	callq  0xdce240 <_ZNK2v88internal8compiler4Type5MaybeES2_>
   0x0000000000cfe124 <+148>:	test   %al,%al
   0x0000000000cfe126 <+150>:	je     0xcfe0e6 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+86>
   0x0000000000cfe128 <+152>:	mov    (%r12),%rdx
   0x0000000000cfe12c <+156>:	movzwl 0x10(%rdx),%edx
   0x0000000000cfe130 <+160>:	cmp    $0x3a,%dx
   0x0000000000cfe134 <+164>:	je     0xcfe238 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+424>
   0x0000000000cfe13a <+170>:	cmp    $0xca,%dx
   0x0000000000cfe13f <+175>:	je     0xcfe210 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+384>
   0x0000000000cfe145 <+181>:	cmp    $0x27,%dx
   0x0000000000cfe149 <+185>:	je     0xcfe238 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+424>
   0x0000000000cfe14f <+191>:	mov    (%rbx),%rcx
   0x0000000000cfe152 <+194>:	movzwl 0x10(%rcx),%ecx
   0x0000000000cfe156 <+198>:	cmp    $0x3a,%cx
   0x0000000000cfe15a <+202>:	je     0xcfe171 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+225>
   0x0000000000cfe15c <+204>:	cmp    $0xca,%cx
   0x0000000000cfe161 <+209>:	je     0xcfe270 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+480>
   0x0000000000cfe167 <+215>:	cmp    $0x27,%cx
   0x0000000000cfe16b <+219>:	jne    0xcfe0e6 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_.part.54+86>
   0x0000000000cfe171 <+225>:	movzbl 0x17(%rbx),%eax

Not sure why the disassembly shows mangled function names.

@bnoordhuis
Copy link
Member

Thanks. So in both cases it crashes because the second function argument is a nullptr. What happens when you start node with --noturbo_load_elimination?

Is checking with the latest v12.x or v13.x an option for you? v10.x is still at V8 6.8 and that's positively ancient by now.

The bug may have been fixed in a newer release. v8/v8@b28637b seems like a good candidate.

@TimWolla
Copy link
Author

What happens when you start node with --noturbo_load_elimination?

Started the unmodified application at 13:40:45 UTC using /original-node/node-v10.15.2-linux-x64/bin/node --noturbo_load_elimination dist/server.js.

Is checking with the latest v12.x or v13.x an option for you?

I can check within the staging system with arbitrary node versions (I already downloaded some random precompiled binaries from the Internet upon request from addaleax). For the production system I would have to check with my manager, not using the Debian Packages comes with quite a bit of operational complexity I'd like to avoid.

@TimWolla
Copy link
Author

It's 21:44:50 UTC now. The process is now running since roughly 8 hours without the crash by specifying the --noturbo_load_elimination option. Previously it would more or less reliably crash every 30 minutes to 1 hour. I guess it's safe to say that the option works around the issue. I don't expect a noticeable performance hit, so this might be the best solution for production.

While researching the command line flag I also stumbled upon --trace-turbo-load-elimination. Would you consider it useful to enable this option for debugging? I suppose it would log whenever that specific optimization kicks in?

@bnoordhuis
Copy link
Member

As a quick sanity check, let's see if the regression test from v8/v8@b28637b triggers a crash.

https://ci.nodejs.org/job/node-test-commit-v8-linux/2817/

@bnoordhuis
Copy link
Member

It does so I think we've found our culprit. I've opened #31507 to back-port the fix to v10.x.

@TimWolla
Copy link
Author

FWIW: The CI appears to be behind OAuth I did not attempt to authorize. Not sure whether it was meant for me or just for tracking it yourself.

In any case I trust your judgement here. Thanks for the quick turn around time. If you'd like me to test some special build to verify I'm happy to do so if you give me instructions.

@bnoordhuis
Copy link
Member

For us both, really. :-)

If you could build from source and test with the patch, that'd of course be great. Second-best is if you can try with the next v10.x release because I don't know how often Debian updates their node package.

@TimWolla
Copy link
Author

If you could build from source and test with the patch, that'd of course be great.

I compiled my last node.js from source back in the 0.10.x days. I'll have a look.

because I don't know how often Debian updates their node package.

Not at all. Debian Buster never got an updated nodejs since 2019-04-17 (https://tracker.debian.org/pkg/nodejs). Not even for backported security fixes which is odd. I plan to file a bug with them to get the patch backported to Debian stable once it is verified fixed, though.

@TimWolla
Copy link
Author

I compiled my last node.js from source back in the 0.10.x days. I'll have a look.

I downloaded the 10.15.2 source code, compiled it without modifications to verify that the bug is present with the self compiled version. It crashed in ~16 minutes with a bit of traffic generation. I'm currently recompiling with the patch applied and will report back.

@TimWolla
Copy link
Author

@bnoordhuis Unfortunately I have to report that 10.15.2 from source with the patch applied crashed as well (after 43 minutes). Meanwhile I've adjusted production to run with --noturbo_load_elimination. This would be an acceptable work-around for us until Debian provides something more recent (a 12.x) if you feel spending more effort to chase this crash is wasted effort.

I compiled with (g++ from Debian Buster):

$ head -n30 deps/v8/src/compiler/load-elimination.cc 
// Copyright 2016 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

#include "src/compiler/load-elimination.h"

#include "src/compiler/common-operator.h"
#include "src/compiler/js-graph.h"
#include "src/compiler/node-properties.h"
#include "src/compiler/simplified-operator.h"
#include "src/heap/factory.h"
#include "src/objects-inl.h"

namespace v8 {
namespace internal {
namespace compiler {

namespace {

bool IsRename(Node* node) {
  switch (node->opcode()) {
    case IrOpcode::kFinishRegion:
    case IrOpcode::kTypeGuard:
      return !node->IsDead();
    default:
      return false;
  }
}

Node* ResolveRenames(Node* node) {
$ ./configure --prefix=/my-node/
$ make
$ make install
Core was generated by `/my-node/bin/node dist/server.js'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00005591e9c813ef in v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*) ()
[Current thread is 1 (Thread 0x7fc2ca25d700 (LWP 23963))]
(gdb) bt
#0  0x00005591e9c813ef in v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*) ()
#1  0x00005591e9c84bef in v8::internal::compiler::LoadElimination::AbstractField::Kill(v8::internal::compiler::LoadElimination::AliasStateInfo const&, v8::internal::MaybeHandle<v8::internal::Name>, v8::internal::Zone*) const ()
#2  0x00005591e9c84e8c in v8::internal::compiler::LoadElimination::AbstractState::KillFields(v8::internal::compiler::Node*, v8::internal::MaybeHandle<v8::internal::Name>, v8::internal::Zone*) const ()
#3  0x00005591e9c893c6 in v8::internal::compiler::LoadElimination::ReduceStoreField(v8::internal::compiler::Node*) ()
#4  0x00005591e9ccc75b in v8::internal::compiler::(anonymous namespace)::SourcePositionWrapper::Reduce(v8::internal::compiler::Node*) ()
#5  0x00005591e9bf85cd in v8::internal::compiler::GraphReducer::Reduce(v8::internal::compiler::Node*) ()
#6  0x00005591e9bf89b3 in v8::internal::compiler::GraphReducer::ReduceTop() ()
#7  0x00005591e9bf8d78 in v8::internal::compiler::GraphReducer::ReduceNode(v8::internal::compiler::Node*) ()
#8  0x00005591e9cd9606 in v8::internal::compiler::LoadEliminationPhase::Run(v8::internal::compiler::PipelineData*, v8::internal::Zone*) [clone .constprop.654] ()
#9  0x00005591e9cda5f4 in v8::internal::compiler::PipelineImpl::OptimizeGraph(v8::internal::compiler::Linkage*) ()
#10 0x00005591e9cda620 in v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl() ()
#11 0x00005591e9b90d71 in v8::internal::OptimizedCompilationJob::ExecuteJob() ()
#12 0x00005591e9b8aceb in v8::internal::OptimizingCompileDispatcher::CompileNext(v8::internal::OptimizedCompilationJob*) ()
#13 0x00005591e9b8aff7 in v8::internal::OptimizingCompileDispatcher::CompileTask::RunInternal() ()
#14 0x00005591e98ff705 in node::BackgroundRunner(void*) ()
#15 0x00007fc2cc42ffa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#16 0x00007fc2cc35e4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info registers
rax            0x4670001           73859073
rbx            0x0                 0
rcx            0x5591ed15c120      94085236244768
rdx            0x0                 0
rsi            0x0                 0
rdi            0x5591ed529140      94085240230208
rbp            0x7fc2ca25c190      0x7fc2ca25c190
rsp            0x7fc2ca25c170      0x7fc2ca25c170
r8             0x4                 4
r9             0x5591ed53d798      94085240313752
r10            0x7fc2bc07bb00      140474355006208
r11            0x1                 1
r12            0x5591ed529140      94085240230208
r13            0x7fc2bc080d80      140474355027328
r14            0x5591ed414460      94085239096416
r15            0x7fc2bc07af48      140474355003208
rip            0x5591e9c813ef      0x5591e9c813ef <v8::internal::compiler::(anonymous namespace)::MayAlias(v8::internal::compiler::Node*, v8::internal::compiler::Node*)+31>
eflags         0x10202             [ IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
k0             0x0                 0
k1             0x0                 0
k2             0x0                 0
k3             0x0                 0
k4             0x0                 0
k5             0x0                 0
k6             0x0                 0
k7             0x0                 0
(gdb) disassemble
Dump of assembler code for function _ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_:
   0x00005591e9c813d0 <+0>:	mov    $0x1,%eax
   0x00005591e9c813d5 <+5>:	cmp    %rsi,%rdi
   0x00005591e9c813d8 <+8>:	je     0x5591e9c81450 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+128>
   0x00005591e9c813da <+10>:	push   %rbp
   0x00005591e9c813db <+11>:	mov    %rsp,%rbp
   0x00005591e9c813de <+14>:	push   %r12
   0x00005591e9c813e0 <+16>:	mov    %rdi,%r12
   0x00005591e9c813e3 <+19>:	push   %rbx
   0x00005591e9c813e4 <+20>:	mov    %rsi,%rbx
   0x00005591e9c813e7 <+23>:	sub    $0x10,%rsp
   0x00005591e9c813eb <+27>:	mov    0x8(%rdi),%rax
=> 0x00005591e9c813ef <+31>:	mov    0x8(%rsi),%rsi
   0x00005591e9c813f3 <+35>:	lea    -0x18(%rbp),%rdi
   0x00005591e9c813f7 <+39>:	mov    %rax,-0x18(%rbp)
   0x00005591e9c813fb <+43>:	callq  0x5591e9d4d4a0 <_ZNK2v88internal8compiler4Type5MaybeES2_>
   0x00005591e9c81400 <+48>:	test   %al,%al
   0x00005591e9c81402 <+50>:	je     0x5591e9c81441 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+113>
   0x00005591e9c81404 <+52>:	mov    (%rbx),%rdx
   0x00005591e9c81407 <+55>:	movzwl 0x10(%rdx),%edx
   0x00005591e9c8140b <+59>:	cmp    $0x3a,%dx
   0x00005591e9c8140f <+63>:	je     0x5591e9c814c8 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+248>
   0x00005591e9c81415 <+69>:	cmp    $0xca,%dx
   0x00005591e9c8141a <+74>:	je     0x5591e9c81498 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+200>
   0x00005591e9c8141c <+76>:	cmp    $0x27,%dx
   0x00005591e9c81420 <+80>:	je     0x5591e9c814c8 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+248>
   0x00005591e9c81426 <+86>:	mov    (%r12),%rcx
   0x00005591e9c8142a <+90>:	movzwl 0x10(%rcx),%ecx
   0x00005591e9c8142e <+94>:	cmp    $0x3a,%cx
   0x00005591e9c81432 <+98>:	je     0x5591e9c81470 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+160>
   0x00005591e9c81434 <+100>:	cmp    $0xca,%cx
   0x00005591e9c81439 <+105>:	je     0x5591e9c81458 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+136>
   0x00005591e9c8143b <+107>:	cmp    $0x27,%cx
   0x00005591e9c8143f <+111>:	je     0x5591e9c81470 <_ZN2v88internal8compiler12_GLOBAL__N_18MayAliasEPNS1_4NodeES4_+160>
   0x00005591e9c81441 <+113>:	add    $0x10,%rsp
   0x00005591e9c81445 <+117>:	pop    %rbx
   0x00005591e9c81446 <+118>:	pop    %r12
   0x00005591e9c81448 <+120>:	pop    %rbp
   0x00005591e9c81449 <+121>:	retq   

FWIW: I ran the regression test from the pull request with Debian's node, self compiled node w/o patch and self compiled w/ patch. None of those crashed.

@bnoordhuis
Copy link
Member

Hm, that's sad news. If you're up for it, a debug build (make -j8 -C out BUILDTYPE=Debug - binary at out/Debug/node) might help catch the bug earlier or produce more meaningful stack traces.

I ran the regression test from the pull request with Debian's node, self compiled node w/o patch and self compiled w/ patch. None of those crashed.

Yes, that's expected. Node does a lot of JS bootstrapping that d8, the V8 shell, doesn't - and perturbs the environment the test runs in. Many of V8's regression tests rely on a pristine environment.

./configure --enable-d8 && make -j8 builds d8 if you want to try it out.

@TimWolla
Copy link
Author

If you're up for it, a debug build (make -j8 -C out BUILDTYPE=Debug - binary at out/Debug/node) might help catch the bug earlier or produce more meaningful stack traces.

After adding 3G of swap file to that 1G of memory VM it compiled without being killed due to an out of memory condition and spat out 4.7G of artifacts, filling up the disk to ~100%.

That said: The application is now running on a debug binary … I hope.

@TimWolla
Copy link
Author

Okay, it took 34 minutes for the Debug binary to crash. Compiled as make -C out BUILDTYPE=Debug with g++, copied just the node binary into a different folder.

Core was generated by `/my-node/node dist/server.js'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000564d776eb69a in v8::internal::compiler::Node::type (this=0x0) at ../deps/v8/src/compiler/node.h:267
267	  Type type() const { return type_; }
[Current thread is 1 (Thread 0x7ff812af3700 (LWP 1795))]
(gdb) bt
#0  0x0000564d776eb69a in v8::internal::compiler::Node::type (this=0x0) at ../deps/v8/src/compiler/node.h:267
#1  0x0000564d776eb6fd in v8::internal::compiler::NodeProperties::IsTyped (node=0x0) at ../deps/v8/src/compiler/node-properties.h:186
#2  0x0000564d776eb72a in v8::internal::compiler::NodeProperties::GetType (node=0x0) at ../deps/v8/src/compiler/node-properties.h:188
#3  0x0000564d778027e5 in v8::internal::compiler::(anonymous namespace)::MayAlias (a=0x564d7a5977d0, b=0x0) at ../deps/v8/src/compiler/load-elimination.cc:39
#4  0x0000564d77802874 in v8::internal::compiler::(anonymous namespace)::MayAlias (a=0x564d7a5977d0, b=0x564d7a25e9b0) at ../deps/v8/src/compiler/load-elimination.cc:56
#5  0x0000564d778051ec in v8::internal::compiler::LoadElimination::AliasStateInfo::MayAlias (this=0x7ff812af2320, other=0x564d7a25e9b0) at ../deps/v8/src/compiler/load-elimination.cc:679
#6  0x0000564d77803db9 in v8::internal::compiler::LoadElimination::AbstractField::Kill (this=0x7ff7fc089810, alias_info=..., name=..., zone=0x564d7a4da140) at ../deps/v8/src/compiler/load-elimination.cc:372
#7  0x0000564d778050b0 in v8::internal::compiler::LoadElimination::AbstractState::KillFields (this=0x7ff7fc08a780, object=0x564d7a5977d0, name=..., zone=0x564d7a4da140)
    at ../deps/v8/src/compiler/load-elimination.cc:654
#8  0x0000564d7780677a in v8::internal::compiler::LoadElimination::ReduceStoreField (this=0x7ff812af28f0, node=0x564d7a5fe5a0) at ../deps/v8/src/compiler/load-elimination.cc:979
#9  0x0000564d77802d26 in v8::internal::compiler::LoadElimination::Reduce (this=0x7ff812af28f0, node=0x564d7a5fe5a0) at ../deps/v8/src/compiler/load-elimination.cc:131
#10 0x0000564d7786bfcb in v8::internal::compiler::(anonymous namespace)::SourcePositionWrapper::Reduce (this=0x564d7a634648, node=0x564d7a5fe5a0) at ../deps/v8/src/compiler/pipeline.cc:651
#11 0x0000564d77735c77 in v8::internal::compiler::GraphReducer::Reduce (this=0x7ff812af2b30, node=0x564d7a5fe5a0) at ../deps/v8/src/compiler/graph-reducer.cc:85
#12 0x0000564d77736107 in v8::internal::compiler::GraphReducer::ReduceTop (this=0x7ff812af2b30) at ../deps/v8/src/compiler/graph-reducer.cc:152
#13 0x0000564d777359fe in v8::internal::compiler::GraphReducer::ReduceNode (this=0x7ff812af2b30, node=0x564d7a3d3828) at ../deps/v8/src/compiler/graph-reducer.cc:56
#14 0x0000564d77735bb6 in v8::internal::compiler::GraphReducer::ReduceGraph (this=0x7ff812af2b30) at ../deps/v8/src/compiler/graph-reducer.cc:78
#15 0x0000564d7786fb2e in v8::internal::compiler::LoadEliminationPhase::Run (this=0x7ff812af2c6f, data=0x564d7a193cd8, temp_zone=0x564d7a4da140) at ../deps/v8/src/compiler/pipeline.cc:1457
#16 0x0000564d7787473f in v8::internal::compiler::PipelineImpl::Run<v8::internal::compiler::LoadEliminationPhase> (this=0x564d7a193e58) at ../deps/v8/src/compiler/pipeline.cc:1037
#17 0x0000564d778715e9 in v8::internal::compiler::PipelineImpl::OptimizeGraph (this=0x564d7a193e58, linkage=0x564d7a1453d0) at ../deps/v8/src/compiler/pipeline.cc:1912
#18 0x0000564d7786ca87 in v8::internal::compiler::PipelineCompilationJob::ExecuteJobImpl (this=0x564d7a193b20) at ../deps/v8/src/compiler/pipeline.cc:847
#19 0x0000564d77662c2c in v8::internal::OptimizedCompilationJob::ExecuteJob (this=0x564d7a193b20) at ../deps/v8/src/compiler.cc:223
#20 0x0000564d7765882f in v8::internal::OptimizingCompileDispatcher::CompileNext (this=0x564d7a02e690, job=0x564d7a193b20) at ../deps/v8/src/compiler-dispatcher/optimizing-compile-dispatcher.cc:118
#21 0x0000564d776584c6 in v8::internal::OptimizingCompileDispatcher::CompileTask::RunInternal (this=0x564d7a228c40) at ../deps/v8/src/compiler-dispatcher/optimizing-compile-dispatcher.cc:69
#22 0x0000564d7758732f in v8::internal::CancelableTask::Run (this=0x564d7a228c40) at ../deps/v8/src/cancelable-task.h:148
#23 0x0000564d771cd011 in node::BackgroundRunner (data=0x564d79fbc640) at ../src/node_platform.cc:42
#24 0x00007ff8144c4fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#25 0x00007ff8143f34cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) info registers
rax            0x0                 0
rbx            0x7ff7fc08fa00      140703062096384
rcx            0x0                 0
rdx            0x0                 0
rsi            0x0                 0
rdi            0x0                 0
rbp            0x7ff812af2140      0x7ff812af2140
rsp            0x7ff812af2140      0x7ff812af2140
r8             0x7ff7fc08f6a0      140703062095520
r9             0x1                 1
r10            0x7c                124
r11            0x1                 1
r12            0x564d7a4d5c60      94890764360800
r13            0x7fff36c0f2df      140734112002783
r14            0x7ff812af3700      140703442089728
r15            0x0                 0
rip            0x564d776eb69a      0x564d776eb69a <v8::internal::compiler::Node::type() const+12>
eflags         0x10206             [ PF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
k0             0x0                 0
k1             0x0                 0
k2             0x0                 0
k3             0x0                 0
k4             0x0                 0
k5             0x0                 0
k6             0x0                 0
k7             0x0                 0
(gdb) disassemble
Dump of assembler code for function v8::internal::compiler::Node::type() const:
   0x0000564d776eb68e <+0>:	push   %rbp
   0x0000564d776eb68f <+1>:	mov    %rsp,%rbp
   0x0000564d776eb692 <+4>:	mov    %rdi,-0x8(%rbp)
   0x0000564d776eb696 <+8>:	mov    -0x8(%rbp),%rax
=> 0x0000564d776eb69a <+12>:	mov    0x8(%rax),%rax
   0x0000564d776eb69e <+16>:	pop    %rbp
   0x0000564d776eb69f <+17>:	retq   
End of assembler dump.
(gdb) list
262	
263	  // Only NodeProperties should manipulate the op.
264	  void set_op(const Operator* op) { op_ = op; }
265	
266	  // Only NodeProperties should manipulate the type.
267	  Type type() const { return type_; }
268	  void set_type(Type type) { type_ = type; }
269	
270	  // Only NodeMarkers should manipulate the marks on nodes.
271	  Mark mark() const { return mark_; }

@TimWolla
Copy link
Author

Find attached the output of the Debug node binary running with --trace-turbo-load-elimination. It shows the last seconds of output before the crash. Any private log output by the application itself has been redacted and replaced with *snip*. I have the core dump on record (the stack matches my previous comment).

turbofan_log.txt

@bnoordhuis
Copy link
Member

Thanks, I can see the general shape of the bug now.

The fourth stack frame is a return MayAlias(a, b->InputAt(0)); that I'm reasonably sure ought to be guarded on if (!b->IsDead()). There are a couple of similar calls that I need to check.

If you're up for it, I can write a patch for you to try out.

@TimWolla
Copy link
Author

TimWolla commented Feb 1, 2020

If you're up for it, I can write a patch for you to try out.

Sure, I'm happy to test the patches you throw at me. It's not too much effort for me to test them, most of the time is spent waiting to find whether it crashes or not.

@bnoordhuis
Copy link
Member

https://github.com/bnoordhuis/io.js/commit/473868ecc2.patch

Passes make test locally. Let's see if it also passes V8's test suite: https://ci.nodejs.org/job/node-test-commit-v8-linux/2834/

@TimWolla
Copy link
Author

TimWolla commented Feb 1, 2020

I've applied that patch and recompiled node.js. The process was started with the new binary at 14:31:34 UTC. Let's wait and see.

@TimWolla
Copy link
Author

TimWolla commented Feb 1, 2020

It's 23:00:00 UTC now. The process is running fine since 8.5 hours. I guess it's safe to say that the patch indeed fixes my issue. Thanks.

bnoordhuis added a commit to bnoordhuis/io.js that referenced this issue Feb 2, 2020
This commit back-ports the implementations of IsRename() and MayAlias()
from the upstream 8.0 branch wholesale. Fixes several bugs where V8's
load elimination pass considered values to be alive when they weren't.

Fixes: nodejs#31484
@bnoordhuis
Copy link
Member

Thanks for testing. I've opened #31613.

BethGriggs pushed a commit that referenced this issue Feb 26, 2020
This commit back-ports the implementations of IsRename() and MayAlias()
from the upstream 8.0 branch wholesale. Fixes several bugs where V8's
load elimination pass considered values to be alive when they weren't.

Fixes: #31484

PR-URL: #31613
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Beth Griggs <Bethany.Griggs@uk.ibm.com>
@BethGriggs
Copy link
Member

@TimWolla, #31613 landed in v10.20.0 - so this issue is now hopefully resolved. I'm going to close this out, but please let us know if you're still seeing this issue and we can reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v8 engine Issues and PRs related to the V8 dependency.
Projects
None yet
Development

No branches or pull requests

5 participants