Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node crashes with "free(): invalid pointer" when exiting #21845

Closed
LionNatsu opened this issue Jul 17, 2018 · 8 comments
Closed

node crashes with "free(): invalid pointer" when exiting #21845

LionNatsu opened this issue Jul 17, 2018 · 8 comments

Comments

@LionNatsu
Copy link

  • Version: 10.0.0+
  • Platform: x86_64 GNU/Linux
  • Subsystem: crypto, OpenSSL

Reproduce:

$ /bin/node -e ''
free(): invalid pointer
Aborted (core dumped)

Backtrace:

#0  0x00007ffff714f647 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff7151031 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff719a429 in __libc_message () from /usr/lib/libc.so.6
#3  0x00007ffff71a142a in malloc_printerr () from /usr/lib/libc.so.6
#4  0x00007ffff71a8984 in free () from /usr/lib/libc.so.6
#5  0x00007ffff7f08632 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() () from /usr/lib/libstdc++.so.6
#6  0x00007ffff71537fa in __run_exit_handlers () from /usr/lib/libc.so.6
#7  0x00007ffff71538fa in exit () from /usr/lib/libc.so.6
#8  0x00007ffff713256e in __libc_start_main () from /usr/lib/libc.so.6
#9  0x00005555559a6dfa in _start ()
@addaleax
Copy link
Member

Can you provide more information about your system? Which versions of Node.js did you test? Did you download the binaries from https://nodejs.org/en/, or was this installed through a package manager? Can you give more information about how this is related to crypto code?

@LionNatsu
Copy link
Author

LionNatsu commented Jul 17, 2018

Can you give more information about how this is related to crypto code?

Sure, I did some research on the bug for a few days. This maybe looks like a report emmm...

I can now sure this issue can be trigger by:

  1. version >= 10.0.0, or after Upgrade to OpenSSL-1.1.0h #19794
  2. statically linked OpenSSL
  3. adding PIE (position independent executable) feature flags when building node
  4. using the path to run node, but not using a single node command.

I am packaging for a Linux distribution, I built and tested the binaries.

I found this does not crash:

node -e ''

i.e. without path prefix.

And the memory mapping between two cases is different:
The node one looks normal; the crashed /bin/node one has TWO /bin/node mapped into the memory. I started them in REPL mode (i.e. remove the -e '' part), the memory mappings was different, so that means something happened while node was initialising in a very early stage.

Using gdb, I found OpenSSL called dlopen:

(gdb) bt
#0  0x00007ffff7bd1fe0 in dlopen () from /usr/lib/libdl.so.2
#1  0x00005555564af6cc in dlfcn_load (dso=0x555558975280) at ../deps/openssl/openssl/crypto/dso/dso_dlfcn.c:111
#2  0x00005555564b01f0 in DSO_load (dso=dso@entry=0x0, filename=<optimized out>, filename@entry=0x555558975240 "/var/cache/acbs/build/acbs.e9ec1l0j/node-10.5.0/node", meth=meth@entry=0x0, flags=flags@entry=4) at ../deps/openssl/openssl/crypto/dso/dso_lib.c:160
#3  0x00005555564b056f in DSO_dsobyaddr (addr=addr@entry=0x555558902d04 <base_inited>, flags=flags@entry=4) at ../deps/openssl/openssl/crypto/dso/dso_lib.c:333
#4  0x00005555564ebe7b in ossl_init_base () at ../deps/openssl/openssl/crypto/init.c:116
#5  ossl_init_base_ossl_ () at ../deps/openssl/openssl/crypto/init.c:69
#6  0x00007ffff74d5737 in __pthread_once_slow () from /usr/lib/libpthread.so.0
#7  0x000055555651ec09 in CRYPTO_THREAD_run_once (once=once@entry=0x555558902d08 <base>, init=init@entry=0x5555564ebe20 <ossl_init_base_ossl_>) at ../deps/openssl/openssl/crypto/threads_pthread.c:106
#8  0x00005555564ec0b3 in OPENSSL_init_crypto (opts=opts@entry=0, settings=settings@entry=0x0) at ../deps/openssl/openssl/crypto/init.c:523
#9  0x000055555650bcad in do_rand_lock_init () at ../deps/openssl/openssl/crypto/rand/md_rand.c:92
#10 do_rand_lock_init_ossl_ () at ../deps/openssl/openssl/crypto/rand/md_rand.c:90
#11 0x00007ffff74d5737 in __pthread_once_slow () from /usr/lib/libpthread.so.0
#12 0x000055555651ec09 in CRYPTO_THREAD_run_once (once=once@entry=0x555558902e34 <rand_lock_init>, init=init@entry=0x55555650bca0 <do_rand_lock_init_ossl_>) at ../deps/openssl/openssl/crypto/threads_pthread.c:106
#13 0x000055555650bd89 in rand_status () at ../deps/openssl/openssl/crypto/rand/md_rand.c:545
#14 0x0000555555c07c90 in node::crypto::CheckEntropy () at ../src/node_crypto.cc:303
#15 node::crypto::EntropySource (buffer=0x7fffffffdbf8 "", length=8) at ../src/node_crypto.cc:317
#16 0x00005555566b7f50 in v8::base::RandomNumberGenerator::RandomNumberGenerator (this=0x555558905228 <v8::base::(anonymous namespace)::platform_random_number_generator+8>) at ../deps/v8/src/base/utils/random-number-generator.cc:36
#17 0x00005555566b6cfb in std::function<void ()>::operator()() const (this=0x555558905220 <v8::base::(anonymous namespace)::platform_random_number_generator>) at /usr/include/c++/7.3.1/bits/std_function.h:706
#18 v8::base::CallOnceImpl(long*, std::function<void ()>) (once=once@entry=0x555558905220 <v8::base::(anonymous namespace)::platform_random_number_generator>, init_func=...) at ../deps/v8/src/base/once.cc:37
#19 0x00005555566b986c in v8::base::CallOnce<void> (arg=0x555558905228 <v8::base::(anonymous namespace)::platform_random_number_generator+8>, 
    init_func=0x5555566b93a0 <v8::base::LazyInstanceImpl<v8::base::RandomNumberGenerator, v8::base::StaticallyAllocatedInstanceTrait<v8::base::RandomNumberGenerator>, v8::base::DefaultConstructTrait<v8::base::RandomNumberGenerator>, v8::base::ThreadSafeInitOnceTrait, v8::base::LeakyInstanceTrait<v8::base::RandomNumberGenerator> >::InitInstance(void*)>, 
    once=0x555558905220 <v8::base::(anonymous namespace)::platform_random_number_generator>) at ../deps/v8/src/base/once.h:98
#20 v8::base::ThreadSafeInitOnceTrait::Init<void (*)(void*), void*> (storage=0x555558905228 <v8::base::(anonymous namespace)::platform_random_number_generator+8>, 
    function=0x5555566b93a0 <v8::base::LazyInstanceImpl<v8::base::RandomNumberGenerator, v8::base::StaticallyAllocatedInstanceTrait<v8::base::RandomNumberGenerator>, v8::base::DefaultConstructTrait<v8::base::RandomNumberGenerator>, v8::base::ThreadSafeInitOnceTrait, v8::base::LeakyInstanceTrait<v8::base::RandomNumberGenerator> >::InitInstance(void*)>, 
    once=0x555558905220 <v8::base::(anonymous namespace)::platform_random_number_generator>) at ../deps/v8/src/base/lazy-instance.h:146
#21 v8::base::LazyInstanceImpl<v8::base::RandomNumberGenerator, v8::base::StaticallyAllocatedInstanceTrait<v8::base::RandomNumberGenerator>, v8::base::DefaultConstructTrait<v8::base::RandomNumberGenerator>, v8::base::ThreadSafeInitOnceTrait, v8::base::LeakyInstanceTrait<v8::base::RandomNumberGenerator> >::Init (
    this=0x555558905220 <v8::base::(anonymous namespace)::platform_random_number_generator>) at ../deps/v8/src/base/lazy-instance.h:177
#22 v8::base::LazyInstanceImpl<v8::base::RandomNumberGenerator, v8::base::StaticallyAllocatedInstanceTrait<v8::base::RandomNumberGenerator>, v8::base::DefaultConstructTrait<v8::base::RandomNumberGenerator>, v8::base::ThreadSafeInitOnceTrait, v8::base::LeakyInstanceTrait<v8::base::RandomNumberGenerator> >::Pointer (
    this=0x555558905220 <v8::base::(anonymous namespace)::platform_random_number_generator>) at ../deps/v8/src/base/lazy-instance.h:182
#23 v8::base::OS::GetRandomMmapAddr () at ../deps/v8/src/base/platform/platform-posix.cc:222
#24 0x0000555555cdeb40 in v8::internal::GetRandomMmapAddr () at ../deps/v8/src/allocation.cc:142
#25 0x000055555606d18e in v8::internal::Heap::SetUp (this=this@entry=0x55555891d620) at ../deps/v8/src/heap/heap.cc:4567
#26 0x0000555556108d2f in v8::internal::Isolate::Init (this=this@entry=0x55555891d600, des=des@entry=0x7fffffffe040) at ../deps/v8/src/isolate.cc:3000
#27 0x00005555563236c2 in v8::internal::Snapshot::Initialize (isolate=isolate@entry=0x55555891d600) at ../deps/v8/src/snapshot/snapshot-common.cc:54
#28 0x0000555555d080b8 in v8::IsolateNewImpl (isolate=0x55555891d600, params=...) at ../deps/v8/src/api.cc:8379
#29 0x0000555555d08141 in v8::Isolate::New (params=...) at ../deps/v8/src/api.cc:8326
#30 0x0000555555b2ce9f in node::NewIsolate (allocator=allocator@entry=0x55555891d5e0) at ../src/node.cc:4141
#31 0x0000555555b2ebb3 in node::Start (exec_argv=0x55555891bec0, exec_argc=2, argv=0x55555891bdc0, argc=1, event_loop=0x5555588f0100 <default_loop_struct>) at ../src/node.cc:4159
#32 node::Start (argc=<optimized out>, argv=0x55555891bdc0) at ../src/node.cc:4236
#33 0x00007ffff7132567 in __libc_start_main () from /usr/lib/libc.so.6
#34 0x0000555555aeb39a in _start ()
(gdb) p (char*) $rdi
$6 = 0x5555589753b0 "/bin/node"

($rdi is the first argument in calling convention in x86-64)
So, this is why it crashes. It accidentially loaded it whole program itself.

@LionNatsu
Copy link
Author

gdb showed the __exit_funcs (a linked table of all the exit handlers in glibc) was flush after the mistaken dlopen was called.

The direct reason of crashing is libstdc++ found the address of the string to be destructed is different from the default static one (the empty string object), so it thought it has been malloc'ed, but it has not, all this was because dlopen remapped program and relocated memory to another place again (PIE). So libstdc++ tried to delete it. delete calls free, free realised the address is not a chunk from malloc, SIGABRT.

@LionNatsu
Copy link
Author

LionNatsu commented Jul 17, 2018

From:

#4  0x00005555564ebe7b in ossl_init_base () at ../deps/openssl/openssl/crypto/init.c:116

we can see why OpenSSL tries to dlopen() the main program itself.

If OpenSSL is compiled to a shared object (libssl.so.???, libcrypto.so.???), it will try to keep itself(libcrypto.so in this case) loaded in two ways:

  • -Wl,-znodelete, passing nodelete to linker.
  • "Deliberately leak a reference", dlopen itself
    /*
     * Deliberately leak a reference to ourselves. This will force the library
     * to remain loaded until the atexit() handler is run at process exit.
     */
    {
        DSO *dso = NULL;

        ERR_set_mark();
        dso = DSO_dsobyaddr(&base_inited, DSO_FLAG_NO_UNLOAD_ON_FREE);
        DSO_free(dso);
        ERR_pop_to_mark();
    }

But this is pointless for us because node link the bundled OpenSSL statically. It becomes to dlopen the node instead of libcrypto.so.

Why does node work fine without PIE?

dlopen() cannot load an executable file. It failed silently. Home run!

Why does node crash with PIE?

PIE changes the type of executable files from ET_EXEC to ET_DYN, i.e. acts like a .so file, can be loaded by dlopen (accidentally).

Why does node work fine without path?

OpenSSL uses dladdr() to get the path of the current module. If the module is the main program, dladdr() returns the first command argument argv[0]. if there is no a "/", OpenSSL will try to load "lib"+argv[0]+".so". Hence "libnode.so" does not exist, failed silently, good news.

When it is "/bin/node", no modify is needed, dlopen it directly. Oops.

Sent a PR for OpenSSL upstream here: openssl/openssl#6725

@LionNatsu
Copy link
Author

Postscript:
The original issue was I cannot use npm --global ...:

$ sudo npm --global
Error: could not get uid/gid
[ 'nobody', 1000 ]
free(): invalid pointer

    at /usr/lib/node_modules/npm/node_modules/uid-number/uid-number.js:37:16
    at ChildProcess.exithandler (child_process.js:298:5)
    at ChildProcess.emit (events.js:182:13)
    at maybeClose (internal/child_process.js:961:16)
    at Socket.stream.socket.on (internal/child_process.js:380:11)
    at Socket.emit (events.js:182:13)
    at Pipe._handle.close (net.js:595:12)
TypeError: Cannot read property 'get' of undefined
    at errorHandler (/usr/lib/node_modules/npm/lib/utils/error-handler.js:205:18)
    at /usr/lib/node_modules/npm/bin/npm-cli.js:76:20
    at cb (/usr/lib/node_modules/npm/lib/npm.js:228:22)
    at /usr/lib/node_modules/npm/lib/npm.js:266:24
    at /usr/lib/node_modules/npm/lib/config/core.js:83:7
    at Array.forEach (<anonymous>)
    at /usr/lib/node_modules/npm/lib/config/core.js:82:13
    at f (/usr/lib/node_modules/npm/node_modules/once/once.js:25:25)
    at afterExtras (/usr/lib/node_modules/npm/lib/config/core.js:173:20)
    at Conf.<anonymous> (/usr/lib/node_modules/npm/lib/config/core.js:231:22)
/usr/lib/node_modules/npm/lib/utils/error-handler.js:205
  if (npm.config.get('json')) {
                 ^

TypeError: Cannot read property 'get' of undefined
    at process.errorHandler (/usr/lib/node_modules/npm/lib/utils/error-handler.js:205:18)
    at process.emit (events.js:182:13)
    at process._fatalException (internal/bootstrap/node.js:441:27)

The iceberg under the surface is not so obvious anyways...

Tested:

  • 10.5.0 (reproducible)
  • 10.2.1 (reproducible)
  • 9.11.1 (not reproducible)

LionNatsu added a commit to LionNatsu/node that referenced this issue Jul 17, 2018
The statically linked program should not try to keep loaded
by dlopen trick.

Fixes: nodejs#21845
Refs: openssl/openssl#6725
@gireeshpunathil
Copy link
Member

@LionNatsu - does #28045 fix this?

@LionNatsu
Copy link
Author

Thanks. I'll do a test to compare if it has fixed it ASAP

@LionNatsu
Copy link
Author

The question for OpenSSL remains, I can't see why OpenSSL "pinshared" when it was built statically. Isn't there an option rule can toggle this automatically?

Anyways, the bug was fixed. the new introduced no-pinshared option works pretty well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants