This repository has been archived by the owner on Aug 8, 2023. It is now read-only.

Segmentation fault running "render reports an error when the request function responds with an error" test on node4-clang39-release #11281

Closed
ChrisLoer opened this issue Feb 22, 2018 · 6 comments
Labels
crash Node.js node-mapbox-gl-native tests

Comments

@ChrisLoer
Contributor

Seen at https://circleci.com/gh/mapbox/mapbox-gl-native/74690?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link. Went away after re-running build.

@kkaefer I haven't really looked at this, but you touched threading code recently in #11005, does anything below ring any bells for you?

# render reports an error when the request function responds with an error (glyph)
Segmentation fault (core dumped)
npm ERR! Test failed.  See above for more details.
apitrace: unloaded from /usr/bin/node
Makefile:461: recipe for target 'test-node' failed
make: *** [test-node] Error 1
apitrace: unloaded from /usr/bin/make
[logbt] saw 'apitrace' exit with code:2 (INT)
[logbt] No corefile found at ./core.1541.*
[logbt] Found corefile (non-tracked) at ./core.1578.!usr!bin!node
[logbt] Processing cores...
[New LWP 1578]
[New LWP 1581]
[New LWP 1580]
[New LWP 1583]
[New LWP 1670]
[New LWP 1582]
[New LWP 1671]
[New LWP 1672]
[New LWP 1688]
[New LWP 1579]
[New LWP 1691]
[New LWP 1689]
[New LWP 1690]
[New LWP 1673]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `node /src/node_modules/.bin/tape platform/node/test/js/**/*.test.js'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  v8::Object::GetInternalField (this=0x0, index=1) at node/v4.8.7/include/v8.h:7620
7620	  O* obj = *reinterpret_cast<O**>(this);
[Current thread is 1 (Thread 0x7f2a2de17740 (LWP 1578))]

Thread 14 (Thread 0x7f2a0ebcc700 (LWP 1673)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000fb6699 in uv_cond_wait ()
#2  0x0000000000fa7068 in ?? ()
#3  0x0000000000fb61f9 in ?? ()
#4  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0ebcc700) at pthread_create.c:333
#5  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 13 (Thread 0x7f2a0d191700 (LWP 1690)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f2a191e3b83 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#2  0x00007f2a191e39e7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#3  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0d191700) at pthread_create.c:333
#4  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 12 (Thread 0x7f2a0d992700 (LWP 1689)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f2a191e3b83 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#2  0x00007f2a191e39e7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#3  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0d992700) at pthread_create.c:333
#4  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 11 (Thread 0x7f2a0c990700 (LWP 1691)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f2a191e3b83 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#2  0x00007f2a191e39e7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#3  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0c990700) at pthread_create.c:333
#4  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 10 (Thread 0x7f2a2de25700 (LWP 1579)):
#0  0x00007f2a2c554827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x1909fc0) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x1909fc0, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f2a2c5548d4 in __new_sem_wait_slow (sem=0x1909fc0, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f2a2c55497a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x0000000000fb6522 in uv_sem_wait ()
#5  0x0000000000de7042 in node::DebugSignalThreadMain(void*) ()
#6  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a2de25700) at pthread_create.c:333
#7  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 9 (Thread 0x7f2a0e193700 (LWP 1688)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f2a191e3b83 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#2  0x00007f2a191e39e7 in ?? () from /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so
#3  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0e193700) at pthread_create.c:333
#4  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 8 (Thread 0x7f2a0f3cd700 (LWP 1672)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000fb6699 in uv_cond_wait ()
#2  0x0000000000fa7068 in ?? ()
#3  0x0000000000fb61f9 in ?? ()
#4  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0f3cd700) at pthread_create.c:333
#5  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 7 (Thread 0x7f2a0fbce700 (LWP 1671)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000fb6699 in uv_cond_wait ()
#2  0x0000000000fa7068 in ?? ()
#3  0x0000000000fb61f9 in ?? ()
#4  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a0fbce700) at pthread_create.c:333
#5  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 6 (Thread 0x7f2a2b178700 (LWP 1582)):
#0  0x00007f2a2c554827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x2561f18) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x2561f18, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f2a2c5548d4 in __new_sem_wait_slow (sem=0x2561f18, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f2a2c55497a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x0000000000fc0358 in v8::base::Semaphore::Wait() ()
#5  0x0000000000e5c2b9 in v8::platform::TaskQueue::GetNext() ()
#6  0x0000000000e5c40c in v8::platform::WorkerThread::Run() ()
#7  0x0000000000fc1310 in ?? ()
#8  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a2b178700) at pthread_create.c:333
#9  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 5 (Thread 0x7f2a103cf700 (LWP 1670)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x0000000000fb6699 in uv_cond_wait ()
#2  0x0000000000fa7068 in ?? ()
#3  0x0000000000fb61f9 in ?? ()
#4  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a103cf700) at pthread_create.c:333
#5  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 4 (Thread 0x7f2a2a977700 (LWP 1583)):
#0  0x00007f2a2c554827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x2561f18) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x2561f18, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f2a2c5548d4 in __new_sem_wait_slow (sem=0x2561f18, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f2a2c55497a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x0000000000fc0358 in v8::base::Semaphore::Wait() ()
#5  0x0000000000e5c2b9 in v8::platform::TaskQueue::GetNext() ()
#6  0x0000000000e5c40c in v8::platform::WorkerThread::Run() ()
#7  0x0000000000fc1310 in ?? ()
#8  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a2a977700) at pthread_create.c:333
#9  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 3 (Thread 0x7f2a2c17a700 (LWP 1580)):
#0  0x00007f2a2c554827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x2561f18) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x2561f18, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f2a2c5548d4 in __new_sem_wait_slow (sem=0x2561f18, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f2a2c55497a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x0000000000fc0358 in v8::base::Semaphore::Wait() ()
#5  0x0000000000e5c2b9 in v8::platform::TaskQueue::GetNext() ()
#6  0x0000000000e5c40c in v8::platform::WorkerThread::Run() ()
#7  0x0000000000fc1310 in ?? ()
#8  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a2c17a700) at pthread_create.c:333
#9  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 2 (Thread 0x7f2a2b979700 (LWP 1581)):
#0  0x00007f2a2c554827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x2561f18) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1  do_futex_wait (sem=sem@entry=0x2561f18, abstime=0x0) at sem_waitcommon.c:111
#2  0x00007f2a2c5548d4 in __new_sem_wait_slow (sem=0x2561f18, abstime=0x0) at sem_waitcommon.c:181
#3  0x00007f2a2c55497a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4  0x0000000000fc0358 in v8::base::Semaphore::Wait() ()
#5  0x0000000000e5c2b9 in v8::platform::TaskQueue::GetNext() ()
#6  0x0000000000e5c40c in v8::platform::WorkerThread::Run() ()
#7  0x0000000000fc1310 in ?? ()
#8  0x00007f2a2c54c6ba in start_thread (arg=0x7f2a2b979700) at pthread_create.c:333
#9  0x00007f2a2c2823dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 1 (Thread 0x7f2a2de17740 (LWP 1578)):
#0  v8::Object::GetInternalField (this=0x0, index=1) at node/v4.8.7/include/v8.h:7620
#1  node_mbgl::NodeRequest::Execute (this=<optimized out>) at ../../../platform/node/src/node_request.cpp:127
#2  0x00007f2a29a0040e in node_mbgl::NodeMap::request(mbgl::Resource const&, std::function<void (mbgl::Response)>) (this=<optimized out>, resource=..., callback_=...) at ../../../platform/node/src/node_map.cpp:1169
#3  0x00007f2a29a004ed in non-virtual thunk to node_mbgl::NodeMap::request(mbgl::Resource const&, std::function<void (mbgl::Response)>) () at ../../../platform/node/src/node_map.cpp:1155
#4  0x00007f2a29bebe6b in mbgl::GlyphManager::requestRange (this=<optimized out>, request=..., fontStack=..., range=...) at ../../../src/mbgl/text/glyph_manager.cpp:72
#5  0x00007f2a29beb979 in mbgl::GlyphManager::getGlyphs (this=<optimized out>, requestor=..., glyphDependencies=...) at ../../../src/mbgl/text/glyph_manager.cpp:49
#6  0x00007f2a29cf6824 in mbgl::GeometryTile::getGlyphs (this=<optimized out>, glyphDependencies=...) at ../../../src/mbgl/tile/geometry_tile.cpp:168
#7  0x00007f2a29d217ac in mbgl::MessageImpl<mbgl::GeometryTile, void (mbgl::GeometryTile::*)(std::map<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> >, std::less<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::allocator<std::pair<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> > > > >), std::tuple<std::map<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> >, std::less<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::allocator<std::pair<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> > > > > > >::invoke<0ul> (this=0x0) at ../../../include/mbgl/actor/message.hpp:34
#8  mbgl::MessageImpl<mbgl::GeometryTile, void (mbgl::GeometryTile::*)(std::map<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> >, std::less<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::allocator<std::pair<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> > > > >), std::tuple<std::map<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> >, std::less<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >, std::allocator<std::pair<std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const, std::set<char16_t, std::less<char16_t>, std::allocator<char16_t> > > > > > >::operator() (this=0x0) at ../../../include/mbgl/actor/message.hpp:29
#9  0x00007f2a29a10aad in mbgl::Mailbox::receive (this=0x37a1970) at ../../../src/mbgl/actor/mailbox.cpp:58
#10 0x00007f2a29a10d00 in mbgl::Mailbox::maybeReceive (mailbox=...) at ../../../src/mbgl/actor/mailbox.cpp:67
#11 0x00007f2a29d83478 in mbgl::util::RunLoop::schedule(std::weak_ptr<mbgl::Mailbox>)::{lambda()#1}::operator()() const (this=<optimized out>) at ../../../include/mbgl/util/run_loop.hpp:78
#12 mbgl::WorkTaskImpl<mbgl::util::RunLoop::schedule(std::weak_ptr<mbgl::Mailbox>)::{lambda()#1}, std::tuple<> >::invoke<>(std::integer_sequence<unsigned long>) (this=0x7f2a001c2360) at ../../../include/mbgl/util/work_task_impl.hpp:43
#13 mbgl::WorkTaskImpl<mbgl::util::RunLoop::schedule(std::weak_ptr<mbgl::Mailbox>)::{lambda()#1}, std::tuple<> >::operator()() (this=0x7f2a001c2360) at ../../../include/mbgl/util/work_task_impl.hpp:23
#14 0x00007f2a29d8264e in mbgl::util::RunLoop::process (this=<optimized out>) at ../../../include/mbgl/util/run_loop.hpp:117
#15 0x0000000000fa945b in ?? ()
#16 0x0000000000fa9533 in ?? ()
#17 0x0000000000fb9b00 in ?? ()
#18 0x0000000000faa016 in uv_run ()
#19 0x0000000000df4a20 in node::Start(int, char**) ()
#20 0x00007f2a2c19b830 in __libc_start_main (main=0x723640 <main>, argc=3, argv=0x7ffe723b0458, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffe723b0448) at ../csu/libc-start.c:291
#21 0x00000000007238ad in _start ()
@ChrisLoer
Contributor Author

For what it's worth, I ran npm test from a local image of node4-clang39-release about 40 times and didn't see any problems...

@ChrisLoer
Contributor Author

Aha, I think I see the problem here (and it's nothing to do with #11005) -- although the timing is up to GC, we can potentially tear down the NodeMap object as soon as we get a render failure callback from Map::onResourceError. If the map requests multiple glyph PBFs, it's possible that the request for one errors (as intended by the test), triggers the renderFinished callback, and tears down the NodeMap -- and then the second request runs against an undefined NodeMap and crashes.

🤔 This sort of problem might cause an api-gl crash too?

@ChrisLoer
Contributor Author

Er, actually, all I've learned today is that I don't really understand node bindings, that behavior changes between different versions of node, and that GC timing is out to trick me...

But -- here's what's supposed to happen when NodeMap gets garbage collected: the destructor releases the renderer frontend, which holds the render sources, which hold the GeometryTiles, which call Mailbox::close(), which is supposed to prevent future receive calls on the mailbox from returning any messages.

It is possible for the mailbox to survive after the GeometryTile is torn down if a worker currently holds a temporary shared_ptr to it. In that case, the nodeRunLoop may try processing messages for the mailbox, but it won't get anything. Once the mailbox has been destroyed, the nodeRunLoop will still have a weak_ptr to it, but when it goes to lock it, it will see there's nothing left and move on. Which means I don't have an explanation.

I've gotten confused while debugging by the way node will exit the process as soon as it's done running JavaScript (so there can be trailing calls on the C++ side that never happen), but I don't think that's relevant to this particular crash, since it didn't happen at the end of the test run.

@ChrisLoer
Contributor Author

So... I took the failing test:

test(`render reports an error when the request function responds with an error (${resource})`, function(t) {
    var map = new mbgl.Map({
        request: function(req, callback) {
            var data = mockfs.dataForRequest(req);
            if (mockfs[resource] === data) {
                callback(new Error('message'));
            } else {
                callback(null, { data: data });
            }
        }
    });
    map.load(mockfs.style_vector);
    map.render({ zoom: 16 }, function(err, pixels) {
        t.assert(err);
        t.assert(/message/.test(err.message));
        t.assert(!pixels);
        t.end();
    });
});

... and modified the code to intentionally delay one of the callbacks:

if (!firstCall || resource !== 'glyph') {
    callback(new Error('message'));
} else {
    firstCall = false;
    setTimeout(() => {
        console.log("Deferred error message");
        callback(new Error('message'));
        setTimeout(() => {
            console.log("Kept JS running a bit longer");
        }, 5000);
    }, 10000);
}

I then logged the constructor and destructor for all the NodeMap objects and tracked the one used in the test:

...
Created renderer with fileSource: 0x101d963d8
Created NodeMap: 0x101d963c0
Processing response, fileSource: 0x101d963d8
Processing response, fileSource: 0x101d963d8
Processing response, fileSource: 0x101d963d8
...
Releasing NodeMap: 0x101d963c0
...
Deferred error message
Kept JS running a bit longer

If GC didn't happen, then after the "Deferred error message", you'd see a "Processing response" message, but the destructor for NodeMap also clears the NodeAsyncRequest callback, so that's working as expected.

What's interesting to me here is that it seems to show the NodeMap object is eligible for GC as soon as the map.render callback finishes -- and even though it's eligible for GC, it may still have tasks running in the background, and processing the event loop could cause one of those tasks to do something that interacts with v8. Is it possible GC is actually happening in the middle of the NodeMap::request call?

std::unique_ptr<mbgl::AsyncRequest> NodeMap::request(const mbgl::Resource& resource, mbgl::FileSource::Callback callback_) {
    Nan::HandleScope scope;

    v8::Local<v8::Value> argv[] = {
        Nan::New<v8::External>(this),
        Nan::New<v8::External>(&callback_)
    };

    auto instance = Nan::New(NodeRequest::constructor)->NewInstance(2, argv);

    Nan::Set(instance, Nan::New("url").ToLocalChecked(), Nan::New(resource.url).ToLocalChecked());
    Nan::Set(instance, Nan::New("kind").ToLocalChecked(), Nan::New<v8::Integer>(resource.kind));

    auto request = Nan::ObjectWrap::Unwrap<NodeRequest>(instance);
    request->Execute();

    return std::make_unique<NodeRequest::NodeAsyncRequest>(request);
}

That Nan::New(NodeRequest::constructor)->NewInstance call goes through to v8's heap allocator, and it looks to me like any call to the heap allocator may trigger an immediate GC. So the NodeMap could be valid going into that call but invalid by the time the call returns?

Can anyone who's worked with the node bindings more tell me if I've gone off the deep end or if this sounds possible?

/cc @tmpsantos @jfirebaugh @brunoabinader

@kkaefer kkaefer added the Node.js node-mapbox-gl-native label Feb 23, 2018
@ChrisLoer
Contributor Author

I think my hypothesis may be right. I can pretty reliably trigger what looks like the same crash in a debugger by adding these lines just before the request->Execute() call in NodeMap::request:

    for (int i = 0; i < 5000; i++) {
        // Try to trigger garbage collection
        Nan::New(NodeRequest::constructor)->NewInstance(2, argv);
    }

Now I need to figure out a way to identify when we're in this grey zone of "not yet garbage collected but could be collected at any time so it's unsafe to call into v8".

ChrisLoer added a commit that referenced this issue Feb 24, 2018
Avoids a potential crash if garbage collection happens in the middle of a call to NodeMap::request from a map that's eligible for GC.
Fixes issue #11281
@ChrisLoer
Contributor Author

Fixed in #11298.

ChrisLoer added a commit that referenced this issue Feb 26, 2018
Avoids a potential crash if garbage collection happens in the middle of a call to NodeMap::request from a map that's eligible for GC.
Fixes issue #11281