mbgl-node segfaults in node 10 when outstanding http requests complete after map is GCed #12252
Comments
I'm getting the same error, anything I can do to clarify the issue further? |
This is a stab in the dark, but the fact that the call trace shows a segfault inside Can you provide more detail on the version you're using, like a tag or version number? The latest release (4.1.0 at https://www.npmjs.com/package/@mapbox/mapbox-gl-native) has the fix for issue #11281, although this could definitely be a different crash following the same pattern. |
I’m using 4.1.0, but the same issue exists in 4.0.0. I’m on Node 10 on macOS.
I’ll post a more detailed trace and system info later.
|
Ok, so here we go. I'm running the generation of the tiles within a micro service. I'm currently not even doing a
Output of SegfaultHandler
Demo source (styleSourcePath points to the standard openmaptiles osm-bright-gl-style.json)
|
Thanks @flurin, that's useful information. I haven't gotten a local reproduction of this working yet, and to be honest it takes me a long time whenever I have to re-wrap my head around node/v8 (and I haven't debugged this with node 10 before). The immediate thing that jumps out at me is that the
Until one of us gets a chance to dig into this, is it possible as a workaround for you to modify your code to hold onto the map references so they don't get GCed while there are outstanding requests? |
Thanks @ChrisLoer! However, I'm having no luck determining whether or not a request is still outstanding, especially since in the example above the callbacks are called immediately. Can you point me in the right direction? I did manage to create a simpler reproduction, though, that has no dependencies other than mapbox-gl-native. This example consistently fails after about 24 iterations.
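One possible way to see whether requests are still outstanding is to wrap the `request` function that gets passed to the map and count calls that have not yet had their callback invoked. This is only a sketch: `trackRequests` and `outstanding` are hypothetical names, and it assumes that a request counts as finished once its callback has fired on the JavaScript side, which may not match the lifetime of the native request object.

```javascript
// Wraps a request function of the shape (req, callback) and counts how
// many calls are still waiting for their callback.
// `trackRequests` is a hypothetical helper, not part of mapbox-gl-native.
function trackRequests(request) {
  let outstanding = 0;

  const wrapped = (req, callback) => {
    outstanding++;
    let settled = false;
    request(req, (...args) => {
      // Guard against a callback being invoked more than once.
      if (!settled) {
        settled = true;
        outstanding--;
      }
      callback(...args);
    });
  };

  // Number of requests started but not yet completed.
  wrapped.outstanding = () => outstanding;
  return wrapped;
}
```

A caller could keep its map in a long-lived variable or collection and only drop the reference once `outstanding()` returns 0.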
|
The map has a
The way we use the gl-native node module when we do server-side rendering is we keep a pool of map objects around, and when a request comes in we acquire one of the maps from the pool and use it to render. On top of sidestepping these lifetime issues, keeping the map objects around avoids lots of redundant work that would otherwise happen during the map object creation.
Thanks for the even simpler reproduction case! Incidentally, is it easy for you to try your experiment with Node 8? I know there were changes related to the async interface in Node 10 (@kkaefer just recently updated our version of |
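The pooling approach described above could be sketched roughly like this. It is not the pool Mapbox uses; `MapPool` and `createMap` are hypothetical names, and the factory is supplied by the caller so the sketch does not depend on any particular map options.

```javascript
// Sketch of a fixed-size pool. `createMap` is a caller-supplied factory
// (hypothetical here); with this module it might be something like
// () => { const m = new mbgl.Map({ request }); m.load(style); return m; }
class MapPool {
  constructor(createMap, size) {
    this.all = [];      // strong references to every map, idle or in use
    this.idle = [];     // maps ready to be handed out
    this.waiters = [];  // resolvers for callers waiting on a free map
    for (let i = 0; i < size; i++) {
      const map = createMap();
      this.all.push(map);
      this.idle.push(map);
    }
  }

  // Resolves with a map as soon as one is free.
  acquire() {
    if (this.idle.length > 0) {
      return Promise.resolve(this.idle.pop());
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  // Hands the map to the oldest waiter, or returns it to the idle list.
  release(map) {
    const next = this.waiters.shift();
    if (next) {
      next(map);
    } else {
      this.idle.push(map);
    }
  }
}
```

A request handler would await `pool.acquire()`, render with the map it receives, and call `pool.release(map)` from the render callback. Because `all` keeps a reference to every map for as long as the pool itself is alive, the maps never become eligible for garbage collection between requests.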
Can reproduce in Docker |
Using a pool of maps is an interesting idea. That may also reduce our render times. I'll look into that! Thanks for the tip! For the reproduction, it is as @hkrutzer says: the minimal script above runs fine in Node 8. So it's definitely something to do with Node 10. Let me know if I can help out further in tracking this down.
|
As a workaround using a pool and keeping the reference works for now. Also much faster this way. The only tricky bit is that you cannot call |
Running the script above with |
Further experimenting with Node 10 and a global pool of map instances still gives a segfault now and then (admittedly a lot less often). I'm switching back to Node 8 for now. |
This issue has been automatically detected as stale because it has not had recent activity and will be archived. Thank you for your contributions. |
This hasn't been fixed yet. |
I believe this is now fixed by #14847. Please re-open if issues continue. |
I receive random segmentation faults. This occurs under Node 10, using both the latest release of mapbox-gl-native and the latest code from git. I do not have a self-contained test case where it occurs; however, I have attached the debugging information I do have. Concurrency does not appear to be a factor in reproducing it. Sometimes it will segfault on the first run; sometimes an ab -c 800 -n100000 will run 500 times with no issue.
darcy@HOST:~/src$ node -v
v10.5.0
darcy@HOST:~/src$ uname -a
Linux HOST 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 18:02:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
PID 25796 received SIGSEGV for address: 0x0
/home/darcy/src/node_modules/segfault-handler/build/Release/segfault-handler.node(+0x2ed7)[0x7fbfaaab5ed7]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7fbfad476890]
node(_ZN4node16EmitAsyncDestroyEPN2v87IsolateENS_13async_contextE+0x15)[0x86a785]
/home/darcy/src/node_modules/@mapbox/mapbox-gl-native/lib/mbgl-node.abi-64.node(+0x565b0)[0x7fbf922325b0]
/home/darcy/src/node_modules/@mapbox/mapbox-gl-native/lib/mbgl-node.abi-64.node(+0x6b0c4)[0x7fbf922470c4]
/home/darcy/src/node_modules/@mapbox/mapbox-gl-native/lib/mbgl-node.abi-64.node(+0x69386)[0x7fbf92245386]
/home/darcy/src/node_modules/@mapbox/mapbox-gl-native/lib/mbgl-node.abi-64.node(+0x693b6)[0x7fbf922453b6]
/home/darcy/src/node_modules/@mapbox/mapbox-gl-native/lib/mbgl-node.abi-64.node(+0x57414)[0x7fbf92233414]
node(_ZN2v88internal13GlobalHandles31DispatchPendingPhantomCallbacksEb+0xc3)[0xe42a23]
node(_ZN2v88internal13GlobalHandles31PostGarbageCollectionProcessingENS0_16GarbageCollectorENS_15GCCallbackFlagsE+0x2a)[0xe42c4a]
node(_ZN2v88internal4Heap24PerformGarbageCollectionENS0_16GarbageCollectorENS_15GCCallbackFlagsE+0x1eb)[0xe80d7b]
node(_ZN2v88internal4Heap14CollectGarbageENS0_15AllocationSpaceENS0_23GarbageCollectionReasonENS_15GCCallbackFlagsE+0x194)[0xe81c74]
node(_ZN2v88internal4Heap20AllocateRawWithRetryEiNS0_15AllocationSpaceENS0_19AllocationAlignmentE+0x45)[0xe845a5]
node(_ZN2v88internal7Factory15NewFillerObjectEibNS0_15AllocationSpaceE+0x24)[0xe4cad4]
node(_ZN2v88internal26Runtime_AllocateInNewSpaceEiPPNS0_6ObjectEPNS0_7IsolateE+0x6e)[0x10ecdbe]
[0x3638ff7841bd]
Thread 1 "node" received signal SIGSEGV, Segmentation fault.
0x000000000086a785 in node::EmitAsyncDestroy(v8::Isolate*, node::async_context) ()
(gdb) bt
#0 0x000000000086a785 in node::EmitAsyncDestroy(v8::Isolate*, node::async_context) ()
#1 0x00007fffdbaf85b0 in Nan::AsyncResource::~AsyncResource (this=0x50b51b0, __in_chrg=) at ../../headers/nan/2.10.0/nan.h:513
#2 0x00007fffdbb0d0c4 in Nan::AsyncWorker::~AsyncWorker (this=0x5156ed8, __in_chrg=) at ../../headers/nan/2.10.0/nan.h:1801
#3 0x00007fffdbb0b386 in node_mbgl::NodeRequest::~NodeRequest (this=0x5156ec0, __in_chrg=)
at ../../../platform/node/src/node_request.cpp:18
#4 0x00007fffdbb0b3b6 in node_mbgl::NodeRequest::~NodeRequest (this=0x5156ec0, __in_chrg=)
at ../../../platform/node/src/node_request.cpp:25
#5 0x00007fffdbaf9414 in Nan::ObjectWrap::WeakCallback (info=...) at ../../headers/nan/2.10.0/nan_object_wrap.h:126
#6 0x0000000000e42a23 in v8::internal::GlobalHandles::DispatchPendingPhantomCallbacks(bool) ()
#7 0x0000000000e42c4a in v8::internal::GlobalHandles::PostGarbageCollectionProcessing(v8::internal::GarbageCollector, v8::GCCallbackFlags) ()
#8 0x0000000000e80d7b in v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) ()
#9 0x0000000000e81c74 in v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) ()
#10 0x0000000000e845a5 in v8::internal::Heap::AllocateRawWithRetry(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) ()
#11 0x0000000000e4cad4 in v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) ()
#12 0x00000000010ecdbe in v8::internal::Runtime_AllocateInNewSpace(int, v8::internal::Object**, v8::internal::Isolate*) ()
#13 0x00001729be5841bd in ?? ()
#14 0x00001729be584121 in ?? ()
#15 0x00007fffffff8ce0 in ?? ()
#16 0x0000000000000006 in ?? ()
#17 0x00007fffffff8d70 in ?? ()
#18 0x00001729be87a96e in ?? ()
#19 0x0000016800000000 in ?? ()
#20 0x0000000000000037 in ?? ()
---Type to continue, or q to quit---
#21 0x000000000000001a in ?? ()
#22 0x0000000000000037 in ?? ()
#23 0x0000000000000020 in ?? ()
#24 0x00000a89dd582201 in ?? ()
#25 0x00000a89dd582289 in ?? ()
#26 0x00000a89dd582381 in ?? ()
#27 0x00000a4317a1e219 in ?? ()
#28 0x000007d8dfe79869 in ?? ()
#29 0x00000a89dd5823b9 in ?? ()
#30 0x00000a89dd582411 in ?? ()
#31 0x00000a89dd582451 in ?? ()
#32 0x00007fffffff8da0 in ?? ()
#33 0x00001729be648f00 in ?? ()
#34 0x000005d6347022e1 in ?? ()
#35 0x000005d6347022e1 in ?? ()
#36 0x00000a89dd5824f9 in ?? ()
#37 0x00000a89dd582539 in ?? ()
#38 0x00007fffffff8df0 in ?? ()
#39 0x00001729be635ed7 in ?? ()
#40 0x000005d6347022e1 in ?? ()
#41 0x000005d6347022e1 in ?? ()
#42 0x0000000000000003 in ?? ()
#43 0x0000000002415398 in ?? ()
---Type to continue, or q to quit---
#44 0x00007fffffff8ef0 in ?? ()
#45 0x0000000000000003 in ?? ()
#46 0x00000a89dd582589 in ?? ()
#47 0x00000a89dd5825c9 in ?? ()
#48 0x00007fffffff8e28 in ?? ()
#49 0x00001729be58c5a3 in ?? ()
#50 0x000005d6347022e1 in ?? ()
#51 0x0000000000000000 in ?? ()