Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segfault in e2e/groups.spec.js #164

Open
trevorr opened this issue Nov 16, 2024 · 1 comment
Open

Segfault in e2e/groups.spec.js #164

trevorr opened this issue Nov 16, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@trevorr
Copy link

trevorr commented Nov 16, 2024

Environment Information

  • OS: MacOS 15.1
  • Node Version: 20.18
  • NPM Version: 10.18.2
  • C++ Toolchain: llvm
  • confluent-kafka-javascript version: 377e22a

Steps to Reproduce

$ npx mocha --timeout 120000 e2e/groups.spec.js

  Consumer group/Producer
[1]    13601 segmentation fault  npx mocha --timeout 120000 e2e/

Here's the stack trace:

llnode -- node ./node_modules/.bin/mocha  --timeout 120000 e2e/groups.spec.js
(lldb) target create "node"
Current executable set to '/opt/node-debug/bin/node' (arm64).
(lldb) settings set -- target.run-args  "./node_modules/.bin/mocha" "--timeout" "120000" "e2e/groups.spec.js"
(lldb) plugin load '/Users/trevor/.nvm/versions/node/v20.18.0/lib/node_modules/llnode/llnode.dylib'
(lldb) settings set prompt '(llnode) '
(llnode) r
Process 11240 launched: '/opt/node-debug/bin/node' (arm64)


  Consumer group/Producer
Process 11240 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000128807018 confluent-kafka-javascript.node`Nan::Callback::Call_(v8::Isolate*, v8::Local<v8::Object>, int, v8::Local<v8::Value>*, Nan::AsyncResource*) const [inlined] v8::Local<v8::Function>::New(isolate=0x0000000140008000, that=0x0000000000000000) at v8-local-handle.h:287:68 [opt]
   284 	
   285 	  V8_INLINE static Local<T> New(Isolate* isolate,
   286 	                                const PersistentBase<T>& that) {
-> 287 	    return New(isolate, internal::ValueHelper::SlotAsValue<T>(that.val_));
   288 	  }
   289 	
   290 	  V8_INLINE static Local<T> New(Isolate* isolate,
warning: confluent-kafka-javascript.node was compiled with optimization - stepping may behave oddly; variables may not be available.
(llnode) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000128807018 confluent-kafka-javascript.node`Nan::Callback::Call_(v8::Isolate*, v8::Local<v8::Object>, int, v8::Local<v8::Value>*, Nan::AsyncResource*) const [inlined] v8::Local<v8::Function>::New(isolate=0x0000000140008000, that=0x0000000000000000) at v8-local-handle.h:287:68 [opt]
    frame #1: 0x0000000128807018 confluent-kafka-javascript.node`Nan::Callback::Call_(v8::Isolate*, v8::Local<v8::Object>, int, v8::Local<v8::Value>*, Nan::AsyncResource*) const [inlined] v8::Local<v8::Function> Nan::New<v8::Function, v8::NonCopyablePersistentTraits<v8::Function>>(p=0x0000000000000000) at nan_implementation_12_inl.h:422:10 [opt]
    frame #2: 0x0000000128807014 confluent-kafka-javascript.node`Nan::Callback::Call_(this=0x0000000000000000, isolate=<unavailable>, target=(val_ = 0x000000013800aee8), argc=4, argv=0x000000016fdf6968, resource=0x000000016fdf68a0) const at nan.h:1880:36 [opt]
    frame #3: 0x0000000128802fac confluent-kafka-javascript.node`Nan::Callback::Call(this=0x0000000000000000, argc=4, argv=0x000000016fdf6968) const at nan.h:1824:25 [opt]
    frame #4: 0x0000000128825a54 confluent-kafka-javascript.node`NodeKafka::Workers::KafkaConsumerConsumeLoop::HandleMessageCallback(this=0x000000014f708810, msg=0x00006000023a8d80, ec=<unavailable>) at workers.cc:775:13 [opt]
    frame #5: 0x0000000128829e90 confluent-kafka-javascript.node`NodeKafka::Workers::MessageWorker::WorkMessage(this=0x000000014f708810) at workers.h:110:7 [opt]
    frame #6: 0x0000000100b88d08 node`uv__async_io(loop=0x00000001041d4c90, w=<unavailable>, events=<unavailable>) at async.c:176:5 [opt]
    frame #7: 0x0000000100b9b9b0 node`uv__io_poll(loop=0x00000001041d4c90, timeout=0) at kqueue.c:381:9 [opt]
    frame #8: 0x0000000100b89270 node`uv_run(loop=0x00000001041d4c90, mode=UV_RUN_DEFAULT) at core.c:447:5 [opt]
    frame #9: 0x0000000100005818 node`node::SpinEventLoopInternal(env=0x0000000138018c00) at embed_helpers.cc:41:7 [opt]
    frame #10: 0x00000001001347ac node`node::NodeMainInstance::Run(this=<unavailable>, exit_code=0x000000016fdff184, env=0x0000000138018c00) at node_main_instance.cc:124:9 [opt]
    frame #11: 0x00000001001344b8 node`node::NodeMainInstance::Run(this=<unavailable>) at node_main_instance.cc:100:3 [opt]
    frame #12: 0x00000001000b02c4 node`node::Start(int, char**) [inlined] node::StartInternal(argc=<unavailable>, argv=<unavailable>) at node.cc:1502:24 [opt]
    frame #13: 0x00000001000b01d4 node`node::Start(argc=<unavailable>, argv=<unavailable>) at node.cc:1509:27 [opt]
    frame #14: 0x00000001863b4274 dyld`start + 2840

(This even occurs with a fix for #161.)

@milindl
Copy link
Contributor

milindl commented Nov 18, 2024

Thank you for the report. I can't reproduce it, but looking at valgrind and the stacktrace, I think it's the same one as #34 . As far as I understand, it's because we're sending the message from the consume loop thread, to the event loop thread, but by the time it makes to the event loop thread, the callback it's supposed to call is destroyed already.

It seems to be specific to the consume() loop in the non-promisified API.

I'll keep both around until I'm certain that they are the same.

@milindl milindl added the bug Something isn't working label Nov 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants