Native crash when producer is closed #7
Comments
Do you have a big test case you can send me a gist of? If not I can just try to write one myself from your description. It looks like RdKafka::Topic::create needs to use the thread lock in case we are currently disconnecting. I'll fix this at the native level with some JS level checks to make sure it isn't throwing an unnecessary unhandled exception.
@webmakersteve my 'big test case' is the tests for the https://github.com/wikimedia/change-propagation repo that I'm currently trying to upgrade to use your client, but it's a very messy WIP right now, and I don't think you will be able to run it in its current state. I can push my WIP work and give you instructions on how to run it if you want, but honestly I think you will just waste your time setting it up, because even once it's set up, the crash reproduces quite rarely. It's a data race after all.
@webmakersteve Got it! Here's a small test:
const Kafka = require('./lib');
const producer = new Kafka.Producer({
'metadata.broker.list': 'localhost:9092'
});
producer.connect(undefined, () => {
producer.disconnect(() => {
console.log('Disconnected');
});
while (true) {
producer.Topic('test', {});
}
});
It crashes repeatedly.
Reproduced. I'd like to fix this issue regardless, but the way I would recommend producing to topics anyway is to create the topics once and handle them yourself, e.g.:
producer.connect(undefined, (err) => {
if (err) {
return;
}
var kafkaTopic = producer.Topic('test', {});
for (var i = 0; i < 2000; i++) {
producer.produce({ topic: kafkaTopic, message: new Buffer('message') });
}
});
And you can do that for whatever topics you'd like. It used to create a topic every time produce was called, which was a waste of memory and CPU. So I changed topics to be wrapped objects so that they would be inaccessible at the same time as their connections (generally because they would be in the same scope). Looking at this problem, I realize I'll need some way to invalidate topics when the consumer or producer is disconnected, unless I just rely on JavaScript to guard it. I would probably need a callback that the
A situation like your test case, where you are manually initializing a topic after you have called disconnect, would likely be a programming error. It shouldn't segfault anyway (and that's why it should be fixed), but I think if you change up how you do it you can probably get around the problem until I refactor it a bit.
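One way the "invalidate topics on disconnect" idea could look at the JS level is sketched below. This is only an illustration under assumptions: `SketchClient` and `SketchTopic` are hypothetical names, not node-rdkafka classes, and no native code is involved. The client remembers the topic handles it has handed out and marks them unusable when it disconnects, so a stale handle produces a JS error instead of reaching native code.

```javascript
// Hypothetical sketch: none of these classes exist in node-rdkafka.
class SketchTopic {
  constructor(name) {
    this.name = name;
    this._valid = true;
  }

  // Called by the owning client when it disconnects.
  invalidate() {
    this._valid = false;
  }

  // Throws a regular JS error instead of letting a stale handle be used.
  assertUsable() {
    if (!this._valid) {
      throw new Error(`Topic "${this.name}" belongs to a disconnected client`);
    }
  }
}

class SketchClient {
  constructor() {
    this._connected = true;
    this._topics = new Set();
  }

  // Refuses to create topics once disconnect has been requested.
  Topic(name) {
    if (!this._connected) {
      throw new Error('Client is not connected');
    }
    const topic = new SketchTopic(name);
    this._topics.add(topic);
    return topic;
  }

  // Checks the topic handle before doing any (here: pretend) work.
  produce(topic, message) {
    topic.assertUsable();
    return message.length;
  }

  // Invalidates every topic handed out so far.
  disconnect() {
    this._connected = false;
    for (const topic of this._topics) {
      topic.invalidate();
    }
    this._topics.clear();
  }
}
```

With something like this, both creating a topic after disconnect and reusing a topic created before it would fail with a catchable error.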
Yeah, sure, it's not a valid situation of course, and if it threw an error or something instead of segfaulting I wouldn't have created this ticket. A segfault is just not acceptable as a consequence of any programming error :)
Absolutely! And that's why I'm going to fix it! Haha.
Fixes test case (as far as I can tell) for Issue #7
I believe this fixes that particular test case: https://github.com/Blizzard/node-rdkafka/tree/topic-connection-check Do you want to try that and ensure it also works for you? It still doesn't solve the problem of "What if I create a topic and then try to use that topic after I have disconnected?" That will need to be handled in a special way.
@webmakersteve I've already changed stuff in my WIP, so it doesn't do this stupid stuff that was causing the crash any more, but looking at your commit, that should work fine. Thank you
Closing this issue. Reopen it if you find it not working :)
Here's a relevant part of the crash report:
Unfortunately I couldn't create a small test case to reliably reproduce, but the cause seems to be quite clear. When produce and disconnect calls are placed with very unlucky timing, we've got a race:
1. disconnect called, ProducerDisconnect work is placed.
2. produce called: since we're updating _isConnected only after the disconnect has happened, the produce call goes to maybeTopic and then to native code and gets all the way till here
I think the easiest solution is to update the JS _isConnected property before the actual disconnect happens. Then all these racy code paths will be protected. However, maybe it's better to invest time in fixing those races on the native level; not sure which path you want to choose.
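The JS-level fix proposed above could be sketched roughly as follows. This is a minimal illustration, not the library's actual code: `FakeNativeClient` is a hypothetical stand-in for the native binding (it tears down asynchronously via `setImmediate` to imitate queued disconnect work), and `GuardedProducer` is a made-up wrapper. Only the `_isConnected` name comes from the report.

```javascript
// Hypothetical stand-in for the native binding; NOT node-rdkafka's real API.
class FakeNativeClient {
  constructor() {
    this.handleAlive = true;
  }

  // Imitates disconnect work that is queued and completes later.
  disconnect(callback) {
    setImmediate(() => {
      this.handleAlive = false;
      callback(null);
    });
  }

  // In real native code, using a torn-down handle is where a crash would occur.
  produce(topic, message) {
    if (!this.handleAlive) {
      throw new Error('native handle used after teardown');
    }
    return true;
  }
}

// Hypothetical wrapper showing the ordering suggested in the report.
class GuardedProducer {
  constructor(nativeClient) {
    this._native = nativeClient;
    this._isConnected = true;
  }

  disconnect(callback) {
    // Flip the flag BEFORE queuing the native work, so any produce call
    // arriving while teardown is pending is rejected in JavaScript.
    this._isConnected = false;
    this._native.disconnect(callback);
  }

  produce(topic, message) {
    if (!this._isConnected) {
      throw new Error('Producer not connected');
    }
    return this._native.produce(topic, message);
  }
}
```

In this sketch a produce call issued right after disconnect fails with a catchable 'Producer not connected' error and never reaches the native layer, even though the native teardown has not finished yet. It does not address races inside the native code itself.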