Skip to content
This repository has been archived by the owner on Mar 7, 2023. It is now read-only.

Fixed a crash after stopMonitoring on Linux in Electron #138

Closed
wants to merge 1 commit into from

Conversation

antelle
Copy link

@antelle antelle commented Apr 9, 2021

Hi! 👋

Thanks for the library, it's awesome and seems to be much more stable compared to alternatives! 🎉

The library works like this (simplified pseuducode):

start() {
    isRunning = true;
    fd = udev_monitor_get_fd();
    start_polling();
}
stop() {
    isRunning = false;
    destroy_udev();  <-- destroys fd
}
poll() {
    while (isRunning) {
        poll(fd, 100ms);
    }
}

When we call stop, the device fd is destroyed because we destroy udev handle, however if we're still using this fd in poll, it causes SIGSEGV in Electron. It doesn't repeat in node.js, no idea why.

Simple PoC:

const { app, BrowserWindow } = require('electron');

app.whenReady().then(() => {
    const win = new BrowserWindow({ width: 800, height: 600 });
    win.loadURL('https://example.com');

    const mod = require('.');
    
    mod.startMonitoring();
    console.log('started');
    setTimeout(() => {
        mod.stopMonitoring();
        console.log('stopped');
        app.quit();
    }, 1000);
});

Compile it with:

npm i electron-rebuild
node_modules/.bin/electron-rebuild --version 12.0.0 --only .
node_modules/.bin/electron electron-test.js

Result (not always, repeat it several times):

started
stopped
~/node-usb-detection/node_modules/electron/dist/electron exited with signal SIGSEGV

This change moves destroying of the library handle to the callback that's called after "work" is complete.

While this is not critical because stopMonitoring must be called before exit, it's better if the app quits correctly instead of crashing.

I've tested only a simple case and I assume, this needs a bit more extensive testing.

@dfischer23
Copy link

when i use stopMonitoring() from within plain nodejs (17.1.0) on my Arch linux (systemd), i get:
Assertion 'm' failed at src/libsystemd/sd-device/device-monitor.c:443, function device_monitor_receive_device(). Aborting.

This PR fixes it, allowing me to cleanly quit my program. Please merge and release!

Copy link
Owner

@MadLittleMods MadLittleMods left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution @antelle! Sorry for the delay 🙇

Seems like simple enough change and appreciate the context in the description. Just asking a few more questions to understand this better as there is a lot of "what" changed but not "why"

@@ -244,6 +241,9 @@ static void cbWork(uv_work_t *req) {
}

static void cbAfter(uv_work_t *req, int status) {
udev_monitor_unref(mon);
udev_unref(udev);
Copy link
Owner

@MadLittleMods MadLittleMods Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't we just move these above uv_mutex_destroy in Stop? It seems like it would have the same effect.

I assume there is nuance between cbAfter vs cbTerminate that we want this difference. Why don't we want to call udev_unref(udev); in cbTerminate?

Copy link
Contributor

@Julusian Julusian Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from some googling, I think the flow is that after cbWork(running on the threadpool) has finished, cbAfter(run on the event loop) will be called.
cbTerminate will be called (I assume on the event loop) when the program wishes to terminate.

It sounds to me like cbTerminate wants to simply set isRunning to false, and let cbAfter do all the cleanup.
I think the problem is that the poll loop is in the middle of a poll at the point where udev_monitor_unref(mon); is called, causing undefined behaviour

A follow up question is why does cbAfter/cbTerminate do the cleanup instead of doing it at the end of cbWork after the loop has stopped?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds to me like cbTerminate wants to simply set isRunning to false, and let cbAfter do all the cleanup.
I think the problem is that the poll loop is in the middle of a poll at the point where udev_monitor_unref(mon); is called, causing undefined behaviour

@Julusian This sounds reasonable 👍

A follow up question is why does cbAfter/cbTerminate do the cleanup instead of doing it at the end of cbWork after the loop has stopped?

This also seems reasonable as long as those clean-up functions can be called there 🤷

Appreciate the context and confidence here ❤️

udev_monitor_unref(mon);
udev_unref(udev);

isRunning = false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an exact behavior we're going for by moving this below (please explain and document)? Or is just stylistic preference?

I could see it maybe mattering if the uv_xxx stuff takes time to run and Stop is called multiple times, in the previous code, we wouldn't do all of the cleanup multiple times.

Copy link
Contributor

@Julusian Julusian Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah I agree that changing isRunning at the top makes more sense, as we want the loop to exit asap, and not start another iteration while we start cleaning up

@@ -89,16 +89,13 @@ void Stop() {
return;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone wants to submit a new PR with the review addressed and tested, we can move this forward more. Credit still goes to all involved (@antelle 🙇).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just adding a reference here, too. I think I've made the requested changes: #162

@MadLittleMods
Copy link
Owner

Closing as #162 has merged these changes 🚀

Everyone mentioned in the upcoming 4.14.0 changelog and will be released once I figure out why the GitHub Actions for the Windows builds aren't working, #164

Thanks for the contribution @antelle @umbernhard @Julusian! 🐙

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants