Use a better strategy on Windows for main thread detection #4006

notgull · 2024-11-23T03:41:26Z

We can get a list of the threads in the process, and determine which
thread came first. This thread will be the one who called the main
function. So use this instead of the current strategy if it's available.
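
For readers who want the idea in concrete form, here is a minimal sketch of "pick the thread that came first". It is not the code from this PR: `ThreadInfo`, its fields, and `earliest_thread_id` are hypothetical names, and using a creation timestamp as the ordering key is an assumption. The filter on the owning process is there because a toolhelp thread snapshot lists the threads of every process on the system.

```rust
/// Hypothetical record for one thread taken from a system-wide snapshot.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ThreadInfo {
    thread_id: u32,
    owner_process_id: u32,
    /// Creation timestamp in arbitrary increasing units (assumed ordering key).
    creation_time: u64,
}

/// Returns the ID of the earliest-created thread owned by `process_id`,
/// or `None` if the snapshot holds no thread for that process.
fn earliest_thread_id(threads: &[ThreadInfo], process_id: u32) -> Option<u32> {
    threads
        .iter()
        // The snapshot covers the whole system, so keep only our own threads.
        .filter(|t| t.owner_process_id == process_id)
        // The oldest surviving thread is taken to be the main thread.
        .min_by_key(|t| t.creation_time)
        .map(|t| t.thread_id)
}
```

Returning `Option` keeps the "no answer" case explicit, so the caller decides what to do when detection is unavailable.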

This aims to resolve #3999. @PJB3005 would you be able to test this in the dylib configuration?

Tested on all platforms changed
Added an entry to the changelog module if knowledge of this change could be valuable to users
Updated documentation to reflect any user-facing changes, including notes of platform-specific behavior
Created or updated an example program if it would help users understand this functionality
Updated feature matrix, if new features were added or implemented

madsmtm

IMO this is all a bit much for just a debug check to make developing on other platforms easier...

Why, again, isn't this just a warning that can be turned off?

notgull · 2024-11-23T03:56:28Z

> Why, again, isn't this just a warning that can be turned off?

It's to make platforms consistent. Ideally every platform would act the exact same way, to prevent differences between platforms from becoming too large. Not to mention, it can be turned off via the any_thread method.

PJB3005 · 2024-11-23T14:51:16Z

src/platform_impl/windows/event_loop.rs

+        let mut slot = mem::MaybeUninit::uninit();
+        let result = if self.first_entry {
+            self.first_entry = false;
+            unsafe { toolhelp::Thread32First(self.handle.as_raw_handle() as _, slot.as_mut_ptr()) }
+        } else {
+            unsafe { toolhelp::Thread32Next(self.handle.as_raw_handle() as _, slot.as_mut_ptr()) }
+        };


You need to initialize the dwSize field of THREADENTRY32. Right now this call just errors and no threads get iterated.
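
A minimal sketch of the initialization being asked for, assuming nothing about how the PR ended up fixing it. `ThreadEntry32` is a local mirror of the Win32 `THREADENTRY32` layout, declared here only so the snippet compiles on any platform; real code would use the struct from its Windows bindings crate. `new_thread_entry` is a hypothetical helper name.

```rust
use std::mem;

/// Local stand-in mirroring the field layout of Win32 `THREADENTRY32`.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
#[allow(non_snake_case)]
struct ThreadEntry32 {
    dwSize: u32,
    cntUsage: u32,
    th32ThreadID: u32,
    th32OwnerProcessID: u32,
    tpBasePri: i32,
    tpDeltaPri: i32,
    dwFlags: u32,
}

/// Builds an entry with `dwSize` filled in, as `Thread32First` requires.
fn new_thread_entry() -> ThreadEntry32 {
    // SAFETY: every field is a plain integer, so all-zero bits are a valid value.
    let mut entry: ThreadEntry32 = unsafe { mem::zeroed() };
    // The API reads this field to validate the struct before writing to it.
    entry.dwSize = mem::size_of::<ThreadEntry32>() as u32;
    entry
}
```

Starting from a zeroed value instead of `MaybeUninit::uninit()` also means no field is ever read while uninitialized.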

PJB3005 · 2024-11-23T15:15:14Z

Yeah this seems to work and is probably watertight enough. Though I have to point out it's possible for it to false negative if you terminate the real main thread by doing something like calling ExitThread() manually on it, then it might think another thread is the main thread. (https://devblogs.microsoft.com/oldnewthing/20100827-00/?p=13023)

I do still want to point out though that this really should be put behind cfg(debug_assertions). This is a lot of code to be running for a developer debug assert and there is no good reason to check any of this on release builds. Best case scenario you save some modding community or ~user with weird technical issue~ 5 years in the future.
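
One way the suggestion could look, as a sketch only: `NotMainThread` and `check_main_thread` are hypothetical names, and the `any_thread` flag stands in for the opt-out mentioned earlier in the thread. The detection is passed as a closure so that release builds never run it.

```rust
/// Hypothetical error for creating the event loop off the main thread.
#[derive(Debug, PartialEq, Eq)]
struct NotMainThread;

/// Runs the main-thread check only in builds with debug assertions enabled.
fn check_main_thread(
    is_main_thread: impl FnOnce() -> bool,
    any_thread: bool,
) -> Result<(), NotMainThread> {
    // `cfg!(debug_assertions)` is a compile-time constant. When it is false,
    // `&&` short-circuits and the (possibly expensive) closure is never called.
    if cfg!(debug_assertions) && !any_thread && !is_main_thread() {
        return Err(NotMainThread);
    }
    Ok(())
}
```

With this shape, the thread enumeration would only cost anything on developer builds, which is the trade-off being argued for here.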

PJB3005 · 2024-11-23T15:18:12Z

src/platform_impl/windows/event_loop.rs

+    fn new(process_id: u32) -> Result<Self, EventLoopError> {
+        // Take a snapshot.
+        let handle =
+            unsafe { toolhelp::CreateToolhelp32Snapshot(toolhelp::TH32CS_SNAPTHREAD, process_id) };


You can pass 0 to CreateToolhelp32Snapshot; there's no need to pass the current process ID.

We use the process ID in other places, so I figure why not.

According to the docs:

> This parameter can be zero to indicate the current process.

So it's not really necessary.

I figure it's slightly clearer to whoever is reading who may not be familiar with this API.

notgull · 2024-11-23T17:19:58Z

> Though I have to point out it's possible for it to false negative if you terminate the real main thread by doing something like calling ExitThread() manually on it, then it might think another thread is the main thread.

This is weird enough and requires enough unsafe code that I'm fine if someone gets around it like this.

> I do still want to point out though that this really should be put behind cfg(debug_assertions). This is a lot of code to be running for a developer debug assert and there is no good reason to check any of this on release builds. Best case scenario you save some modding community or ~~user with weird technical issue~~ 5 years in the future.

Like I said, all platforms should have uniform behavior. Differences between platforms should be considered a bug, save for features not being supported.

We can get a list of the threads in the process, and determine which thread came first. This thread will be the one who called the main function. So use this instead of the current strategy if it's available.

Signed-off-by: John Nunley <dev@notgull.net>

PJB3005 · 2024-11-23T22:33:44Z

> Like I said, all platforms should have uniform behavior. Differences between platforms should be considered a bug, save for features not being supported.

I understand that, however the only reason "consistency between platforms" is even being enforced here is for development reasons.

There is no situation where library consumers ever want this code to be running on a release build. Best case scenario the bomb never goes off. Worst case scenario there is another scenario that can break our assumption, and it blows up in somebody's face on a shipped game 5 years in the future. Maybe that's triggered by some compat layer like Wine. Maybe some modder is trying to do something funny. Regardless, this check serves no positive purpose once the application leaves the developer's machine, and that is best ensured by removing it.

As a developer I myself would not like this kind of code to be present when I am shipping a game. There is no good reason to have this risk factor in a production-grade library. "You can turn it off with any_thread" sure, but how many developers are aware of this? The fact is that devs trust libraries like this to not put landmines in their application in the first place.

Think of this with a thought experiment:

Would you have agreed to put it behind cfg(debug_assertions) if we were not able to improve the detection over the original method, which we know is flawed in practice?
Can we prove that this current check is watertight and will never malfunction in the future? In 5 years? 10? 15?

By removing this you're reducing moving parts, you're removing unnecessary work (seriously, why would my game, on startup on any person's PC, seriously need to iterate all threads in the OS?), you're just reducing the chance of things going wrong.

PJB3005

I also think keeping main_thread_id_via_crt at this point is pointless. I assume we can envision no scenario in which main_thread_id_via_snapshot will ever fail, but we do know scenarios in which main_thread_id_via_crt fails. Because of this we should at least remove it, so that if main_thread_id_via_snapshot fails we do not end up back with the original issue unaddressed.
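
A rough sketch of what dropping the fallback could mean in practice. This is an assumption about the intended behavior, not something stated above: here a failed snapshot leaves the main thread "unknown" and the check is skipped, rather than falling back to the CRT-based method. `MainThread`, `resolve_main_thread`, and `is_violation` are hypothetical names.

```rust
/// Result of trying to identify the main thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MainThread {
    Known(u32),
    Unknown,
}

/// Uses only the snapshot result; there is deliberately no CRT-based fallback.
fn resolve_main_thread(snapshot_result: Option<u32>) -> MainThread {
    match snapshot_result {
        Some(id) => MainThread::Known(id),
        None => MainThread::Unknown,
    }
}

/// Reports a violation only when the main thread is known and differs
/// from the current thread; an unknown main thread never triggers it.
fn is_violation(current_thread_id: u32, main: MainThread) -> bool {
    matches!(main, MainThread::Known(id) if id != current_thread_id)
}
```

The point of the `Unknown` variant is that a detection failure degrades to "no check" instead of reintroducing the false positives from #3999.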

notgull mentioned this pull request Nov 23, 2024

Make Windows main thread checks more forgiving #4003

Open

5 tasks

notgull force-pushed the notgull/winmain branch from 12fd29b to 849a8df Compare November 23, 2024 03:55

notgull requested a review from kchibisov as a code owner November 23, 2024 03:55

notgull force-pushed the notgull/winmain branch from 849a8df to 0767a5f Compare November 23, 2024 17:18

notgull force-pushed the notgull/winmain branch from 0767a5f to 9c9dde9 Compare November 23, 2024 17:20
