Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Futex Rework #1098

Merged
merged 13 commits into from
Aug 23, 2024
Merged

Futex Rework #1098

merged 13 commits into from
Aug 23, 2024

Conversation

danielschemmel
Copy link
Contributor

@danielschemmel danielschemmel commented Jul 26, 2024

This PR splits futex into a zoo of futex_* functions with different and correctly typed arguments.

There are still a few open issues and questions:

  1. The futex syscall treats one of its arguments as either a nullable *const Timespec or a u32, depending on the context. The futex_time64 fallback to futex_old uses NOSYS as its deciding factor, which can also be generated in other ways. This may lead to futex_old dereferencing an invalid pointer (because it is null, or not a pointer in the first place). As we now have different entrypoints that know whether the value should be a nullable pointer or a u32, I would suggest splitting up the backend futex functions into two variants (each) with different signatures. This would prevent a fallback to futex_old happening when the argument is not a pointer. Adding the null check is then trivial.
  2. There are two flags that may sometimes change the function behavior. The first flag, FUTEX_PRIVATE_FLAG basically just says whether the lock(s) are guaranteed to be process-private (for some kernel optimizations). It is available for every single function. The second flag FUTEX_CLOCK_REALTIME changes how the timeout is interpreted in a few functions. Removing those flags by way of introducing more functions is viable, but I am unsure if it really makes sense to add that many more functions. Since their meaning is very different, I would suggest splitting them into two enums (FutexPrivacy::Shared and FutexPrivacy::ProcessPrivate, and FutexClock::Realtime and FutexClock::Monotone) and making them two arguments. It might even make sense to take them as compile time parameters. What do you think?
  3. There are (already) a lot of functions. Instead of having them be global functions, there could be a struct Futex<'a>(&'a AtomicU32); (and similarly FutexPI), which then could show only the functions applicable to that scenario. E.g., Futex(&word).cmp_requeue(...) and FutexPI(&word).cmp_requeue(...) would call futex_cmp_requeue and futex_cmp_requeue_pi respectively without polluting each other's namespace. While this would make the API a bit nicer, I fear it would suggest that it does more than just the syscalls. What do you prefer?
  4. While I have not done so right now, it should be possible to retain the old futex API, so as to be semver compatible - ideally with a #[deprecated]. Would this be important to you? If so, should FutexOperations be retained in a way that does not add any new variants?

This PR implements #214 and one point from #753 (futex's first argument should be Atomic) and builds on the changes from #1097.

@danielschemmel danielschemmel force-pushed the futex-rework branch 3 times, most recently from a515d21 to 89dac89 Compare July 26, 2024 23:38
@notgull
Copy link
Contributor

notgull commented Jul 27, 2024

It might be better to try this against the 1.0-staging branch, which already has some breaking changes.

@danielschemmel
Copy link
Contributor Author

My bad, would it make more sense to ensure there are no breaking changes, or to target 1.0-staging?

Copy link
Member

@sunfishcode sunfishcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! I'll need more time to fully review this, but at a first glance this looks like a good direction.

  • The futex syscall treats one of its arguments as either a nullable *const Timespec or a u32, depending on the context. The futex_time64 fallback to futex_old uses NOSYS as its deciding factor, which can also be generated in other ways. This may lead to futex_old dereferencing an invalid pointer (because it is null, or not a pointer in the first place). As we now have different entrypoints that know whether the value should be a nullable pointer or a u32, I would suggest splitting up the backend futex functions into two variants (each) with different signatures. This would prevent a fallback to futex_old happening when the argument is not a pointer. Adding the null check is then trivial.

This sounds like a good idea to me.

  • There are two flags that may sometimes change the function behavior. The first flag, FUTEX_PRIVATE_FLAG basically just says whether the lock(s) are guaranteed to be process-private (for some kernel optimizations). It is available for every single function. The second flag FUTEX_CLOCK_REALTIME changes how the timeout is interpreted in a few functions. Removing those flags by way of introducing more functions is viable, but I am unsure if it really makes sense to add that many more functions. Since their meaning is very different, I would suggest splitting them into two enums (FutexPrivacy::Shared and FutexPrivacy::ProcessPrivate, and FutexClock::Realtime and FutexClock::Monotone) and making them two arguments. It might even make sense to take them as compile time parameters. What do you think?

Could you say more about what the advantage of this would be over the current FutexFlags approach?

  • There are (already) a lot of functions. Instead of having them be global functions, there could be a struct Futex<'a>(&'a AtomicU32); (and similarly FutexPI), which then could show only the functions applicable to that scenario. E.g., Futex(&word).cmp_requeue(...) and FutexPI(&word).cmp_requeue(...) would call futex_cmp_requeue and futex_cmp_requeue_pi respectively without polluting each other's namespace. While this would make the API a bit nicer, I fear it would suggest that it does more than just the syscalls. What do you prefer?

I like how global functions makes it clear that they're "just functions"; as you say, it also feels like a wrapper type might suggest it does more. For namespacing purposes, what if we created a new pub mod futex and put all these new functions in it? Perhaps we could even drop the futex_ prefix from them, so users could do futex::wait instead of futex_wait.

  • While I have not done so right now, it should be possible to retain the old futex API, so as to be semver compatible - ideally with a #[deprecated]. Would this be important to you? If so, should FutexOperations be retained in a way that does not add any new variants?

Yes, I think it would be good to retain the old API and remain semver-compatible. A #[deprecated] makes sense to me too. I don't have an opinion about whether to hide new variants in FutexOperations.

src/backend/libc/thread/syscalls.rs Outdated Show resolved Hide resolved
src/backend/linux_raw/thread/syscalls.rs Outdated Show resolved Hide resolved
@danielschemmel danielschemmel force-pushed the futex-rework branch 5 times, most recently from ccd636f to 6012fad Compare July 27, 2024 19:49
@danielschemmel
Copy link
Contributor Author

All of @sunfishcode's suggestions should be implemented - thanks!

  • futex is now a deprecated function as well as a module that contains functions like wake and cmp_requeue_pi.
  • the backends are split into futex_val2 and futex_timespec, which should both handle the argument without potentially segfaulting.
    • the futex functions now performs dynamic dispatch to pick the right backend

As to splitting the PRIVATE and CLOCK flags:
The PRIVATE flag applies to each function, as it is a property of the futex word, while the CLOCK flag applies to the timeout parameter that only 5 out of 14 functions even take.
The drawback is increasing the number of arguments to 5 out of 14 functions by one.

@danielschemmel
Copy link
Contributor Author

I have added some tests (and fixed a stupid bug), and cleaned up the naming a bit.

One thing I also noticed is that the new functions probably don't really need any unsafe anymore: as they now take and return safe types, they can only really cause a deadlock - which a normal rust function with a loop { } can just as well.

@danielschemmel danielschemmel force-pushed the futex-rework branch 2 times, most recently from 7898d3f to 41ed719 Compare August 4, 2024 15:26
@sunfishcode
Copy link
Member

One thing I also noticed is that the new functions probably don't really need any unsafe anymore: as they now take and return safe types, they can only really cause a deadlock - which a normal rust function with a loop { } can just as well.

This sounds right. Deadlocks are not a safety issue in Rust. At the moment it appears functions like wait are still unsafe in this PR. Are you planning to remove the unsafe?

@danielschemmel
Copy link
Contributor Author

One thing I also noticed is that the new functions probably don't really need any unsafe anymore: as they now take and return safe types, they can only really cause a deadlock - which a normal rust function with a loop { } can just as well.

This sounds right. Deadlocks are not a safety issue in Rust. At the moment it appears functions like wait are still unsafe in this PR. Are you planning to remove the unsafe?

Yes, I'll do so right away, I just wanted to have your input before removing the unsafe.

@danielschemmel
Copy link
Contributor Author

Done.

One more note, libc includes FUTEX_WAITERS/FUTEX_OWNER_DIED as of 0.2.158, so if you are OK with updating the dependency, we can also use that in the libc backend.

@sunfishcode
Copy link
Member

Yes, updating the libc dependency to the latest 0.2 is fine.

@danielschemmel
Copy link
Contributor Author

Ah well, turns out libc's "linux-like" and rustix's "linux_kernel" are just different enough that this fails on android.

@sunfishcode
Copy link
Member

Looks good, thanks!

@sunfishcode sunfishcode merged commit b0d3681 into bytecodealliance:main Aug 23, 2024
43 checks passed
@sunfishcode
Copy link
Member

This is released in rustix 0.38.35.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants