Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to const-construct a sem_t #2015

Open
Diggsey opened this issue Dec 31, 2020 · 16 comments
Open

Provide a way to const-construct a sem_t #2015

Diggsey opened this issue Dec 31, 2020 · 16 comments
Labels
C-API-request Category: API request

Comments

@Diggsey
Copy link

Diggsey commented Dec 31, 2020

(mem::zeroed() is never going to be const, and MaybeUninit::zeroed().assume_init() is still not const)

@Diggsey Diggsey added the C-API-request Category: API request label Dec 31, 2020
@djarek
Copy link
Contributor

djarek commented Jan 2, 2021

POSIX allows semaphores to be implemented using file descriptors. Unless I'm missing something, this means that there's no portable way of doing const initialization of sem_t (even in C and C++).

@Diggsey
Copy link
Author

Diggsey commented Jan 2, 2021

You would still need to call sem_init before you start using the semaphore. This is just to allow declaring a static variable of the appropriate type.

@djarek
Copy link
Contributor

djarek commented Jan 3, 2021

What's wrong with this?

static mut sem: MaybeUninit<libc::sem_t> = MaybeUninit::uninit();

And to initialize it you use:

let ret = unsafe { libc::sem_init(sem.as_mut_ptr(), 0, 0) };

Obviously, it's not very nice to use, but I don't think that anything better can be provided, because sem_t doesn't seem to have any (public) sentinel values. So any value is garbage unless it came from sem_init call.

@Diggsey
Copy link
Author

Diggsey commented Jan 3, 2021

That is UB as soon as you have more than one thread accessing sem because as_mut_ptr takes a mutable borrow of self. To get this to work you actually need:

static mut sem: UnsafeCell<MaybeUninit<libc::sem_t>> = UnsafeCell::new(MaybeUninit::uninit());

(Note: static mut is still needed because the target is not Sync)

Furthermore, you cannot use the methods on MaybeUninit to get at the inner value at all, because they again take &mut self, so you have to do pointer casts or transmutes from *mut MaybeUninit<libc::sem_t> to *mut libc::sem_t on every access to the semaphore.

I don't think I need to explain why this is terrible...

@maxbla
Copy link
Contributor

maxbla commented Feb 21, 2021

@Diggsey It sounds like you actually want a static sem_t, not a const sem_t (you don't need one for the other). If so, this is the domain of lazy_static or once_cell. Specifically,

lazy_static! {
    static ref SEMAPHORE: libc::sem_t = {
        let mut semaphore: MaybeUninit<libc::sem_t> = MaybeUninit::uninit();
        libc::sem_init(semaphore.as_mut_ptr(), /*other args*/);
        unsafe {semaphore.assume_init()}
    };
}

std::mem::zeroed() might be a better option than MaybeUninit, since you could avoid the unsafe.

Hope that helps.

@Diggsey
Copy link
Author

Diggsey commented Feb 21, 2021

@maxbla No, that is just a hacky workaround that won't suffice for my use-case (accessing the semaphore from a signal handler).

Only certain operations on a sem_t are guaranteed to be async-signal-safe, crates like lazy_static and once_cell are not guaranteed to be async-signal-safe, and so should not be used from a signal handler.

There's no reason why sem_t cannot be const-constructible (it is in C/C++).

@maxbla
Copy link
Contributor

maxbla commented Feb 22, 2021

Yeah, rust doesn't generally play nicely with async signal safety. Luckily, transmute is const, so I think you can still accomplish what you're after without any additions to libc.

// AlignedSem has the same alignment and size as a sem_t
#[repr(C)]
union AlignedSem {
    array: [u8; size_of::<libc::sem_t>()],
    sem: libc::sem_t,
}

// particularly unsafe - subverts detection of use-before-initialize bugs
static mut MUTEX: libc::sem_t = unsafe {
    std::mem::transmute(AlignedSem{array: [0u8; size_of::<libc::sem_t>()]})
};

fn main() {
    unsafe {libc::sem_init(&mut MUTEX as *mut _, 0, 1)};
    // register signal handlers here
}

Rust really makes this hard - the need for the AlignedSem type is unfortunate.

@Diggsey
Copy link
Author

Diggsey commented Feb 22, 2021

unsafe {libc::sem_init(&mut MUTEX as *mut _, 0, 1)};

As mentioned, creating a mutable borrow to the static variable is UB, since it can be accessed by multiple threads.

@maxbla
Copy link
Contributor

maxbla commented Feb 22, 2021

You would still need to call sem_init before you start using the semaphore. This is just to allow declaring a static variable of the appropriate type.

I did this

mem::zeroed() is never going to be const, and MaybeUninit::zeroed().assume_init() is still not const

I made a const, zeroed sem_t in the static

As mentioned, creating a mutable borrow to the static variable is UB, since it can be accessed by multiple threads.

As I constructed it, that mutable borrow was the first line in main. As long as you call sem_init() before any threads are spawned, UB is not possible. Hey, while you're at the start of main, make a *mut libc::sem_t that you can alias instead of &muts (which is not UB).

let mutex_ptr = unsafe {&mut MUTEX as *mut _};
// using mutex_ptr might alias *mut, but that is allowed
let res = unsafe{libc::sem_init(mutex_ptr, 0, 1)};

sem_post() is async signal safe, so you can safely clone mutex_ptr into multiple threads and sem_post() it. I don't see that being too useful, so I have a feeling you want to to sem_getvalue, which isn't async signal safe.

Alternatively, use a thread_local! to avoid aliasing except by signal handlers.

It sounds like everything you asked for is possible, just the specific examples of code given to you were not suitable for your use case. If you either wrote a minimal example of what you're doing in C or provided an example of an api libc could add that would do what you need, this issue would be actionable.

@Diggsey
Copy link
Author

Diggsey commented Feb 22, 2021

@maxbla there is a workaround, which I presented here: #2015 (comment)

The idea of storing the *mut sem_t in a separate static mut is one I had not thought of - that would also work, and would eliminate the need for an UnsafeCell at least.

That said, the fact that these workarounds are so obscure, and that it's so easy to accidentally get UB, is reason enough to add a const-constructor to sem_t.

Furthermore, libc is supposed to provide the same functionality as is available in C/C++ and in those languages the sem_t type is const-constructible.

@maxbla
Copy link
Contributor

maxbla commented Feb 22, 2021

The idea of storing the *mut sem_t in a separate static mut is one I had not thought of - that would also work, and would eliminate the need for an UnsafeCell at least.

mutex_ptr can't be static since it isn't Sync. The best way is probably to use a static UnsafeCell<libc::sem_t> with the silly AlignedSem/transmute trick I used. For anyone reading just this comment, the final result is

// AlignedSem has the same alignment and size as a sem_t
#[repr(C)]
union AlignedSem {
    array: [u8; size_of::<libc::sem_t>()],
    sem: libc::sem_t,
}

// SAFETY: see std::mem::zeroed()
static mut MUTEX: UnsafeCell<libc::sem_t> = unsafe {
    let zeroed_sem = std::mem::transmute(AlignedSem{array: [0u8; size_of::<libc::sem_t>()]});
    UnsafeCell::new(zeroed_sem)
};

fn main() {
    unsafe {libc::sem_init(MUTEX.get(), 0, 1)};
    // register signal handlers here
}

libc is supposed to provide the same functionality as is available in C/C++ and in those languages the sem_t type is const-constructible.

libc is supposed to be a low-level binding to system types, constants, and functions. The fact that it provides the same functionality is merely a consequence of this. For example, C doesn't have a function to statically zero memory, it has a language feature - that all global and static variables are automatically initialized to zero, but also {0}, as in sem_t semaphore = {0}. By this criterion, libc should not include such a feature (but unsafe Rust should).

the fact that these workarounds are so obscure, and that it's so easy to accidentally get UB, is reason enough to add a const-constructor to sem_t

If you're writing signal handlers (in Rust or C), you need some obscure knowledge. Hopefully std::MaybeUninit::zeroed becomes const soon and the required knowledge becomes less obscure. For now, no other type in this crate has its own special special const constructor and I don't see what makes sem_t special. This seems more like the territory of nix.

@Diggsey
Copy link
Author

Diggsey commented Feb 22, 2021

mutex_ptr can't be static since it isn't Sync.

Luckily static mut doesn't need to be sync, and I would only need read-access to the pointer from the signal handler :)

For now, no other type in this crate has its own special special const constructor and I don't see what makes sem_t special

True, but sometimes language limitations require library-level solutions. TBH, I would be happy if all types defined by libc had a const fn new method.

@maxbla
Copy link
Contributor

maxbla commented Feb 26, 2021

I think this should be closed in favor of rust-lang/rust#62061

@DoumanAsh
Copy link
Contributor

Some bit of self-ad.
I have crate to wrap platform APIs for semaphores https://github.com/DoumanAsh/semka
It provides const initialization as unsafe function and requires you to perform init when you need to use it.

@sweihub
Copy link

sweihub commented Nov 29, 2022

@DoumanAsh Thanks for your self-ad semaphore, however it's a local one not inter-process, will you add an inter-process one?

@RalfJung
Copy link
Member

mem::zeroed() is never going to be const

In fact it is const since Rust 1.75. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-API-request Category: API request
Projects
None yet
Development

No branches or pull requests

6 participants