Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitfields raw pointer accessors #2674

Closed
y86-dev opened this issue Oct 28, 2023 · 13 comments · Fixed by #2876
Closed

bitfields raw pointer accessors #2674

y86-dev opened this issue Oct 28, 2023 · 13 comments · Fixed by #2876
Labels
enhancement help wanted rust-for-linux Issues relevant to the Rust for Linux project

Comments

@y86-dev
Copy link

y86-dev commented Oct 28, 2023

When creating bindings for a struct with bitfields the accessor functions that are created take an immutable receiver (&self). This is a problem if the underlying type is used by both Rust and C, since Rust is not allowed to create &self references, since the C side might modify the value at any time.
One solution could be to create raw accessor functions that allow access via *const Self.

Input C Header

struct foo {
    unsigned available:1;
};

Bindgen Invocation

$ bindgen input.h
impl foo {
    #[inline]
    pub fn available(&self) -> ::std::os::raw::c_uint {
        unsafe { ::std::mem::transmute(self._bitfield_1.get(0usize, 1u8) as u32) }
    }
    #[inline]
    pub fn set_available(&mut self, val: ::std::os::raw::c_uint) {
        unsafe {
            let val: u32 = ::std::mem::transmute(val);
            self._bitfield_1.set(0usize, 1u8, val as u64)
        }
    }
    #[inline]
    pub fn new_bitfield_1(
        available: ::std::os::raw::c_uint,
    ) -> __BindgenBitfieldUnit<[u8; 1usize]> {
        let mut __bindgen_bitfield_unit: __BindgenBitfieldUnit<[u8; 1usize]> = Default::default();
        __bindgen_bitfield_unit.set(0usize, 1u8, {
            let available: u32 = unsafe { ::std::mem::transmute(available) };
            available as u64
        });
        __bindgen_bitfield_unit
    }
}

Expected Results

Additional functions that should be generated:

impl foo {
    #[inline]
    pub unsafe fn raw_available(this: *const Self) -> ::std::os::raw::c_uint {
        unsafe {
            ::std::mem::transmute(__BindgenBitfieldUnit::raw_get(
                addr_of!((*this)._bitfield_1),
                0usize,
                1u8,
            ) as u32)
        }
    }

    #[inline]
    pub unsafe fn raw_set_available(this: *mut Self, val: ::std::os::raw::c_uint) {
        unsafe {
            let val: u32 = ::std::mem::transmute(val);
            __BindgenBitfieldUnit::raw_set(addr_of!((*this)._bitfield_1), 0usize, 1u8, val as u64)
        }
    }
}

This also requires changes to the generated __BingenBitfieldUnit.

@tgross35
Copy link
Contributor

@pvdrz gentle nudge, do you have any thoughts on the proposal here? I think RFL may have a use case

@tgross35
Copy link
Contributor

Additional context since I keep forgetting: https://lore.kernel.org/rust-for-linux/ba9614cf-bff6-4617-99cb-311fe40288c1@proton.me/ and its parent

Basically we have a C struct that has interior mutability on some fields, so usually it is in an UnsafeCell. Using these bitfield methods mean that you need to create an &self without the UnsafeCell - that is, you have a reference to something with interior mutability but the compiler has no way of knowing it has that interior mutability. So incorrect things can potentially happen quietly.

@Lokathor
Copy link
Contributor

When you say "since the C side might modify the value at any time.", I'm unclear if you mean other threads will modify the data concurrently?

Otherwise, if this is all within a single thread, then there's not really a problem as far as I can tell.

@tgross35
Copy link
Contributor

In the use case, we have an `UnsafeCell< pointer to a struct with:

  • Some fields that never change while we have it (i.e. are const or locking is handled by the caller)
  • Some fields that are mutex-protected and could change if we are preempted while using it

The issue is that it is in general incorrect to take &* reference to the struct to access the const fields because the other could change, but that is required for bitfield access. I think the risk for UB here is exceedingly low if you are not also accessing the locked fields, but it is a bit of an easy footgun telling rustc something that isn't true.

I'm not 100% sure I understand this all myself and the locking story is complicated, so I may be misrepresenting some of it.

@Lokathor
Copy link
Contributor

Well, unsafecell doesn't protect you from any concurrency problems. If there's supposed to be a lock, then the lock must be held to correctly read or write the available field even if a &foo reference is never actually made, because raw pointers still have to follow data race rules.

@tgross35
Copy link
Contributor

I don't think the parent issue showcases the problem well, it's more like:

#[repr(C)]
struct foo {
    // this is a bitfield
    field: u8,
    mtx: mutex_t,
    // mutex protected
    protected: u8
}

// this is how you reference the type normally
struct FooWrapper(UnsafeCell<foo>);

Then you must construct a &foo via unsafe { &*foo_wrapper.0.get() } to use these bitfield accessors. The problem is with:

  1. Context A creates &foo and access field
  2. Preemption hits, context B locks mtx, changes protected, and unlocks
  3. Context A resumes and &foo points to changed data, rustc has no way of knowing about it
  4. Context A accesses field

This is as far as it goes, and I think the incorrectness is purely semantic meaning this is something miri would flag if it could (I believe), but there is no actual UB.

@tgross35
Copy link
Contributor

I guess there could be a theoretical race condition if context A then tries to lock mtx and access protected. Locking the mutex will prevent reordering at that level - but maybe rustc could take the "const & with no UnsafeCell never ever changes" very literally and do some reordering itself. So it could load field and protected at the same time (step 4 above) and then lock mtx - meaning you would be getting an unsynchronized read of protected.

That seems like a stretch but maybe not impossible? Probably need @y86-dev to clarify things

@Lokathor
Copy link
Contributor

I think I follow now. The quick fix might be to distribute the UnsafeCell over the individual fields of foo, but that's got some ergonomic drawbacks i expect

@y86-dev
Copy link
Author

y86-dev commented Nov 24, 2023

I guess there could be a theoretical race condition if context A then tries to lock mtx and access protected. Locking the mutex will prevent reordering at that level - but maybe rustc could take the "const & with no UnsafeCell never ever changes" very literally and do some reordering itself. So it could load field and protected at the same time (step 4 above) and then lock mtx - meaning you would be getting an unsynchronized read of protected.

That seems like a stretch but maybe not impossible? Probably need @y86-dev to clarify things

Your example from above is much better than mine, since it actually has a mutex. But the protected field should also be a bitfield, so

// bingden generates
struct Foo {
    protected: u8, // bitfield
    mtx: mutex,
}

The problem lies with the implementation of mutex, since it contains a wait_list, an intrusive circular doubly linked list that contains all threads waiting for the mutex. This list might change even if we lock the mutex (since at any time another thread might try to lock it, but since we hold the lock, it goes to sleep).

I think I follow now. The quick fix might be to distribute the UnsafeCell over the individual fields of foo, but that's got some ergonomic drawbacks i expect

Since Foo is generated by bingden, I do not know if that is possible.

@Lokathor
Copy link
Contributor

I definitely don't think it's currently possible, someone would have to add that ability to bindgen.

@y86-dev
Copy link
Author

y86-dev commented Nov 24, 2023

I would prefer the solution outlined in this issue, since relying on raw pointers also avoids the aliasing problems we would have when having a &mut T instead of a &T.

@Lokathor
Copy link
Contributor

You do lose the ability to use method syntax, which makes the API kinda weird, but otherwise it's also fine, yes.

@emilio
Copy link
Contributor

emilio commented Feb 2, 2024

Sounds good to add this fwiw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement help wanted rust-for-linux Issues relevant to the Rust for Linux project
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants