Skip to content

Source for union claim? #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
workingjubilee opened this issue Dec 25, 2024 · 7 comments
Open

Source for union claim? #1

workingjubilee opened this issue Dec 25, 2024 · 7 comments

Comments

@workingjubilee
Copy link

Hello, I was reviewing "An Empirical Study of Rust-for-Linux".

Although Rust has a union primitive, it cannot provide ABI compatibility with C union. Thus RFL implements a struct called __BindgenUnionField with the same memory layout as the C interface.

In practice, rust-bindgen can use native unions if using rustc 1.19 or higher. A #[repr(C)] union is in fact ABI compatible with a C union. The reason for the __BindgenUnionField workaround is not because of ABI compatibility, but because... I don't know, have people just not read bindgen's documentation? https://rust-lang.github.io/rust-bindgen/using-unions.html?highlight=union#which-union-type-will-bindgen-generate

It only fails to emit them when the type's fields neither implement Copy nor are allowed to use ManuallyDrop.

I am curious from where you derived the empirical source of your claim?

@Richardhongyu
Copy link
Owner

Hi, Jubilee. Sorry for the late, I've been stuck on some school stuff in the past weeks.

Our observation is from bindings_generated.rs, generated by the rust-bindgen. This file has Rust code corresponding with the C header files. We found that despite Rust having a union type, there are still some cases in which rust-bindgen needs to use __BindgenUnionField for the same layout.

This is still true in the current time. I use the latest RFL code and the default Linux config to generate the bindings_generated.rs file. I have attached it behind and you can have a look.

For example, you can find __BindgenUnionField in bpf_array and mem_cgroup. The reason may be that the inner type does not implement the Copy trait. In this case, Rust's union type does not have the same ABI as the C union. Thus rust-bindgen needs __BindgenUnionField to provide the same memory layout between Rust and C.

#[repr(C)]
pub struct bpf_array {
    pub map: bpf_map,
    pub elem_size: u32_,
    pub index_mask: u32_,
    pub aux: *mut bpf_array_aux,
    pub __bindgen_anon_1: bpf_array__bindgen_ty_1,
}
#[repr(C)]
pub struct bpf_array__bindgen_ty_1 {
    pub __bindgen_anon_1: __BindgenUnionField<bpf_array__bindgen_ty_1__bindgen_ty_1>,
    pub __bindgen_anon_2: __BindgenUnionField<bpf_array__bindgen_ty_1__bindgen_ty_2>,
    pub __bindgen_anon_3: __BindgenUnionField<bpf_array__bindgen_ty_1__bindgen_ty_3>,
    pub bindgen_union_field: [u64; 0usize],
}
#[repr(C)]
#[repr(align(64))]
pub struct mem_cgroup {
    pub css: cgroup_subsys_state,
    pub id: mem_cgroup_id,
    pub __bindgen_padding_0: [u64; 5usize],
    pub memory: page_counter,
    pub __bindgen_anon_1: mem_cgroup__bindgen_ty_1,
    pub memory_peaks: list_head,
    pub swap_peaks: list_head,
    pub peaks_lock: spinlock_t,
    pub high_work: work_struct,
    pub vmpressure: vmpressure,
    pub oom_group: bool_,
    pub swappiness: ffi::c_int,
    pub events_file: cgroup_file,
    pub events_local_file: cgroup_file,
    pub swap_events_file: cgroup_file,
    pub vmstats: *mut memcg_vmstats,
    pub memory_events: [atomic_long_t; 9usize],
    pub memory_events_local: [atomic_long_t; 9usize],
    pub socket_pressure: ffi::c_ulong,
    pub kmemcg_id: ffi::c_int,
    pub objcg: *mut obj_cgroup,
    pub orig_objcg: *mut obj_cgroup,
    pub objcg_list: list_head,
    pub vmstats_percpu: *mut memcg_vmstats_percpu,
    pub cgwb_list: list_head,
    pub cgwb_domain: wb_domain,
    pub cgwb_frn: [memcg_cgwb_frn; 4usize],
    pub deferred_split_queue: deferred_split,
    pub nodeinfo: __IncompleteArrayField<*mut mem_cgroup_per_node>,
}
#[repr(C)]
#[repr(align(64))]
pub struct mem_cgroup__bindgen_ty_1 {
    pub swap: __BindgenUnionField<page_counter>,
    pub memsw: __BindgenUnionField<page_counter>,
    pub bindgen_union_field: [u8; 256usize],
}

bindings_generated.zip

@workingjubilee
Copy link
Author

"This is present in the code" does not a claim about ABI compatibility make.

@workingjubilee
Copy link
Author

"P -> Q, therefore P -> R" does not hold unless you inject some other assumption.

Your conclusion is "This is emitted by bindgen, therefore the union must be ABI incompatible."

Your assumption must be "They would obviously use union if it was at-all compatible, so if this is still emitted by bindgen, that must mean it is not compatible."

The reality is more like "This is emitted by bindgen because that's bindgen's default configuration, because it has frozen its defaults for no obvious reason, and no one changed the settings."

And bindgen can also handle using ManuallyDrop for the Copy case, but again, you need to configure it to do so. These, too, will have the same binary interface.

cc @ojeda @pvdrz

Ultimately,

In this case, Rust's union type does not have the same ABI as the C union.

This is false, and you have not actually supported the assertion. It has the same binary interface. You may be trying to assert it has a different logical type in Rust than in C, but that doesn't matter: the type does not matter to the C code in the same way. A #[repr(C)] union, with ManuallyDrop wrappers where necessary, will have the same physical binary interface.

@chorman0773
Copy link

chorman0773 commented Jan 15, 2025

In this case, Rust's union type does not have the same ABI as the C union.

Also, if it is true and the union is corretly using repr(C) (which is what bindgen uses), then that's a bug and should be reported. An example of such a case, including binary output on the platforms where it differs in ABI, would be useful, I'd assume.

@pvdrz
Copy link

pvdrz commented Jan 15, 2025

I agree with @workingjubilee, bindgen has maintained certain features as the default due to historical reasons and it is up to the user to enable the right features for the desired target version.

As Jubilee already explained, the custom union type produced by bindgen is only used due to the lack of union in older Rust versions, not due to an ABI incompatibility.

@Richardhongyu
Copy link
Owner

"This is present in the code" does not a claim about ABI compatibility make.

Thanks for pointing this out!

@Richardhongyu
Copy link
Owner

Richardhongyu commented Jan 16, 2025

The reality is more like "This is emitted by bindgen because that's bindgen's default configuration, because it has frozen its defaults for no obvious reason, and no one changed the settings."

And bindgen can also handle using ManuallyDrop for the Copy case, but again, you need to configure it to do so. These, too, will have the same binary interface.

However, I'm still confused by the flexible array case. Suppose bindgen can not generate the rust union with the correct configuration, can we say it is not ABI compatible? I tried to add --manually-drop-union ".*" in the rust-bindgen command and it turns out that bindgen can not use the rust union type when it meets unions with zero length.

For example, the union in the struct bpf_array still can not have corresponding rust union.

struct bpf_array {
	struct bpf_map map;
	u32 elem_size;
	u32 index_mask;
	struct bpf_array_aux *aux;
	union {
		DECLARE_FLEX_ARRAY(char, value) __aligned(8);
		DECLARE_FLEX_ARRAY(void *, ptrs) __aligned(8);
		DECLARE_FLEX_ARRAY(void __percpu *, pptrs) __aligned(8);
	};
};

#define DECLARE_FLEX_ARRAY(TYPE, NAME) \
	__DECLARE_FLEX_ARRAY(TYPE, NAME)

#define __DECLARE_FLEX_ARRAY(TYPE, NAME)	\
	struct { \
		struct { } __empty_ ## NAME; \
		TYPE NAME[]; \
	}
#[repr(C)]
pub struct bpf_array {
    pub map: bpf_map,
    pub elem_size: u32_,
    pub index_mask: u32_,
    pub aux: *mut bpf_array_aux,
    pub __bindgen_anon_1: bpf_array__bindgen_ty_1,
}
#[repr(C)]
pub struct bpf_array__bindgen_ty_1 {
    pub __bindgen_anon_1: __BindgenUnionField<bpf_array__bindgen_ty_1__bindgen_ty_1>,
    pub __bindgen_anon_2: __BindgenUnionField<bpf_array__bindgen_ty_1__bindgen_ty_2>,
    pub __bindgen_anon_3: __BindgenUnionField<bpf_array__bindgen_ty_1__bindgen_ty_3>,
    pub bindgen_union_field: [u64; 0usize],
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants