Run bindgen on linux uapi? #2

Closed
laanwj opened this issue Nov 2, 2019 · 28 comments · Fixed by #8

Comments

@laanwj

laanwj commented Nov 2, 2019

What I've noticed is that none of the Rust syscall projects run bindgen on Linux's uapi headers. The uapi headers define the interface between user space and kernel space:

Maybe I'm missing something, but I don't understand why not—I suppose it's no help with the syscalls or numbers themselves (it declares no functions), but would give structures and constants, at least.

Tried to look around but there's no linux-sys crate. There is input-linux-sys though. Maybe it's just too much?

@elichai
Owner

elichai commented Nov 3, 2019

I think it's because it's really scattered across headers.
I will try to see which headers are actually interesting and come up with a script to autogenerate everything. That would be pretty great (though some of this stuff can/should be converted to Rust enums, which bindgen isn't so great at).

Right now I'm using https://github.com/rust-lang/libc for the types, but having this autogenerated would be much more foolproof.

@laanwj
Author

laanwj commented Nov 3, 2019

Right now I'm using https://github.com/rust-lang/libc for the types, but having this autogenerated would be much more foolproof.

I saw that, which is what made me wonder about this. I thought it was the wrong way around; after all, libc provides a wrapper over the kernel API which is often different from the API itself.

@elichai
Owner

elichai commented Nov 3, 2019

Yep. One simple example is rust-lang/libc#1561.
libc accepts flock, but internally it converts it into flock64 in some cases.

@elichai
Owner

elichai commented Nov 3, 2019

Another problem:
if we take O_APPEND for example,
it's in the generic headers (./include/uapi/asm-generic/fcntl.h) and sometimes also in the non-generic ones (e.g. ./arch/mips/include/uapi/asm/fcntl.h).
These rely on #ifndef guards. Can we be sure bindgen handles that correctly?

(I actually just tested exactly this and bindgen worked surprisingly well and gave me O_APPEND=8)
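
To make the #ifndef question concrete, here is a minimal Rust sketch of the resolution order the C preprocessor applies: an arch-specific header may define the constant first, and the generic header only supplies its own value if nothing was defined yet. The function name `resolve_o_append` is hypothetical, and the two values are assumed to be the ones found in the MIPS and asm-generic fcntl.h headers mentioned above.

```rust
/// Value assumed from include/uapi/asm-generic/fcntl.h (octal 00002000).
const GENERIC_O_APPEND: u32 = 0o2000;
/// Value assumed from arch/mips/include/uapi/asm/fcntl.h.
const MIPS_O_APPEND: u32 = 0x0008;

/// Mimics `#ifndef O_APPEND` in the generic header: an arch-specific
/// definition wins, otherwise the generic value is used.
/// (Hypothetical helper, not part of bindgen or any crate.)
fn resolve_o_append(arch_specific: Option<u32>) -> u32 {
    arch_specific.unwrap_or(GENERIC_O_APPEND)
}
```

With `Some(MIPS_O_APPEND)` this yields 8, which would be consistent with the MIPS header having been picked up in the test above; with `None` it falls back to the generic value.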

@laanwj
Author

laanwj commented Nov 3, 2019

Right, assuming the bindgen "wrapper" file is made to include the headers in the right way, it should be correct. bindgen is supposed to handle that the same way C does.

I think it's because it's really scattered across headers.

Absolutely. Normally the only real clients of uapi are C libraries and low-level platform-specific software such as mesa. If you're not going to implement graphics drivers in Rust, you can skip all the drm headers, for example.

(though some of this stuff can/should be converted to Rust enums, which bindgen isn't so great at)

True, it doesn't provide any kind of Rust-friendly binding; the output is very raw. But it's often a good base to build friendlier bindings on.

@elichai
Owner

elichai commented Nov 3, 2019

Feel free to look at a first try at that https://github.com/elichai/syscalls-rs/tree/linux-sys

@laanwj
Author

laanwj commented Nov 4, 2019

Nice! Seems like a good step towards no_std support.

@elichai
Owner

elichai commented Nov 4, 2019

I'm a bit scared to use these without a serious review (I'm not sure whether it also uses my system headers for some of the includes, which would kinda ruin the whole thing), but I'll keep working on it.

@elichai
Owner

elichai commented Nov 4, 2019

For example, just by also adding ./linux/include to C_INCLUDE_PATH, the generated bindings contain more things.
@theuni

@laanwj
Author

laanwj commented Nov 4, 2019

Good point. Isn't there a bindgen flag to ignore system headers?

Linux uapi headers are more or less self-sufficient and do not depend on any libc headers or anything else installed on the system.

Possibly, doing this properly is going to require patching bindgen 😨

@elichai
Owner

elichai commented Nov 4, 2019

I think I've found the right solution :)
It's a combination of make headers_install ARCH=$arch and the fact that bindgen actually calls clang, so I can pass clang flags (e.g. -nostdinc).
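
As a rough illustration of that combination, the sketch below shows two std-only helpers: one maps a Rust target triple to the ARCH name that `make headers_install` expects, and one builds the clang arguments that keep host headers out. Both function names are hypothetical, the arch table covers only a few common cases, and the `include` subdirectory assumes the usual layout under INSTALL_HDR_PATH.

```rust
/// Maps the architecture part of a target triple to the kernel's ARCH name
/// (the directory name under arch/). Only a few common cases are covered;
/// this table is an illustrative assumption, not an exhaustive list.
fn kernel_arch(target_triple: &str) -> Option<&'static str> {
    let arch = target_triple.split('-').next()?;
    match arch {
        "x86_64" | "i686" | "i586" => Some("x86"),
        "aarch64" => Some("arm64"),
        "arm" | "armv7" => Some("arm"),
        "riscv64gc" | "riscv32imac" => Some("riscv"),
        "mips" | "mipsel" | "mips64" | "mips64el" => Some("mips"),
        _ => None,
    }
}

/// Clang arguments that disable the default system include paths and point
/// only at the tree produced by `make headers_install`.
/// `installed_headers` is the directory used as INSTALL_HDR_PATH.
fn clang_args(installed_headers: &str) -> Vec<String> {
    vec![
        "-nostdinc".to_string(),
        format!("-I{}/include", installed_headers),
    ]
}
```

The resulting vector could then be passed to bindgen after the `--` separator on the command line, so that only the installed uapi tree is visible to clang.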

@elichai
Owner

elichai commented Nov 4, 2019

Pushed an update to https://github.com/elichai/syscalls-rs/tree/linux-sys
I'm a bit more confident in the correctness of this now.
I will need to maximize testing across different platforms,
and also write tests that compare the sizes of the types here against rust-lang/libc.

EDIT: Should this live on a separate git?
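
A size-comparison test of the kind described above might look roughly like the following sketch. The struct is a hand-written stand-in for what bindgen could emit for the uapi `struct __kernel_timespec` (assumed here to be two 64-bit signed fields), and `same_layout` is a hypothetical helper; in a real test the second type parameter would be the corresponding type from the libc crate.

```rust
use std::mem::{align_of, size_of};

/// Hand-written stand-in for a generated binding of the uapi
/// `struct __kernel_timespec` (assumed: two 64-bit signed fields).
#[repr(C)]
#[allow(non_camel_case_types)]
pub struct __kernel_timespec {
    pub tv_sec: i64,
    pub tv_nsec: i64,
}

/// Returns true when both types have the same size and alignment.
/// This does not compare field offsets, so it is only a first sanity check.
fn same_layout<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}
```

Here `same_layout::<__kernel_timespec, [i64; 2]>()` holds, while comparing against a type of a different size does not; field-by-field offset checks would be needed for a stronger guarantee.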

@laanwj
Author

laanwj commented Nov 4, 2019

and also try and write tests that will compare the sizes of the types here versus rust-lang/libc

Here it's both a blessing and a curse that bindgen uses types such as ::std::os::raw::c_schar and ::std::os::raw::c_int.
On one hand it introduces a dependency on std (this can be overridden with --ctypes-prefix; IIRC there's some work on moving these to core, and it looks like the "official" no-std crate for this is cty). On the other, at least this guarantees that the sizes match the C compiler for the architecture actually compiled for, not the one running bindgen.

EDIT: Should this live on a separate git?

That's up to you. There doesn't have to be a one-to-one crate to git repository relationship, and it's often the case that the -sys crate is part of the same repository.
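
To illustrate what a C types prefix changes, the sketch below uses a made-up local module named `ctypes` in the role that `std::os::raw` or `cty` plays in generated bindings. It re-exports the aliases from `core::ffi`, which newer Rust toolchains provide; that choice is an assumption for the sake of a std-free example, and the struct is hypothetical rather than real bindgen output.

```rust
/// Hypothetical local module standing in for the crate named by a
/// C types prefix; the name `ctypes` is made up for this example.
mod ctypes {
    // These aliases resolve per compilation target, not per host.
    pub use core::ffi::{c_int, c_long};
}

/// Hypothetical generated struct whose C definition uses `int` and `long`.
#[repr(C)]
#[allow(non_camel_case_types)]
pub struct example_binding {
    pub fd: ctypes::c_int,
    pub offset: ctypes::c_long,
}
```

Because the field types are aliases chosen by the compiler for the target, the struct's size follows the architecture being compiled for, which is the guarantee described above.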

@elichai
Owner

elichai commented Nov 4, 2019

On one hand it introduces a dependency on std (this can be overridden with --ctypes-prefix IIRC there's some work on moving these to core or another crate), on the other, at least this guarantees that the sizes match the C compiler for the architecture actually compiled for—not the one running bindgen.

I was involved in that discussion in the past. It's kinda stuck (people don't want OS/arch-specific stuff in libcore).
But at least they do want to take cty under the rust-lang umbrella (which is honestly good enough for me): rust-lang/libc#1286

Anyhow, there are still some problems I'm seeing with bindgen, specifically rust-lang/rust-bindgen#1665

@laanwj
Author

laanwj commented Nov 4, 2019

(which is honestly good enough for me)

Same. Just filed a PR for RISC-V support, should be good to go: japaric/cty#16

@laanwj
Author

laanwj commented Nov 4, 2019

Anyhow, there are still some problems I'm seeing with bindgen, specifically rust-lang/rust-bindgen#1665

Ugh. That'd be really hard to detect and convert on the bindgen side. We should make sure that __i386__ and __ILP32__ and such are defined correctly for the target platform when invoking bindgen (instead of relying on clang to do this).

@elichai
Owner

elichai commented Nov 4, 2019

I think it mostly depends on whether llvm tokenizes ifdefs, or whether the preprocessor runs before codegen
(because IIRC bindgen literally just uses llvm IR back and forth).

How can we define macros like __i386__? Aren't they builtins?

@laanwj
Author

laanwj commented Nov 4, 2019

How can we define macros like __i386__? Aren't they builtins?

I suppose bindgen only runs the preprocessor, so they could be manually defined? If not, we kind of need a --no-default-defines flag.

Edit: but at least we understand now why a linux-sys crate wasn't a thing yet 😄

@elichai
Owner

elichai commented Nov 4, 2019

hehe yeah.
I have a feeling the only way will be inputting a bunch of #undef. but I hope I'm wrong
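
One way the #undef idea could be expressed without editing any header is through clang's -U and -D options. The sketch below only builds such a flag list; `macro_overrides` is a hypothetical helper, and whether overriding predefined macros this way actually produces correct bindings for another target is an open assumption. Passing `--target=<triple>` to clang is the more usual way to get the right predefined macros.

```rust
/// Builds clang flags that undefine some macros and then define others.
/// (Hypothetical helper; it only formats arguments and does not run clang.)
fn macro_overrides(undefine: &[&str], define: &[&str]) -> Vec<String> {
    let mut flags = Vec::new();
    for name in undefine {
        // -UNAME removes any existing definition of NAME.
        flags.push(format!("-U{}", name));
    }
    for name in define {
        // -DNAME defines NAME (to 1 when no value is given).
        flags.push(format!("-D{}", name));
    }
    flags
}
```

For instance, `macro_overrides(&["__x86_64__"], &["__i386__", "__ILP32__"])` would produce the three flags `-U__x86_64__`, `-D__i386__` and `-D__ILP32__`.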

@dvc94ch
Contributor

dvc94ch commented Nov 4, 2019

Is the Linux uapi supposed to be sufficient?

AF_INET and SOCK_DGRAM are declared in include/linux/socket.h and not in include/uapi/linux/socket.h. This makes me kind of confused about the purpose of uapi in the first place.

@elichai
Owner

elichai commented Nov 4, 2019

Sadly no.
We can try to push upstream (the Linux kernel) to expose more of the constants needed for syscalls (like AF_INET and SHUT_*).
But currently some of the syscalls are a mess in terms of headers.

@laanwj
Author

laanwj commented Nov 4, 2019

Huh, I didn't know. Looks like uapi is still a work in progress: the internal headers are still being untangled from the userspace interface ones.

@elichai
Owner

elichai commented Nov 4, 2019

I can try to take a few constants and send a patch to the linux mailing list and see how people react to that idea.
and then we can slowly add support for anything we're missing (I hope not too many lol)

@laanwj
Author

laanwj commented Nov 5, 2019

The headers_install step is a good idea, it works around that issue for now.

@dvc94ch
Contributor

dvc94ch commented Nov 5, 2019

The headers_install step only installs the uapi. I think that the most effective approach is to start by using libc to implement and scope out what is required, then taking the pragmatic approach of copying the required definitions into a local libc-defs and then incrementally send patches upstream until libc-defs becomes obsolete.

@laanwj
Author

laanwj commented Nov 5, 2019

So there are implicit definitions in the C library headers that just happen to match the value the kernel implementation expects?

That's worse than I thought. I assumed the selection happened inside the Linux kernel tree, where the headers get installed from. But apparently some headers are completely external.

Even worse, I suppose, if these definitions can differ per platform.

I can't really use the libc crate at the moment (building in a no_std environment), but I'm starting to think this was all just a bad idea 😄 I have no real need for this and was just wondering how far I could take it.

@elichai
Owner

elichai commented Nov 5, 2019

You can use the libc crate on no_std by passing default-features = false https://github.com/rust-lang/libc/blob/master/Cargo.toml#L26

Anyhow, yeah. I hope the diff between kernel headers and libc headers isn't that big, so that we can compensate for it by maintaining our own stuff and slowly upstreaming (or even using the libc crate for that until we upstream everything).

@elichai
Owner

elichai commented Nov 8, 2019

Looks like the only sane way to do this that will also handle the ifdefs is via build time bindgen (unlike how I wanted, via scripts)

rust-lang/rust-bindgen#1665 (comment)

EDIT: this is what that looks like: https://github.com/elichai/syscalls-rs/tree/linux-sys2/linux-sys
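
For readers unfamiliar with build-time bindgen, the sketch below shows the decisions a build script would make before invoking it: which clang arguments to use for the current target and where to write the output. `BindgenPlan` and `plan_bindgen` are hypothetical names, this is not the code from the linked branch, and it assumes the Rust target triple is acceptable to clang as-is, which does not hold for every target.

```rust
use std::path::{Path, PathBuf};

/// What a build script needs to decide before calling bindgen.
/// (Hypothetical type for illustration only.)
struct BindgenPlan {
    clang_args: Vec<String>,
    output: PathBuf,
}

/// `target` and `out_dir` correspond to the TARGET and OUT_DIR environment
/// variables that Cargo sets for build scripts.
fn plan_bindgen(target: &str, out_dir: &Path) -> BindgenPlan {
    BindgenPlan {
        // Let clang pick the predefined macros for the target being built,
        // and keep the host's system headers out of the include path.
        clang_args: vec![format!("--target={}", target), "-nostdinc".to_string()],
        output: out_dir.join("bindings.rs"),
    }
}

// In a real build.rs (assuming the `bindgen` crate is a build-dependency and
// a hypothetical wrapper.h includes the wanted uapi headers), the plan would
// feed something like:
//   bindgen::Builder::default()
//       .header("wrapper.h")
//       .clang_args(&plan.clang_args)
//       .generate()
//       .expect("bindgen failed")
//       .write_to_file(&plan.output)
//       .expect("could not write bindings");
```

Because the script runs once per compilation target, the #ifdef branches in the headers are resolved for that target instead of for whichever machine ran a generation script ahead of time.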
