Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Signedness mismatch #1594

Open
npmccallum opened this issue Jul 16, 2019 · 11 comments
Open

Signedness mismatch #1594

npmccallum opened this issue Jul 16, 2019 · 11 comments

Comments

@npmccallum
Copy link

Input C/C++ Header

#define FOO 253

Bindgen Invocation

$ bindgen input.h

Actual Results

/* automatically generated by rust-bindgen */

pub const FOO: u32 = 253;

Expected Results

/* automatically generated by rust-bindgen */

pub const FOO: ::std::os::raw::c_int = 253;

Summary

Bindgen outputs a u32 instead of c_int for defines. This makes them difficult to work with due to signedness mismatch.

@npmccallum
Copy link
Author

See https://en.cppreference.com/w/cpp/language/integer_literal

Specifically the section: "The type of the literal"

Bindgen should not attempt to detect signedness. It should rely on the u or U suffix for that.

@emilio
Copy link
Contributor

emilio commented Jul 17, 2019

Well, in practice most people don't really use the u / U suffix, so we try to pick an appropriate size. In any case you can override this: https://docs.rs/bindgen/0.50.0/bindgen/callbacks/trait.ParseCallbacks.html#method.int_macro

@npmccallum
Copy link
Author

There are two issues here: size and signedness. Size detection is fine. Signedness is not.

Trying to detect sign is wrong. Half the time you will get it wrong and people will have to fix it up in the rust code. The C specification is very clear what sign is assigned to literals. The only difference is that Rust doesn't have implicit conversion but C does.

Put simply, you're trying to work around a bug in the source file. This is absolutely a bad default. The default should be to follow the C specification. Anything else is unpredictable.

@emilio
Copy link
Contributor

emilio commented Jul 19, 2019

That is fair. I'm not opposed to changing behavior here in theory. The macro evaluator (https://github.com/jethrogb/rust-cexpr) doesn't provide any information other than the final i64 value, so fixing this requires changing that first, aiui.

@npmccallum
Copy link
Author

@emilio I've been doing some thinking about this problem. Obviously, cexpr still needs to provide us the information. However, bindgen also discards information. Specifically, it converts an integer literal to a typed integer constant without any information of how it is used. Committing to a type ahead of time results in ergonomic problems later.

What if instead of making C macro definitions into Rust constants we turned them into Rust macros? We obviously trade some syntactic nicety for ergonomics. Code that uses the macros would get integer literals and Rust would automatically promote the type at the site of use.

Thoughts?

@emilio
Copy link
Contributor

emilio commented Sep 16, 2019

It may be ok to do that, but I'm not 100% sure ergonomics are better... Specially since macros in pre-2018 edition (I think?) are subject to module ordering, so if you have:

mod bindings;
mod other; // <- this uses the macros in `bindings`

It'd work, while:

mod other; // <- this uses the macros in `bindings`
mod bindings;

wouldn't.

@npmccallum
Copy link
Author

npmccallum commented Sep 16, 2019

@emilio The ergonomics of pre-2018 Rust are really a low priority to me. So much has happened since 2018 in the language that if you're worried about ergonomics you just update to stable or even nightly.

The bigger concern that I see is that bindgen would break its current API. But it solves this problem:

#define CONSTANT 1234

void foo(uint64_t value);
void bar(int value);

Currently, bindgen doesn't work without casts in either of the above function calls. You end up with the following output:

pub const CONSTANT: u32 = 1234;

extern "C" {
    pub fn foo(value: u64);
}

extern "C" {
    pub fn bar(value: ::std::os::raw::c_int);
}

The problem is, of course, that CONSTANT can't be passed to either function without a cast. This is because bindgen chooses the type for CONSTANT ahead of time.

My proposal is to make bindgen emit this instead:

#[macro_export]
macro_rules! CONSTANT {
    () => { 1234 }
}

extern "C" {
    pub fn foo(value: u64);
}

extern "C" {
    pub fn bar(value: ::std::os::raw::c_int);
}

In this case, both of these calls works as expected:

foo(CONSTANT!());
bar(CONSTANT!());

@jethrogb
Copy link
Contributor

I like the idea in general, but I think it's worthwhile keeping the existing behavior around for libraries with a sane API. Maybe we can make this configurable using the ParseCallbacks?

@npmccallum
Copy link
Author

@jethrogb Configurable seems sane to me.

@moosterhof
Copy link

Slightly related, size detection also fails: #923

@ojeda
Copy link
Contributor

ojeda commented Sep 28, 2023

The macro idea is discussed at #1594, perhaps the discussion around that should be moved there.

However, bindgen also discards information. Specifically, it converts an integer literal to a typed integer constant without any information of how it is used.

The bigger concern that I see is that bindgen would break its current API. But it solves this problem:

#define CONSTANT 1234

void foo(uint64_t value);
void bar(int value);

Currently, bindgen doesn't work without casts in either of the above function calls.

Please see my reply at #2120 (comment) -- the discussion is very similar.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants