Signedness of c_char differs on ARMv7 vs. x86_64 #48633

Closed
hsivonen opened this issue Mar 1, 2018 · 6 comments

Comments

@hsivonen
Member

hsivonen commented Mar 1, 2018

On Linux on ARMv7, c_char is u8. On Linux on x86_64, c_char is i8. See the XPCOM bug that caused me to find out.

Does importing this design flaw of the C language into Rust solve any real problem, considering that Rust runs only on two's-complement systems and the FFI ABI for u8 and i8 is the same?

If not, I suggest making the signedness of c_char in Rust the same across platforms. Ideally, it would be u8, but since x86_64 is the most common platform that developers use and it's i8 there, it might not be feasible to use u8 without breaking existing code. (Theoretically, changing the signedness for ARMv7 is a breaking change, but in practice it isn't a breaking change for code that already compiles on x86_64 as well, which is most crates except maybe some embedded specialty crates.)
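For readers who want to see the difference concretely, here is a minimal sketch that reports which way c_char goes on the target it is compiled for. The helper name `c_char_is_signed` is made up for illustration; only `std::os::raw::c_char` comes from the standard library.

```rust
use std::os::raw::c_char;

/// Hypothetical helper: true when `c_char` is signed on the compilation target.
fn c_char_is_signed() -> bool {
    // `c_char` is an alias for `i8` or `u8`; only `i8::MIN` is negative.
    (c_char::MIN as i32) < 0
}

fn main() {
    // 0xE9 has its high bit set, so the widened value depends on signedness.
    let byte = 0xE9u8 as c_char;
    println!("c_char is signed: {}", c_char_is_signed());
    println!("0xE9 as c_char widens to {}", byte as i32);
}
```

On a target where c_char is i8 the widened value is negative, and on a target where it is u8 it stays positive, so any code that compares c_char values against 0 or 0x80 can behave differently per platform.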

@kennytm
Member

kennytm commented Mar 1, 2018

You should use c_schar (equivalent to signed char) or c_uchar (equivalent to unsigned char) if you want platform-independent signedness. The c_char type by definition is the same as C's char type and we're not going to change this regardless of how flawed C's design is.

(cc #41619, the portability lint should ensure that users do not rely on c_char being signed)
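A small sketch of what the fixed-signedness aliases give you. The helper names `as_signed` and `as_unsigned` are hypothetical; `c_schar` and `c_uchar` are the standard aliases from `std::os::raw`.

```rust
use std::os::raw::{c_schar, c_uchar};

/// Hypothetical helper: reinterpret a raw byte as C's `signed char`.
fn as_signed(byte: u8) -> c_schar {
    byte as c_schar
}

/// Hypothetical helper: reinterpret a raw byte as C's `unsigned char`.
fn as_unsigned(byte: u8) -> c_uchar {
    byte as c_uchar
}

fn main() {
    // The same bit pattern, with a signedness that does not vary by target.
    println!("{}", as_signed(0xE9)); // prints -23
    println!("{}", as_unsigned(0xE9)); // prints 233
}
```

Unlike c_char, these results are the same on ARMv7 and x86_64, which is the property the suggestion above relies on.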

@hsivonen
Member Author

hsivonen commented Mar 1, 2018

You should use c_schar (equivalent to signed char) or c_uchar (equivalent to unsigned char) if you want platform-independent signedness.

When working with, e.g., CString from the standard library, the choice of type isn't up to the programmer. (When it is up to the programmer, the most practical solution is to use u8 and i8 and to ignore the concern that Rust might someday be ported to a DSP where C char isn't 8 bits wide.)
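To illustrate the CString case: `CString::as_ptr` hands back a `*const c_char`, so the pointer type does vary by target. One way to keep the rest of the code signedness-agnostic, sketched here under the assumption that only the bytes matter, is to go through `CStr::to_bytes`, which always yields `&[u8]`. The helper `bytes_from_c` is hypothetical.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Hypothetical helper: copy the bytes of a NUL-terminated C string.
///
/// # Safety
/// `ptr` must point to a valid NUL-terminated string that is live for this call.
unsafe fn bytes_from_c(ptr: *const c_char) -> Vec<u8> {
    // `to_bytes` returns `&[u8]`, so the signedness of `c_char` never surfaces.
    unsafe { CStr::from_ptr(ptr) }.to_bytes().to_vec()
}

fn main() {
    let owned = CString::new("héllo").expect("no interior NUL");
    // The pointee type here is `i8` on some targets and `u8` on others.
    let ptr: *const c_char = owned.as_ptr();
    let bytes = unsafe { bytes_from_c(ptr) };
    assert_eq!(bytes, "héllo".as_bytes());
}
```

Code that instead spells the pointer type as `*const i8` or `*const u8` compiles only on targets where that happens to match c_char.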

The c_char type by definition is the same as C's char type and we're not going to change this regardless of how flawed C's design is.

Does this policy position solve an actual technical problem for users of Rust? It seems to cause one.

@cuviper
Member

cuviper commented Mar 1, 2018

There are actually quite a few targets with unsigned c_char, and its documentation directly states that it could be either way.

Does this policy position solve an actual technical problem for users of Rust?

It represents the reality of C, and deviating from this may cause subtle bugs too.

@hsivonen
Member Author

hsivonen commented Mar 2, 2018

It represents the reality of C,

This doesn't explain the value to Rust. So far, I don't believe it's technically valuable to leak this part of C reality to Rust. So far, I believe it's actively harmful. (I also think it's bad for Rust to import foreign design quirks wholesale for the political reason of not having to make case-by-case technical decisions, possibly via the RFC process.)

and deviating from this may cause subtle bugs too.

What kind of bugs? In what kind of case would a Rust program want to vary its processing depending on the signedness of C char on the other side of the FFI?

@petrochenkov
Contributor

In what kind of case would a Rust program want to vary its processing depending on the signedness of C char on the other side of the FFI?

IIRC, ABIs can treat signed and unsigned integers differently (something like #31725).

@hsivonen
Member Author

hsivonen commented Mar 2, 2018

IIRC, ABIs can treat signed and unsigned integers differently (something like #31725).

Ah, that makes sense for the types themselves. I was too much in the mode of thinking of pointers to c_char.
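The value-versus-pointer distinction can be sketched in plain Rust. This is only an illustration of integer widening, not a description of any particular target's calling convention or of what #31725 covers; the helper names `widen` and `read_both` are made up.

```rust
/// Widen one bit pattern as a signed and as an unsigned 8-bit value.
fn widen(bits: u8) -> (i32, i32) {
    let signed = bits as i8 as i32; // sign-extended
    let unsigned = bits as i32; // zero-extended
    (signed, unsigned)
}

/// Read the same byte through pointers of both signednesses.
fn read_both(byte: &u8) -> (u8, u8) {
    let p_unsigned: *const u8 = byte;
    let p_signed: *const i8 = p_unsigned.cast();
    // SAFETY: both pointers derive from a live reference to a single byte.
    unsafe { (*p_unsigned, *p_signed as u8) }
}

fn main() {
    // Passed by value and widened, the two interpretations diverge.
    println!("{:?}", widen(0x80)); // prints (-128, 128)
    // Behind a pointer, the stored byte is identical either way.
    println!("{:?}", read_both(&0x80)); // prints (128, 128)
}
```

When a char-sized value is widened to a larger integer, the signedness decides whether it is sign-extended or zero-extended, whereas a pointer to the byte carries no such difference.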
