Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wip: Add c_char type #861

Closed
wants to merge 1 commit into from
Closed

wip: Add c_char type #861

wants to merge 1 commit into from

Conversation

tiehuis
Copy link
Member

@tiehuis tiehuis commented Mar 27, 2018

Some preliminary work on adding a c_char type to correspond to the char c type. Any questions/discussion is welcome.

Why do we need this?

Currently a char type is translated from c as a u8. This is wrong for two reasons.

First, the sign of a char for each platform is implementation defined. This causes issues since it means both u8 and i8 are not compatible with char. This is easy to see when trying to create an exportable zig function which takes a char compatible parameter (impossible with the status quo). Even common targets differ; arm for example uses an unsigned char vs. x86_64 using a signed char.

Second, a char is not necessarily 8 bits. However, on all common platforms this is true (and I think all LLVM support?). This is of much lower concern and I think the assumption that it is always 8-bits is fine to keep right now.

What to add

  • translate-c needs to convert char to c_char types
  • translate-c needs to check the requested target environment and generate different headers depending on the targets char signedness.
  • documentation updates
  • update stdlib os functions to handle different c string type (see questions).
  • the stdlib needs to be updated at the os level to handle the change of type for cstring types which now are of type []const c_char instead of []const u8.

Questions

Would implicit casting from []const c_char to []const u8 make sense? I've had a brief look at changing the required os functions and it makes usage a fair bit more annoying to use c strings with low-level os zig functions. I may follow up with a port of the linux os functions and see how that fares. We can always do an explicit @ptrCast.

@andrewrk
Copy link
Member

andrewrk commented Mar 27, 2018

My original rationale for not having a c_char type is this:

  • C leaves the signed-ness up to the implementation, and therefore C source code which uses char is defined to be correct regardless of the signedness that is used for char.
  • Therefore we can just pick unsigned, and C source code will be correct. For example when doing translate-c, I believe it is sound to translate char to u8.
  • Having the signed-ness be (non-implementation-) defined is desirable

You have pointed out a use case where this falls short:

This is easy to see when trying to create an exportable zig function which takes a char compatible parameter (impossible with the status quo).

I agree that zig should be able to export a function (or type) that uses the char C type. Although I think that explicitly using unsigned or signed is better in new C APIs, one of Zig's use cases is matching a C library's interface but having a different implementation.

But is there another use case for c_char? If not, here's an idea: c_char is exactly the same as u8, except for the purposes of generating .h files, in which case it generates char instead of unsigned char. Translate-c would still translate char as u8. We would not need to modify the zig std lib.

One more thing:

translate-c needs to check the requested target environment and generate different headers depending on the targets char signedness.

I believe clang takes care of this automatically, provided that we are passing the correct -triple argument; I don't remember if we implemented this yet or not.

@andrewrk
Copy link
Member

andrewrk commented Mar 27, 2018

One more thing: if this is for #514 (libc in zig), I suspect that the best strategy here may be to ignore the .h file generated by zig and instead manually create (or copy from musl) the .h files, and then have zig provide the implementation only. I believe the libc spec requires specifically named headers and certain kinds of macros and other one-off requirements, to the point where I would not try to wrangle zig's auto .h file generation for this particular use case.

@tiehuis
Copy link
Member Author

tiehuis commented Mar 27, 2018

Good points. The conversion of c_char to u8 seems like a good fit in general and is a good solution for now I think. I'll modify this PR to provide that behavior.

Partly inspired by the libc work but that was always going to need separate headers anyway due to the many macro requirements so no issues there.

@tiehuis tiehuis mentioned this pull request Mar 31, 2018
@tiehuis
Copy link
Member Author

tiehuis commented Mar 31, 2018

Closing for now since it isn't immediately pressing and if we want c_char as a u8 alias except for the header generation it seems to require a few changes to how we deal with the type entries for u8 everywhere.

Refer to #875.

@tiehuis tiehuis closed this Mar 31, 2018
@tiehuis tiehuis deleted the c_char branch March 31, 2018 00:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants