Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce C99 inttypes usage for standard C functions #46032

Open
3 tasks
stephanosio opened this issue May 26, 2022 · 52 comments
Open
3 tasks

Enforce C99 inttypes usage for standard C functions #46032

stephanosio opened this issue May 26, 2022 · 52 comments
Assignees
Labels
area: C Library C Standard Library area: Portability Standard compliant code, toolchain abstraction area: Toolchains Toolchains Enhancement Changes/Updates/Additions to existing features RFC Request For Comments: want input from the community
Milestone

Comments

@stephanosio
Copy link
Member

stephanosio commented May 26, 2022

Is your enhancement proposal related to a problem? Please describe.

Zephyr currently overrides the toolchain default C99 integer types ([u]intN_t) to fixed types such that int32_t can always be processed without a size modifier, and int64_t with ll size modifier.

This "hack," introduced in #16645, has caused various toolchain compatibility issues (e.g. #37712 (comment)) and will not be sustainable as we start adding more non-GNU toolchains as supported toolchains for Zephyr -- note that __INT32_TYPE__ and its friends defined by zephyr_stdint.h are non-standard and the toolchain default C library is not obliged to set up the type system using these.

Describe the solution you'd like

Stop re-defining the toolchain default types (i.e. remove zephyr_stdint.h) and always use the C99 inttypes macros with the C99 integer types; e.g.:

int32_t val;

printf("val = %d", val);        /* Incorrect */
printf("val = %" PRId32, val);  /* Correct */
int64_t val;

printf("val = %lld", val);      /* Incorrect */
printf("val = %" PRId64, val);  /* Correct */

For this, the following changes need to be implemented:

  • Rework the minimal libc inttypes.h to not assume the fixed types defined by zephyr_stdint.h; the size modifiers need to be resolved based on the native toolchain types.
  • Fix all discrete size modifier used with the C99 integer types to use the C99 inttypes macros.
  • Remove zephyr_stdint.h.

Additional context

It may be useful to go over the following related issues:

@stephanosio stephanosio added Enhancement Changes/Updates/Additions to existing features area: C Library C Standard Library labels May 26, 2022
@stephanosio stephanosio added this to the v3.2.0 milestone May 26, 2022
@stephanosio stephanosio self-assigned this May 26, 2022
@stephanosio
Copy link
Member Author

stephanosio commented May 26, 2022

stephanosio added a commit to stephanosio/zephyr that referenced this issue May 26, 2022
This commit adds the missing `PRIx{FAST,LEAST}N` C99 integer type
format macros that correspond to the C99 integer types overridden in
the `zephyr_stdint.h` header:

  PRIdFAST8, PRIdFAST16, PRIdFAST32, PRIdFAST64
  PRIdLEAST8, PRIdLEAST16, PRIdLEAST32, PRIdLEAST64

  PRIiFAST8, PRIiFAST16, PRIiFAST32, PRIiFAST64
  PRIiLEAST8, PRIiLEAST16, PRIiLEAST32, PRIiLEAST64

  PRIoFAST8, PRIoFAST16, PRIoFAST32, PRIoFAST64
  PRIoLEAST8, PRIoLEAST16, PRIoLEAST32, PRIoLEAST64

  PRIuFAST8, PRIuFAST16, PRIuFAST32, PRIuFAST64
  PRIuLEAST8, PRIuLEAST16, PRIuLEAST32, PRIuLEAST64

  PRIxFAST8, PRIxFAST16, PRIxFAST32, PRIxFAST64
  PRIxLEAST8, PRIxLEAST16, PRIxLEAST32, PRIxLEAST64

  PRIXFAST8, PRIXFAST16, PRIXFAST32, PRIXFAST64
  PRIXLEAST8, PRIXLEAST16, PRIXLEAST32, PRIXLEAST64

Note that these macros will eventually need to be defined according to
the toolchain-specified types when the `zephyr_stdint.h` hack is
removed in the future; refer to the the GitHub issue zephyrproject-rtos#46032 for more
details.

Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
@andyross
Copy link
Contributor

Just to record for posterity: I hate the ISO PRI* macro nonsense with a fiery passion. It's ridiculous-looking and hard to read, it takes a perfectly working API with perfectly working error reporting and introduces line noise syntax not for correctness[1], but for portability. And that's just the Wrong Thing. Let code manage the build problems that result, if there are platform-dependent sizes in printf format strings then it's the programmer's job to make sure they aren't used[2].

I won't -1 this. I won't stand in the way of a merge. But this is just awful. We're now committing to explaining to generations of new developers why Zephyr's printf and logging layers are awful. And they're all going to ask "Why doesn't Linux have this problem?!". What's our answer going to be?

[1] There's no typesafety problem with passing e.g. a "int64_t" to a "%ld" on arm64, for example. The compiler knows the types, it knows how to pass a 64 bit argument, it knows how big a "long" is, and it can trivially (and does!) emit a printf format error on i386 builds. Why is that a problem? If the user wrote non-portable code they should fix it. They don't need new (and bad!) syntax in the language to help with that.

[2] e.g. By using %lld in the first place to ensure the full 64 bit value is printed, or by deliberately downconverting a value to 32 bits, etc... Those decisions are application decisions! The printf library can't make them for us, and it shouldn't try.

carlescufi pushed a commit that referenced this issue May 26, 2022
This commit adds the missing `PRIx{FAST,LEAST}N` C99 integer type
format macros that correspond to the C99 integer types overridden in
the `zephyr_stdint.h` header:

  PRIdFAST8, PRIdFAST16, PRIdFAST32, PRIdFAST64
  PRIdLEAST8, PRIdLEAST16, PRIdLEAST32, PRIdLEAST64

  PRIiFAST8, PRIiFAST16, PRIiFAST32, PRIiFAST64
  PRIiLEAST8, PRIiLEAST16, PRIiLEAST32, PRIiLEAST64

  PRIoFAST8, PRIoFAST16, PRIoFAST32, PRIoFAST64
  PRIoLEAST8, PRIoLEAST16, PRIoLEAST32, PRIoLEAST64

  PRIuFAST8, PRIuFAST16, PRIuFAST32, PRIuFAST64
  PRIuLEAST8, PRIuLEAST16, PRIuLEAST32, PRIuLEAST64

  PRIxFAST8, PRIxFAST16, PRIxFAST32, PRIxFAST64
  PRIxLEAST8, PRIxLEAST16, PRIxLEAST32, PRIxLEAST64

  PRIXFAST8, PRIXFAST16, PRIXFAST32, PRIXFAST64
  PRIXLEAST8, PRIXLEAST16, PRIXLEAST32, PRIXLEAST64

Note that these macros will eventually need to be defined according to
the toolchain-specified types when the `zephyr_stdint.h` hack is
removed in the future; refer to the the GitHub issue #46032 for more
details.

Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
@stephanosio
Copy link
Member Author

stephanosio commented May 26, 2022

While I am not a very big fan of this either, this is the only way to be portable across different compilers and we have no option but to do this, especially as we start introducing more non-GNU compiler support to the Zephyr.

I believe this is an inevitable decision for us.

p.s. It is as simple as, if you want to use C99 integer types (int32_t, ...), you will have to use the corresponding C99 integer type format specifier macros (PRId32, ...) too.

@andyross
Copy link
Contributor

"But why doesn't Linux need this?" :)

@npitre
Copy link
Collaborator

npitre commented May 26, 2022 via email

@stephanosio
Copy link
Member Author

"But why doesn't Linux need this?" :)

Because they use their own type system? (u32, ...)

We can go back to doing that again; but, as long as we are using the C99 integer types, we will have to use the corresponding C99 integer type format specifiers too.

@npitre
Copy link
Collaborator

npitre commented May 26, 2022 via email

@andyross
Copy link
Contributor

Or we can commit a silly hack that we've been living with just fine. I mean, I get that clang picked some different defaults (or whatever, I'm no expert on the linked bug), but we already have a toolchain abstraction where we could synchronize this.

It's far simpler for us to have a layer that manages some platform-specific typedefs than it is to inflict this weird non-C/non-printf alien syntax on app developers.

That's the core of the problem here. This PRI nonsense just "isn't C code". It's a mistake. The printf() API has a four decade tradition of hackers (some of them even older than me!) who understand it in their bones. It's not the only way to format strings, but it's the C way.

Or consider it this way: if we really did want to change our string formatting syntax, why on earth would we pick THIS one? Why not move to something that looked more like Rust std::fmt, for example, which I think is quite nice personally? I mean, your answer is going to be "because that's rust and not C", right? Well... this isn't C either. No one uses these things in practical code, they're a wart to paper over a mistake in the standard.

@stephanosio
Copy link
Member Author

The basic types to which they're mapped is

Yes, and that is allowed by the C standard -- that is the real problem. As long as that is allowed, we will have to use the proper abstraction provided by the standard.

Make int32_t onto an int and int64_t onto a long long at the
toolchain level and you'll be fine. That's trivial to accomplish there.

For the GNU toolchains and maybe Clang, yes; for others like Arm Compiler, IAR, and whatever we add support for in the future, likely not.

@stephanosio
Copy link
Member Author

Or we can commit a silly hack that we've been living with just fine.

I have been saying we cannot do that as I noted above "__INT32_TYPE__ and its friends defined by zephyr_stdint.h are non-standard and the toolchain default C library is not obliged to set up the type system using these."

It's far simpler for us to have a layer that manages some platform-specific typedefs

For GCC, yes; for others, not really.

if we really did want to change our string formatting syntax, why on earth would we pick THIS one?

Because it is a part of the C99 standard. As I said above, "it is as simple as, if you want to use C99 integer types (int32_t, ...), you will have to use the corresponding C99 integer type format specifier macros (PRId32, ...) too."

@stephanosio
Copy link
Member Author

If you are really unhappy with this, we can consider going back to the days of u32_t and its friends, but we have had plenty of problems with that too.

I will emphasise again: "it is as simple as, if you want to use C99 integer types (int32_t, ...), you will have to use the corresponding C99 integer type format specifier macros (PRId32, ...) too."

@andyross
Copy link
Contributor

I have been saying we cannot do that as I noted above "INT32_TYPE and its friends defined by zephyr_stdint.h are non-standard and the toolchain default C library is not obliged to set up the type system using these."

Right, so just make our headers produce the same types as the toolchain/platform expects (again, we already have a framework for doing this). That fixes the build errors, and leaves the portability issue as a task for the developer, which is where it should be anyway.

@npitre
Copy link
Collaborator

npitre commented May 26, 2022 via email

@stephanosio
Copy link
Member Author

Right, so just make our headers produce the same types as the toolchain/platform expects

That may not be feasible for all compilers. Some compilers may not even have something like __INT32_TYPE__ which we can use to override the compiler-default C library-defined C99 fixed-width types.

Do we really have to support all compilers under the sun?

Maybe not "all compilers," but we are committed to supporting multiple toolchains.

If so, shouldn't we try to ask them to create a "Zephyr" mode first?

Ideally, yes; but, not everything is ideal in real life ...

@stephanosio stephanosio added area: Portability Standard compliant code, toolchain abstraction area: Toolchains Toolchains labels May 26, 2022
@npitre
Copy link
Collaborator

npitre commented May 26, 2022 via email

@stephanosio
Copy link
Member Author

As a side note, if we, as project, want to introduce a non-standard string formatting syntax (and that is probably warranted, if I may add), we can; but, that is a whole another issue and does not change the fact that the use of standard C functions need to comply with the C standard.

@stephanosio
Copy link
Member Author

Having int32_t sometimes map to an int, sometimes to a long is a
complete abomination.

That is indeed an abomination. But for int64_t, it may very well be long or long long depending on whom you ask.

@npitre
Copy link
Collaborator

npitre commented May 26, 2022 via email

@stephanosio
Copy link
Member Author

It is possible for the toolchain to be stricter in its type association and still be within the C standard.

I am inclined to argue that something like below is not C99-compliant and is indeed wrong:

int32_t val;

printf("val = %d", val);        /* Incorrect */

int32_t is only meant to be printed out using the corresponding C99 format specifier macro, which is PRI{d,i,o}32 and that is how it ought to be in order to be fully portable.

At least, for printf, which is a standard C function, I strongly believe that is how it needs to be handled.

For other similar non-standard/Zephyr-specific functions like printk and LOG_*, we could potentially introduce custom format specifiers like %I32d or somewhere along that line to avoid the "PRIxN abominations."

@keith-packard
Copy link
Collaborator

(I don't think anyone likes the C99 format specifiers; I wish they had just added more size prefixes to the printf spec instead)

Does zephyr support any platforms for which

  • sizeof(int) != 4
  • sizeof(long long) != 8

If not, then we could use %d for int32_t and %lld for int64_t. Unfortunately, this can still generate warnings on platforms for which sizeof(long) == 4 and int32_t is a typedef for long (sigh), but you could stick a cast to (int) for int32_t in the printf call. Not great.

It would be good to evaluate where we could be using size_t and ptrdiff_t instead of explicit bit sizes as those can be represented with %zd and %td in printf fomat strings.

Alternatively, cast everything to long long and use %lld everywhere. Easier to read than "%"PRId32"\n". That does depend on printf supporting 64-bit integers (which is optional on 32-bit platforms).

One significant benefit for using the right format specifiers is that we can use the existing code in gcc and clang which evaluate printf format strings and make sure the arguments are of the right type. I think this is worth a small amount of pain

@cfriedt
Copy link
Member

cfriedt commented May 30, 2022

I would +1 removing zephyr_stdint.h in favour of that being the toolchain's responsibility.

Not crazy about using %" PRId32 " instead of %d everywhere though just to print an int. Also would prefer to not always need to specify int32_t rather than int.

I find PRIu64, PRIx64, and PRId64 mainly helpful as a common way to print 64-bit values on both 32 and 64-bit machines.

laxiLang pushed a commit to laxiLang/zephyr that referenced this issue May 30, 2022
This commit adds the missing `PRIx{FAST,LEAST}N` C99 integer type
format macros that correspond to the C99 integer types overridden in
the `zephyr_stdint.h` header:

  PRIdFAST8, PRIdFAST16, PRIdFAST32, PRIdFAST64
  PRIdLEAST8, PRIdLEAST16, PRIdLEAST32, PRIdLEAST64

  PRIiFAST8, PRIiFAST16, PRIiFAST32, PRIiFAST64
  PRIiLEAST8, PRIiLEAST16, PRIiLEAST32, PRIiLEAST64

  PRIoFAST8, PRIoFAST16, PRIoFAST32, PRIoFAST64
  PRIoLEAST8, PRIoLEAST16, PRIoLEAST32, PRIoLEAST64

  PRIuFAST8, PRIuFAST16, PRIuFAST32, PRIuFAST64
  PRIuLEAST8, PRIuLEAST16, PRIuLEAST32, PRIuLEAST64

  PRIxFAST8, PRIxFAST16, PRIxFAST32, PRIxFAST64
  PRIxLEAST8, PRIxLEAST16, PRIxLEAST32, PRIxLEAST64

  PRIXFAST8, PRIXFAST16, PRIXFAST32, PRIXFAST64
  PRIXLEAST8, PRIXLEAST16, PRIXLEAST32, PRIXLEAST64

Note that these macros will eventually need to be defined according to
the toolchain-specified types when the `zephyr_stdint.h` hack is
removed in the future; refer to the the GitHub issue zephyrproject-rtos#46032 for more
details.

Signed-off-by: Stephanos Ioannidis <root@stephanos.io>
@stephanosio stephanosio added the RFC Request For Comments: want input from the community label Jul 13, 2022
@stephanosio stephanosio modified the milestones: v3.2.0, future Sep 12, 2022
@stephanosio stephanosio moved this to LTS3 in Release Plan Oct 4, 2022
@npitre
Copy link
Collaborator

npitre commented Oct 11, 2022 via email

@stephanosio
Copy link
Member Author

#46708 (comment)

TL;DR: If you want to use int32_t and its friends in the printf, use PRId32 and its friends too. Do not use int32_t in the printf and drop the accompanying PRId32 because that is going to create a big mess.

I too agree with you that the PRIxN macros in the printf are horrible and we do not want to live with this wart in the Zephyr codebase. So let us drop it all. Let us switch over to using the printk and the Zephyr-native cbprintf API with the alternative C99 fixed-width integer type format specifiers like %I32d for int32_t as suggested in #46032 (comment).

@keith-packard
Copy link
Collaborator

I too agree with you that the PRIxN macros in the printf are horrible and we do not want to live with this wart in the Zephyr codebase. So let us drop it all. Let us switch over to using the printk and the Zephyr-native cbprintf API with the alternative C99 fixed-width integer type format specifiers like %I32d for int32_t as suggested in #46032 (comment).

I think I slightly disagree with this. The (significant) benefit of using printf is that the C compiler will typecheck arguments against the format strings. Any printf-like-function which doesn't match the printf formatting conventions cannot use the compiler's printf typechecking, so you lose that benefit. Even if you call it printk, or cbprintf, I'd still encourage you to consider the checking benefits vs the convenience of simpler usage.

If you don't like PRId32 (and, frankly, who does), you could use %ld and insert a (long) cast for the argument. On 32-bit systems using ILP32 and 64-bit systems using LP64, this is zero cost -- long is the native machine word size in both cases.

Using the standards also lowers the barrier to entry for new Zephyr developers; they can leverage knowledge gained in other environments without having to constantly second-guess where Zephyr differs. There are other benefits, like being able to validate the implementation by using existing test suites and using printf to implement printk and cbprintf (which Zephyr already does when using picolibc).

@carlescufi carlescufi moved this from LTS3 Goals to v3.4 in Release Plan Oct 11, 2022
@jgl-meta
Copy link
Collaborator

@stephanosio What is the status of this? Will this make it into the 3.4 release?

@keith-packard
Copy link
Collaborator

Looking forward to c23, we should expect to have %w format specifiers in the standard libraries at some point soon, which means we can use %w32d instead of PRId32.

@marc-hb
Copy link
Collaborator

marc-hb commented Apr 26, 2023

Why do we have to bow to them?

Zephyr is becoming the most important RTOS project out there. That
should be sufficient incentive for those other compilers to create a
config with sane defaults for Zephyr.

This does not seem realistic. Some of Linus Torvalds' complaints about gcc have fallen on deaf ears. Kernels are a small "market" for compilers.

Also, Linux supports just one toolchain (+ clang, a... gcc emulator on top of LLVM) whereas Zephyr bends over backwards to support unknown and proprietary toolchains thanks to an elaborate CMake indirection layer. Zephyr also has a licensing model welcoming closed source libraries and even a tool for binary blobs. Expecting to rally all these in the crusade against PRIu64 is not realistic.

In any case, let's not over-engineer things unless necessary.

Everyone agrees PRIu64 is incredibly ugly and a huge mistake. Probably for backwards compatibility, always a huge problem with C/C++. But why is it necessary to (over)engineer a solution that avoids PRIu64 at all costs? Code style and readability matter, a lot. But "necessary" means something different.

If there is a simple and reliable solution that avoids PRIu64 for C and C++ across a reasonable range of unmodified toolchains then perfect; "send patches!" However @stephanosio's experience and expertise + the length of this thread and of a good few other before make me feel very sceptical. Speaking of patches, does Zephyr already have some build-time test case(s) for this?

Looking forward to c23, we should expect to have %w format specifiers in the standard libraries at some point soon, which means we can use %w32d instead of PRId32.

Great news! So the compilers "bowed" after all. Now we can get rid of PRIu64 around 2030 ;-)

@stephanosio
Copy link
Member Author

@stephanosio What is the status of this? Will this make it into the 3.4 release?

@jgl-meta This is still under discussion and is unlikely to make into the 3.4 release.

Looking forward to c23, we should expect to have %w format specifiers in the standard libraries at some point soon, which means we can use %w32d instead of PRId32.

@keith-packard That is a great news; but, using those format specifiers will require us to set the minimum required C standard version for Zephyr to C23, which I do not think is going to happen in the foreseeable future.

Great news! So the compilers "bowed" after all. Now we can get rid of PRIu64 around 2030 ;-)

@marc-hb Given that we are still debating whether it is reasonable to require C11 in 2023, maybe 2040 is more realistic :)

@stephanosio
Copy link
Member Author

Another option could be to add the C23 %w format specifier support to our printk and cbprintf libraries.

As long as we use the %w format specifier with only printk and cbprintf (i.e. not in the C standard library functions such as printf), we do not need set the minimum required C standard version to C23.

This will effectively allow us to not use the wart that PRIxN is in the Zephyr codebase.

@cfriedt
Copy link
Member

cfriedt commented Apr 27, 2023

@marc-hb Given that we are still debating whether it is reasonable to require C11 in 2023, maybe 2040 is more realistic :)

🥲

Could rust help here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: C Library C Standard Library area: Portability Standard compliant code, toolchain abstraction area: Toolchains Toolchains Enhancement Changes/Updates/Additions to existing features RFC Request For Comments: want input from the community
Projects
Status: No status
Status: Future
Development

No branches or pull requests

8 participants