Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Mbool memory chunk to improve compilation of computations at type _Bool #513

Merged
merged 7 commits into from
Jun 16, 2024

Conversation

xavierleroy
Copy link
Contributor

This PR introduces a new kind of memory chunk, Mbool, to be used when compiling loads and stores of values of C type _Bool.

The semantics of a memory load with chunk Mbool is as follows:

  • Read 8 bits as an integer n (all known ABIs mandates that _Bool has size 1 byte)
  • Return 0 if n = 0, 1 if n = 1, and Vundef otherwise.

With this semantics, the result of the load is guaranteed to be a correct value of type _Bool, i.e. 0, 1 or Vundef. This is not the case in the current CompCert, where _Bool values are read using the Mint8unsigned chunk, resulting in any integer in the 0..255 range, or Vundef.

Consequently, the value analysis of Boolean computations is more precise, enabling nice optimizations, as in the following examples.

_Bool f(_Bool * p) { return *p; }

void g(_Bool * p) { *p ^= 1; }

_Bool h(_Bool * p, _Bool * q) { return *p & *q; }

With this PR, all three examples are compiled without any normalization to _Bool.

The semantics of Mbool loads is carefully engineered so that it can be implemented in the final assembly code as a 8-bit unsigned load Mint8unsigned, producing a "more defined than" result.

What's in this PR?

  • The Mbool memory chunk is added, with the semantics outlined above.
  • Value analysis and other back-end passes are adapted to deal with Mbool.
  • The Stacking pass transforms Mbool loads and stores into Mint8unsigned loads and stores, after all optimizations have been done.
  • The C front-end generates Mbool loads and stores to access l-values of _Bool type.
  • The type soundness result in Ctyping is strengthened; now, it shows that the only values of type _Bool are 0, 1 and Vundef.

In passing, the PR refactors some code from the various Asmgen/Asmgenproof files into the shared Stacking/Stackingproof files. This was not absolutely necessary, just a good opportunity to do so.

xavierleroy and others added 7 commits May 22, 2024 19:11
It's a 8-bit quantity that is normalized to 0, 1 or Vundef at load time.

This enables a more precise value analysis of loads from pointers to _Boo,
and the corresponding optimizations of casts to _Bool types.

Not yet generated by the C front-end, but correctly treated throughout
the back-end.
Some memory chunks have no corresponding processor instruction
(e.g. store Mint8signed).
Previously, they were replaced by other, equivalent chunks
(e.g. store Mint8unsigned) during the Asmgen pass, in target-specific code,
causing some duplication of proofs.

This commit replaces chunks by equivalent chunks supported by all target
processors in the target-independent pass Stacking, so as to factor out
the corresponding proofs.

Target-specific Asmgen and Asmgenproof* files are simplified accordingly.

Code emitters: simplify loads, stores
Thus enabling more optimizations in the back-end.
So that builtins and external functions with return type `_Bool` can
be described more precisely.

Values returned by a function of return type `_Bool` are still normalized
as 8-bit integers, because that's what the x86 and AArch64 ABI say
(the return value is an 8-bit integer that is equal to 0 or to 1).
Now, a value of type _Bool is 0, 1, or Vundef.
Also: introduce and use Val.norm_bool_cases.
@xavierleroy
Copy link
Contributor Author

For the record: this change renders undefined some CompCert C programs that currently have defined behavior according to the CompCert C semantics, but are undefined behavior in ISO C. Consider:

_Bool char2bool (unsigned char c) { return *((_Bool *) &c); }
int main() { return char2bool(42); }

Currently, this is defined to return 1. (The function char2bool is equivalent to a c != 0 test.)

With the change in this PR, the load of the _Bool is undefined, since the byte read from memory is neither 0 nor 1.

char2bool(0) and char2bool(1) have well-defined semantics, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants