-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cranelift: formalize, document, and audit binary representation of booleans #3205
Comments
cc @abrown @afonso360 @bjorn3 @alexcrichton @akirilov-arm for recent folks who've discussed this topic. |
I'd like to propose the following for loads and stores:
The last is probably the most controversial; disallowing loads and stores of bools avoids the semantic question, which may have been why this was done (@sunfishcode comments?). But we already allow bitcasts, so we already have some notion of the bits underlying a bool, so IMHO it's better to allow loading/storing those bits. An alternative is to canonicalize a boolean when loading. For a Then for bitcasts:
All of the above should fully pin down how we store the values in registers as well: we (i) have a well-defined integer interpretation of the bool, (ii) always store it in canonicalized form (all zeroes or all ones in defined bits), and (iii) keep to the same "upper bits undefined" invariant as for integers. With one exception: for a Thoughts? |
Add agenda item to 2021-08-23 Cranelift meeting for #3205.
I've just added this to the agenda for Monday's public Cranelift biweekly meeting as well. |
I think a side effect of going doing the path above would be that |
That's a good point; the distinction between the two is pretty confusing, so if we could remove |
Why not have |
So my thought process is: Pragmatic reasons: |
I should probably add another possibility too: we could define all of The main reason for the all-ones representation, to my understanding, was to be able to use e.g. a This probably needs a sanity check from a SIMD perspective though -- @abrown / @jlb6740 / @akirilov-arm, thoughts? |
Ah, here is one example of a dependence on all-ones in cg_clif: link (linked by @afonso360 in #3003) |
I would mention that the all-ones/all-zeroes representation for SIMD is not just because using the output of a comparison as a mask is convenient, it's also that the Wasm SIMD spec specifies that representation (but obviously without the boolean type) and that machines (x64, aarch64) represent things this way as well. |
@abrown this has me thinking further: the existing Wasm-SIMD implementation (lowering in So one fundamental question I have is: what is the use-case for boolean types wider than 1 bit? If for an actual SIMD use-case we represent the mask as an In other words, what we decide here doesn't impact Wasm-SIMD, because Wasm-SIMD doesn't depend on CLIF-level bool types; so it's better to base on decision on whatever makes the most semantic sense and leads to the least confusion, IMHO. If Argued from another perspective: If we say that booleans (abstractly) have two possible values, then storing such a value in anything wider than 1 bit grants us the freedom to choose the bit representation for each of those two possible values. In other words, So I'm starting to convince myself a bit more than the |
Rust stores bools as 0 or 1. Storing b1 as all zeros or all ones would require an extra instruction in the common case to convert it to 0 or 1 when storing it. The larger boolean types are pretty much only useful for vector types where it is common for masks to have all zeros or all ones as value. |
@cfallin I am a bit confused about your comments on
The all ones case is actually a single AArch64 instruction as well, |
@akirilov-arm indeed, that was more or less my point as well -- in the past it has been described as all-ones "because SIMD" and e.g. see the |
Here's a link to the 2021-08-23 meeting notes where this was discussed originally. |
There has been discussion recently about whether or not we need the boolean types at all, and we seem to be converging on the opinion that we should remove them rather than handle the issues surrounding load/store with I'd like to propose the following semantics for instructions that return boolean values now:
|
Thanks @elliottt -- I think the above is largely a good plan and will result in nice simplifications and removal of awkward special cases!
For this case specifically I think the semantics would be for |
I agree. IIRC, there may be an optimization or two that rely on the boolean type but given the move towards e-graphs those optimizations will need to be re-worked anyway. |
Remove the boolean types from cranelift, and the associated instructions breduce, bextend, bconst, and bint. Standardize on using 1/0 for the return value from instructions that produce scalar boolean results, and -1/0 for boolean vector elements. Fixes #3205 Co-authored-by: Afonso Bordado <afonso360@users.noreply.github.com> Co-authored-by: Ulrich Weigand <ulrich.weigand@de.ibm.com> Co-authored-by: Chris Fallin <chris@cfallin.org>
In #3202 and a number of past issues as well, we've repeatedly encountered some uncertainty as to (i) how boolean values of varying widths are stored in registers, (ii) whether they can be loaded/stored or not and what representation they take in memory (see e.g. #3102 where we addressed this for trampoline args), and (iii) how this interacts with boolean<->integer conversion ops like
bint
andbmask
.The understanding we seem to generally agree on is that booleans of width > 1 bit have all bits set (true) or all bits cleared (false). And a
b1
is stored in a register with the same invariant as other integer values: the upper bits (in this case everything above the LSB) are undefined.However, this is far from being confusion-free and unambiguous; thus, it seems reasonable to (i) agree definitively on how bools are represented in registers and memory, and (ii) audit our compliance to whatever invariants we decide on.
The text was updated successfully, but these errors were encountered: