-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CHERI-valid pointer stuffing produces worse codegen than an implementation which does huge wrapping offsets #96152
Comments
Yes with_addr should be a compiler intrinsic so that it can just Do The Right Thing and get properly optimized. I note as much in core::ptr, but, I am not a Compiler Person. |
We're missing this fold: https://alive2.llvm.org/ce/z/eKRvKm This is missed both on the InstCombine and the DAGCombine level. |
https://reviews.llvm.org/D124763 for the InstCombine fold (which does not actually help here, this needs to happen on DAGCombine level). |
https://reviews.llvm.org/D124772 is the DAGCombine fold -- which also doesn't actually help here, because demanded bits simplification interfered and reduces the width of the xor. |
https://reviews.llvm.org/D124710 finishes the InstCombine side of this, and https://reviews.llvm.org/D124856 is another step on the DAGCombine side -- this one affects the original case, but doesn't give the result we want yet. |
https://reviews.llvm.org/D124930 would actually fix the original case. |
@nikic that's amazing. :-) |
Yes, everything has landed here, so this should be fixed with the next LLVM upgrade. |
Which is the version of LLVM that will contain these patches? Looks like that would be LLVM 15? |
The codegen at the godbolt link in the top-level comment is unchanged, it's still this: example::old_style:
mov rax, rdi
or rax, 1
ret
example::cheri_compat:
mov eax, edi
not eax
and eax, 1
add rax, rdi
ret
example::fast:
mov rax, rdi
or rax, 1
ret |
Godbolt doesn't seem to have the update yet
|
Ah! Looks like the playground is probably up to date? Its short links don't capture the rest of the state, so here's the disassembly it spits out for the example code: playground::old_style: # @playground::old_style
# %bb.0:
movq %rdi, %rax
orq $1, %rax
retq
# -- End function
playground::cheri_compat: # @playground::cheri_compat
# %bb.0:
movq %rdi, %rax
orq $1, %rax
retq
# -- End function
playground::fast: # @playground::fast
# %bb.0:
movq %rdi, %rax
orq $1, %rax
retq
# -- End function So I think we're good here! Thanks! |
Is it worth adding a codegen test for this? |
Possibly. But we have a small problem maybe? All three of those compile to different IR: define nonnull ptr @_ZN7example9old_style17h5807fc970566528bE(ptr %a) unnamed_addr #0 {
%_3 = ptrtoint ptr %a to i64
%_2 = or i64 %_3, 1
%0 = inttoptr i64 %_2 to ptr
ret ptr %0
}
define ptr @_ZN7example12cheri_compat17hf4c9a96c91413294E(ptr %a) unnamed_addr #0 {
%old = ptrtoint ptr %a to i64
%old.not = and i64 %old, 1
%diff = xor i64 %old.not, 1
%0 = getelementptr i8, ptr %a, i64 %diff
ret ptr %0
}
define ptr @_ZN7example4fast17hfe789c747db55c0eE(ptr %a) unnamed_addr #0 {
%old = ptrtoint ptr %a to i64
%new = or i64 %old, 1
%count = sub i64 0, %old
%0 = getelementptr i8, ptr %a, i64 %count
%1 = getelementptr i8, ptr %0, i64 %new
ret ptr %1
} So I could write a codegen test that checks that they compile to what they do right now, but I feel like it would be valid and good for LLVM to produce the same IR for all 3, and it would be a bummer if the codegen test broke when it does. If there's a way around this, I'm happy to PR the codegen test. |
The codegen test in this case would be an x86 assembly codegen test. The LLVM IR cannot be the same, because all of those have different semantics (in terms of provenance). |
The last two should be identical in terms of provenance? |
…r=nikic Add a codegen test for rust-lang#96152 This is a regression test for rust-lang#96152, it is intended to check that our codegen for a particular strict provenance pattern is always as good as the ptr2int2ptr/provenance-ignoring style. r? `@nikic`
I came across this here: tokio-rs/bytes#542 so I'm raising it more broadly because this seems like a portability hazard. godbolt demo
In a world where we can forget about provenance, one could set the lowest bit in a pointer with this:
But of course we want to have a provenance model, including because we want to support architectures where pointer provenance is checked at runtime. So one might want to implement this function like so to be compatible with CHERI:
But that version is slower. Instead of just
mov + or
, it gets compiled tomov + not + and + and
. Which is very silly. We can get the original codegen back by writing this in a style which is almost certainly invalid on CHERI:It doesn't make sense to me that users should have to choose between compatibility with CHERI and avoiding ptr-int-ptr casts while keeping a careful eye out for codegen regressions.
The text was updated successfully, but these errors were encountered: