Result<u32, u32> uses less efficient ABI than Result<i32, u32> #100698
Comments
There's a very thorny ABI problem of sign-extension that we sadly had to respect: on many platforms, passing e.g.
// C
int8_t halve_i8(int8_t x) { return x / 2; }
// Rust
halve_i8(-2 /* 0xfe */)
That may still return However,
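To make the sign-extension concern concrete, here is a minimal sketch in plain Rust. It only models the arithmetic: the helper names (`sign_extend`, `zero_extend`, `halve_wide`) are hypothetical, and the premise that a callee reads the whole 32-bit register is an assumption made for illustration, not a statement about any particular platform's ABI.

```rust
/// Widen a raw byte the way a signed 8-bit argument would be widened.
fn sign_extend(byte: u8) -> u32 {
    byte as i8 as i32 as u32
}

/// Widen a raw byte without replicating the sign bit.
fn zero_extend(byte: u8) -> u32 {
    byte as u32
}

/// Hypothetical callee that trusts all 32 bits of its "register".
fn halve_wide(reg: u32) -> i32 {
    (reg as i32) / 2
}

fn main() {
    let raw = 0xfe_u8; // bit pattern of -2 as an i8

    // Sign-extended, the register holds -2 and halving gives -1.
    println!("sign-extended: {:#010x} -> {}", sign_extend(raw), halve_wide(sign_extend(raw)));

    // Zero-extended, the register holds 254 and halving gives 127.
    println!("zero-extended: {:#010x} -> {}", zero_extend(raw), halve_wide(zero_extend(raw)));
}
```

The two widenings disagree for the same byte, which is why the caller and callee have to agree on which one is used.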
As for the issue itself: You'll also see this if you try to return With AFAIK this is pretty much always good when dealing with a value inside a function: one SSA component per field is what LLVM tried to get anyway (e.g. via SROA), we're just getting ahead of both us and it wasting time. For Rust's call ABI/"calling convention" (all the FFI ones ignore If both So do we want fewer registers, or less bitwise/memory juggling to encode/decode the components? I'm not sure, not even of how to test it - we could obviously add a quick check and see if For the record, this is a demonstration of the status quo on godbolt: https://godbolt.org/z/d4f9Pqexn.
cc @rust-lang/wg-llvm @workingjubilee
What would the effect of keeping ScalarPair for fat pointers (as a lot of codegen code depends on this) and maybe structs with two register sized fields, but using Aggregate for everything else be?
Hmm, I suppose assumptions about whether something will inline is probably relevant too. I'd guess that more registers is probably the better way if it's inlined, since I presume it's easier for LLVM to understand what's happening if it doesn't need to undo the bit packing first.
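As a rough illustration of the "bit packing" being weighed here, the sketch below crams two 32-bit components into one 64-bit integer and splits them apart again. The function names and the choice of putting the first component in the low half are assumptions for the example; they are not taken from rustc.

```rust
/// Pack two 32-bit components into one 64-bit value
/// (first component in the low half, an arbitrary choice for this sketch).
fn pack(first: u32, second: u32) -> u64 {
    (first as u64) | ((second as u64) << 32)
}

/// Undo `pack`: recover both 32-bit components.
fn unpack(bits: u64) -> (u32, u32) {
    (bits as u32, (bits >> 32) as u32)
}

fn main() {
    let bits = pack(1, 0xdead_beef);
    println!("packed:   {:#018x}", bits);
    println!("unpacked: {:?}", unpack(bits));
}
```

One register is used instead of two, at the cost of a shift and an OR on one side and a truncation and a shift on the other. That is the extra work an optimizer has to see through when the components are used separately.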
You also have to consider how much everything newtypes pointers, like I would personally be inclined to ignore this issue until anyone actually shows a downside from "more registers" (I'm aware of the theoretical ones but is it bad in practice?). However, we could try coalescing
Yeah, we should never do bitpacking intra-function for this reason, at most we can change the call ABI, but we want to go in the direction of more "exploding data types into their scalar leaves" (aka "SROA") intra-function - and inlining effectively "erases" the call ABI, making "inter" into "intra".
This is where a proper "ABI mapping" system would be very nice, something where you'd describe a function in the form best for inlining (e.g. w/ arguments/returns split into individual SSA values), and then the export of that function would be handled just like reification to a
pub static LINK_EXPORTS: MyLinkExports = MyLinkExports {
    // ...
    foo: foo as fn(_) -> _,
    // ...
};
And then the "reify function into function pointer" construct in the IR would contain the registers/stack mapping of the arguments/return passing, outside of the function. If you have to call the exported function in a way that can handle late-binding/interposition (which is mandated in C whenever you don't label a function as
And for anything private, the calls would be in the inlining-friendly form, which would also allow the backend to choose an optimal ABI even if it doesn't inline based on some global analysis (LLVM does do some stuff like this today, but it's much clunkier, since it "edits" the call ABI the frontend chose instead of synthesizing a new one).
Also, in such a system, you could fully annotate arguments/return value fields with e.g. ranges and other invariants (even if the ABI mapping needs to introduce indirection, cram two smaller values into an integer etc.).
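For readers unfamiliar with the "reification" syntax used in that snippet, here is a small compilable sketch of what it looks like in today's Rust. The body of `foo` and the single-field layout of `MyLinkExports` are made up for illustration; this shows only the surface syntax of turning a function item into a function pointer, not the hypothetical ABI mapping system itself.

```rust
// Hypothetical function to export; its body is an assumption for the example.
fn foo(x: u32) -> Result<u32, u32> {
    if x % 2 == 0 { Ok(x) } else { Err(x) }
}

// Hypothetical table of exports: one function pointer per exported symbol.
pub struct MyLinkExports {
    pub foo: fn(u32) -> Result<u32, u32>,
}

pub static LINK_EXPORTS: MyLinkExports = MyLinkExports {
    // The cast reifies the zero-sized function item `foo` into a real pointer;
    // the `_` placeholders are inferred from the field type.
    foo: foo as fn(_) -> _,
};

fn main() {
    // Calls through the table go via the pointer, i.e. through whatever
    // call ABI the pointer type implies.
    println!("{:?}", (LINK_EXPORTS.foo)(4));
    println!("{:?}", (LINK_EXPORTS.foo)(5));
}
```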
Ahh, thanks! (I was checking
I noticed this while writing a codegen test for #37939 (comment).
It seems that Result<u32, u32> and Result<i32, i32> are passed as two 32-bit numbers, while Result<i32, u32> and Result<u32, i32> are passed as a single 64-bit number.
Compiler: rustc 1.65.0-nightly (86c6ebee8 2022-08-16)
Code
LLVM IR
https://rust.godbolt.org/z/n91oWdjMh
I'm not sure if there's something stopping Result<u32, u32> from being coerced to a single 64-bit number (maybe some heuristic for auto-vectorization?), but this seemed a bit odd to me. It's also a bit surprising to me that Result<u32, u32> is represented as define { i32, i32 }, but I don't really know a lot about the Rust ABI.
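A minimal sketch of the kind of functions that could be used to compare the two cases follows. The original code lives behind the godbolt link above and is not reproduced here, so the function names and bodies below are assumptions; only the return types come from the report.

```rust
use std::mem::size_of;

// Both variants carry the same payload type.
#[inline(never)]
pub fn same_payload(x: u32) -> Result<u32, u32> {
    if x % 2 == 0 { Ok(x) } else { Err(x) }
}

// The variants carry payloads of different signedness.
#[inline(never)]
pub fn mixed_payload(x: i32) -> Result<i32, u32> {
    if x >= 0 { Ok(x) } else { Err(x.unsigned_abs()) }
}

fn main() {
    // Both types occupy the same number of bytes in memory; the report is
    // about how they are passed across calls, not about their size.
    println!("Result<u32, u32>: {} bytes", size_of::<Result<u32, u32>>());
    println!("Result<i32, u32>: {} bytes", size_of::<Result<i32, u32>>());
    println!("{:?} {:?}", same_payload(4), mixed_payload(-3));
}
```

Compiling something like this with `--emit=llvm-ir` (or on godbolt) and reading the `define` line of each function is one way to see which return form was chosen by a given compiler version.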