-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resurrect: rustc_target: Add alignment to indirectly-passed by-value types, correcting the alignment of byval on x86 in the process. #112157
Conversation
@bors r+ rollup=never |
📌 Commit ebb4f7ebf3e04be6992c57c48f31b850201e2dc7 has been approved by It is now in the queue for this repository. |
Nominating this as an FYI: This introduces a semantic difference between a struct with explicit alignment |
@bors r- |
f38be1c
to
f6a4828
Compare
@rustbot review |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 |
This comment has been minimized.
This comment has been minimized.
@bors r+ rollup=never |
📌 Commit 12112cf8d269d903e8bdafda730b9983cb4d489e has been approved by It is now in the queue for this repository. |
The job Click to see the possible cause of the failure (guessed by this bot)
|
@bors treeclosed=100 Try build got merged into master, everything is broken. |
@bors retry |
☀️ Test successful - checks-actions |
Finished benchmarking commit (7a17f57): comparison URL. Overall result: ❌✅ regressions and improvements - ACTION NEEDEDNext Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesThis benchmark run did not return any relevant results for this metric. Binary sizeResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 657.725s -> 657.257s (-0.07%) |
Visiting for perf-triage. The six primary regressions were all to variants of diesel (all of check/debug/opt) for variious full and incr-full scenarios. It isn't noise, there seems to be a clear cliff that starts with this PR when looking at the graph starting from 2023-0702. I'm not marking as triaged yet, but I was tempted to do so, because this PR is a prerequiste for unlocking various |
cachegrind diff of
I suspect this change allowed a memcpy to be optimized out somewhere, changing the size of a function enough to perturb inlining, and this happened to be a bad move. (I was actually expecting this to be caused by the cost of adding additional attributes, but that seems to be insignificant here.) |
Resurrect: rustc_target: Add alignment to indirectly-passed by-value types, correcting the alignment of byval on x86 in the process. Same as rust-lang#111551, which I [accidentally closed](rust-lang#111551 (comment)) :/ --- This resurrects PR rust-lang#103830, which has sat idle for a while. Beyond rust-lang#103830, this also: - fixes byval alignment for types containing vectors on Darwin (see `tests/codegen/align-byval-vector.rs`) - fixes byval alignment for overaligned types on x86 Windows (see `tests/codegen/align-byval.rs`) - fixes ABI for types with 128bit requested alignment on ARM64 Linux (see `tests/codegen/aarch64-struct-align-128.rs`) r? `@nikic` --- `@pcwalton's` original PR description is reproduced below: Commit 88e4d2c from five years ago removed support for alignment on indirectly-passed arguments because of problems with the `i686-pc-windows-msvc` target. Unfortunately, the `memcpy` optimizations I recently added to LLVM 16 depend on this to forward `memcpy`s. This commit attempts to fix the problems with `byval` parameters on that target and now correctly adds the `align` attribute. The problem is summarized in [this comment] by `@eddyb.` Briefly, 32-bit x86 has special alignment rules for `byval` parameters: for the most part, their alignment is forced to 4. This is not well-documented anywhere but in the Clang source. I looked at the logic in Clang `TargetInfo.cpp` and tried to replicate it here. The relevant methods in that file are `X86_32ABIInfo::getIndirectResult()` and `X86_32ABIInfo::getTypeStackAlignInBytes()`. The `align` parameter attribute for `byval` parameters in LLVM must match the platform ABI, or miscompilations will occur. Note that this doesn't use the approach suggested by eddyb, because I felt it was overkill to store the alignment in `on_stack` when special handling is really only needed for 32-bit x86. As a side effect, this should fix rust-lang#80127, because it will make the `align` parameter attribute for `byval` parameters match the platform ABI on LLVM x86-64. [this comment]: rust-lang#80822 (comment)
Stop using LLVM struct types for alloca, byval, sret, and many GEPs This is an extension of rust-lang#98615, extending the removal from field offsets to most places that it's feasible right now. (It might make sense to split this PR up, but I want to test perf with everything.) For `alloca`, `byval`, and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's layout algorithm. Particularly for `alloca`, it is likely that a future LLVM will change to a representation where you only specify the size. For GEPs, upstream LLVM is in the beginning stages of [migrating to `ptradd`](https://discourse.llvm.org/t/rfc-replacing-getelementptr-with-ptradd/68699). LLVM 19 will [canonicalize](llvm/llvm-project#68882) all constant-offset GEPs to i8, which is the same thing we do here. \*: Since we always explicitly specify the alignment. For `byval`, this wasn't the case until rust-lang#112157. †: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue here. r? `@ghost`
Stop using LLVM struct types for byval/sret For `byval`, and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. \*: The alignment would also matter if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until rust-lang#112157. †: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue. Split out from rust-lang#121577. r? `@nikic`
Stop using LLVM struct types for byval/sret For `byval` and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. \*: The alignment would matter, if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until rust-lang#112157. †: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue. Split out from rust-lang#121577. r? `@nikic`
Stop using LLVM struct types for byval/sret For `byval` and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. \*: The alignment would matter, if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until rust-lang#112157. †: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue. Split out from rust-lang#121577. r? `@nikic`
Stop using LLVM struct types for byval/sret For `byval` and `sret`, the type has no semantic meaning, only the size matters\*†. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. \*: The alignment would matter, if we didn't explicitly specify it. From what I can tell, we always specified the alignment for `sret`; for `byval`, we didn't until rust-lang#112157. †: For `byval`, the hidden copy may be impacted by padding in the LLVM struct type, i.e. padding bytes may not be copied. (I'm not sure if this is done today, but I think it would be legal.) But we manually pad our LLVM struct types specifically to avoid there ever being LLVM-visible padding, so that shouldn't be an issue. Split out from rust-lang#121577. r? `@nikic`
Same as #111551, which I accidentally closed :/
This resurrects PR #103830, which has sat idle for a while.
Beyond #103830, this also:
tests/codegen/align-byval-vector.rs
)tests/codegen/align-byval.rs
)tests/codegen/aarch64-struct-align-128.rs
)r? @nikic
@pcwalton's original PR description is reproduced below:
Commit 88e4d2c from five years ago removed
support for alignment on indirectly-passed arguments because of problems with
the
i686-pc-windows-msvc
target. Unfortunately, thememcpy
optimizations Irecently added to LLVM 16 depend on this to forward
memcpy
s. This commitattempts to fix the problems with
byval
parameters on that target and nowcorrectly adds the
align
attribute.The problem is summarized in this comment by @eddyb. Briefly, 32-bit x86 has
special alignment rules for
byval
parameters: for the most part, theiralignment is forced to 4. This is not well-documented anywhere but in the Clang
source. I looked at the logic in Clang
TargetInfo.cpp
and tried to replicateit here. The relevant methods in that file are
X86_32ABIInfo::getIndirectResult()
andX86_32ABIInfo::getTypeStackAlignInBytes()
. Thealign
parameter attributefor
byval
parameters in LLVM must match the platform ABI, or miscompilationswill occur. Note that this doesn't use the approach suggested by eddyb, because
I felt it was overkill to store the alignment in
on_stack
when specialhandling is really only needed for 32-bit x86.
As a side effect, this should fix #80127, because it will make the
align
parameter attribute for
byval
parameters match the platform ABI on LLVMx86-64.