-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Subtree sync for rustc_codegen_cranelift #130716
Conversation
By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI.
I rarely need a debug build of cg_clif, and even if I actually need one, using package overrides in Cargo.toml to only do a debug build for the rustc_codegen_cranelift crate works much better.
To avoid overriding the source for coretests every time.
…ouxu,Amanieu,Jubilee nontemporal_store: make sure that the intrinsic is truly just a hint The `!nontemporal` flag for stores in LLVM *sounds* like it is just a hint, but actually, it is not -- at least on x86, non-temporal stores need very special treatment by the programmer or else the Rust memory model breaks down. LLVM still treats these stores as-if they were normal stores for optimizations, which is [highly dubious](llvm/llvm-project#64521). Let's avoid all that dubiousness by making our own non-temporal stores be truly just a hint, which is possible on some targets (e.g. ARM). On all other targets, non-temporal stores become regular stores. ~~Blocked on rust-lang/stdarch#1541 propagating to the rustc repo, to make sure the `_mm_stream` intrinsics are unaffected by this change.~~ Fixes rust-lang#114582 Cc `@Amanieu` `@workingjubilee`
…ompiler-errors Shrink `TyKind::FnPtr`. By splitting the `FnSig` within `TyKind::FnPtr` into `FnSigTys` and `FnHeader`, which can be packed more efficiently. This reduces the size of the hot `TyKind` type from 32 bytes to 24 bytes on 64-bit platforms. This reduces peak memory usage by a few percent on some benchmarks. It also reduces cache misses and page faults similarly, though this doesn't translate to clear cycles or wall-time improvements on CI. r? `@compiler-errors`
Yes, we've been getting frequent spurious failures on that test. |
…bjorn3 Subtree sync for rustc_codegen_cranelift The main highlight this time is a fix for a miscompilation of RwLock on macOS. Also a Cranelift update and a couple of extra vendor intrinsics on arm64. r? `@ghost` `@rustbot` label +A-codegen +A-cranelift +T-compiler
🔒 Merge conflict This pull request and the master branch diverged in a way that cannot be automatically merged. Please rebase on top of the latest master branch, and let the reviewer approve again. How do I rebase?Assuming
You may also read Git Rebasing to Resolve Conflicts by Drew Blessing for a short tutorial. Please avoid the "Resolve conflicts" button on GitHub. It uses Sometimes step 4 will complete without asking for resolution. This is usually due to difference between how Error message
|
☔ The latest upstream changes (presumably #130724) made this pull request unmergeable. Please resolve the merge conflicts. |
afad3fb
to
9456ff1
Compare
There are merge commits (commits with multiple parents) in your changes. We have a no merge policy so these commits will need to be removed for this pull request to be merged. You can start a rebase with the following commands:
The following commits are merge commits (since this message was last posted): |
@bors r+ p=1 subtree sync |
This comment has been minimized.
This comment has been minimized.
@bors r+ p=1 subtree sync |
☀️ Test successful - checks-actions |
Finished benchmarking commit (648d024): comparison URL. Overall result: ✅ improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results (primary -2.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary 0.5%, secondary 8.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 767.469s -> 767.841s (0.05%) |
The main highlight this time is a fix for a miscompilation of RwLock on macOS. Also a Cranelift update and a couple of extra vendor intrinsics on arm64.
r? @ghost
@rustbot label +A-codegen +A-cranelift +T-compiler