-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add amdgpu target #134740
Add amdgpu target #134740
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @GuillaumeGomez (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
These commits modify compiler targets. This PR changes how LLVM is built. Consider updating src/bootstrap/download-ci-llvm-stamp. Some changes occurred in src/doc/rustc/src/platform-support cc @Noratrieb This PR modifies If appropriate, please update |
r? jieyouxu |
cc @eddyb Hello, tagging you for domain expertise if you want to chime in. |
Thanks for the PR, @Flakebi. I'm going to request that you open a MCP at https://github.com/rust-lang/compiler-team/issues/ to gauge team consensus for adding this target, primarily to give compiler team members some opportunity to ask clarifying questions and register possible concerns, since:
Note that usually adding more "conventional" Tier 3 targets do not need to go through the MCP process, but this target looks not so conventional. |
@rustbot author |
Thank you for the quick review! I opened an MCP here: rust-lang/compiler-team#823 |
cc @ZuseZ4 |
☔ The latest upstream changes (presumably #134822) made this pull request unmergeable. Please resolve the merge conflicts. |
Add amdgpu target Add amdgpu target to rustc and enable the LLVM target. Fix compiling `core` with the amdgpu: The amdgpu backend makes heavy use of different address spaces. This leads to situations, where a pointer in one addrspace needs to be casted to a pointer in a different addrspace. `bitcast` is invalid for this case, `addrspacecast` needs to be used. Fix compilation failures that created bitcasts for such cases by creating pointer casts (which creates an `addrspacecast` under the hood) instead. MCP: rust-lang/compiler-team#823 Tracking issue: rust-lang#135024 Kinda related to the original amdgpu tracking issue rust-lang#51575 (though that one has been closed for a while). try-job: dist-loongarch64-linux try-job: dist-loongarch64-musl try-job: dist-powerpc64-linux
☀️ Try build successful - checks-actions |
@bors r=workingjubilee |
💡 This pull request was already approved, no need to approve it again.
|
☀️ Test successful - checks-actions |
Finished benchmarking commit (c03c38d): comparison URL. Overall result: ❌ regressions - please read the text belowOur benchmarks found a performance regression caused by this PR. Next Steps:
@rustbot label: +perf-regression Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary 2.1%, secondary 3.3%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (primary 2.4%, secondary -2.6%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 777.876s -> 781.62s (0.48%) |
The |
I'm not sure that size is really a driver for more instruction counts, but it does look like the mere possibility of cross-compiling to AMDGPU is enabling more passes/logic(?) even if presumably those don't do anything on x86. Maybe an optimization opportunity for LLVM/clang? It might be unavoidable with the architecture LLVM has today though. Sampling a few cachegrind diffs: helloworld:
clap:
|
what the eff. we should probably look into this more, it’s super weird |
I remember that sometimes binary/dynamic library size increases also increased icounts due to the dynamic linker doing more work. But based on CacheGrind, it looks like LLVM is actually doing more work, seems like it maybe iterates over more passes that were enabled by the amdgpu target? |
The max-rss increases also look unexpected, and numerous enough to not be measurement noise. (Could memory allocation be in these ??? cg reports, sometimes it does this for me rather than finding jemalloc/malloc, probably some tests with local builds could be interesting with better debuginfo. That would also help with checking the cycles and wall time results, which seemingly aren’t super stable in these results.) Does this need more time to bake maybe? @Mark-Simulacrum you’ve marked this as triaged because it’s less actionable on our side than in llvm, right? |
[experiment] dont init anything except x86 What if do not init all llvm targets always? Maybe fix regression in rust-lang#134740 r? `@ghost` `@rustbot` label +S-experimental btw, here https://github.com/rust-lang/rust/blob/c182ce9cbc8c29ebc1b4559d027df545e6cdd287/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L81-L186 similar list for targets, but it missing amdgpu. Is amdgpu works without it? kick perf run please
hate to go "ooh, LLVM troubles, let's tell Nikita!" but uhhhh "weird LLVM perf" really does need the Vibe Sense of that kind of expertise, sooo cc @nikic |
Adding the amdgpu target shouldn't make any additional passes run -- additional cost from registering additional passes etc is plausible though. max-rss increasing with increasing code size is pretty common. |
Upstream changes relative to 1.85.1: Version 1.86.0 (2025-04-03) ========================== Language -------- - [Stabilize upcasting trait objects to supertraits.] (rust-lang/rust#134367) - [Allow safe functions to be marked with the `#[target_feature]` attribute.] (rust-lang/rust#134090) - [The `missing_abi` lint now warns-by-default.] (rust-lang/rust#132397) - Rust now lints about double negations, to catch cases that might have intended to be a prefix decrement operator (`--x`) as written in other languages. This was previously a clippy lint, `clippy::double_neg`, and is [now available directly in Rust as `double_negations`.] (rust-lang/rust#126604) - [More pointers are now detected as definitely not-null based on their alignment in const eval.] (rust-lang/rust#133700) - [Empty `repr()` attribute applied to invalid items are now correctly rejected.] (rust-lang/rust#133925) - [Inner attributes `#![test]` and `#![rustfmt::skip]` are no longer accepted in more places than intended.] (rust-lang/rust#134276) Compiler -------- - [Debug-assert that raw pointers are non-null on access.] (rust-lang/rust#134424) - [Change `-O` to mean `-C opt-level=3` instead of `-C opt-level=2` to match Cargo's defaults.] (rust-lang/rust#135439) - [Fix emission of `overflowing_literals` under certain macro environments.] (rust-lang/rust#136393) Platform Support ---------------- - [Replace `i686-unknown-redox` target with `i586-unknown-redox`.] (rust-lang/rust#136698) - [Increase baseline CPU of `i686-unknown-hurd-gnu` to Pentium 4.] (rust-lang/rust#136700) - New tier 3 targets: - [`{aarch64-unknown,x86_64-pc}-nto-qnx710_iosock`] (rust-lang/rust#133631). For supporting Neutrino QNX 7.1 with `io-socket` network stack. - [`{aarch64-unknown,x86_64-pc}-nto-qnx800`] (rust-lang/rust#133631). For supporting Neutrino QNX 8.0 (`no_std`-only). - [`{x86_64,i686}-win7-windows-gnu`] (rust-lang/rust#134609). Intended for backwards compatibility with Windows 7. `{x86_64,i686}-win7-windows-msvc` are the Windows MSVC counterparts that already exist as Tier 3 targets. - [`amdgcn-amd-amdhsa`](rust-lang/rust#134740). - [`x86_64-pc-cygwin`](rust-lang/rust#134999). - [`{mips,mipsel}-mti-none-elf`] (rust-lang/rust#135074). Initial bare-metal support. - [`m68k-unknown-none-elf`](rust-lang/rust#135085). - [`armv7a-nuttx-{eabi,eabihf}`, `aarch64-unknown-nuttx`, and `thumbv7a-nuttx-{eabi,eabihf}`] (rust-lang/rust#135757). Refer to Rust's [platform support page][platform-support-doc] for more information on Rust's tiered platform support. Libraries --------- - The type of `FromBytesWithNulError` in `CStr::from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, FromBytesWithNulError>` was [changed from an opaque struct to an enum] (rust-lang/rust#134143), allowing users to examine why the conversion failed. - [Remove `RustcDecodable` and `RustcEncodable`.] (rust-lang/rust#134272) - [Deprecate libtest's `--logfile` option.] (rust-lang/rust#134283) - [On recent versions of Windows, `std::fs::remove_file` will now remove read-only files.] (rust-lang/rust#134679) Stabilized APIs --------------- - [`{float}::next_down`] (https://doc.rust-lang.org/stable/std/primitive.f64.html#method.next_down) - [`{float}::next_up`] (https://doc.rust-lang.org/stable/std/primitive.f64.html#method.next_up) - [`<[_]>::get_disjoint_mut`] (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.get_disjoint_mut) - [`<[_]>::get_disjoint_unchecked_mut`] (https://doc.rust-lang.org/stable/std/primitive.slice.html#method.get_disjoint_unchecked_mut) - [`slice::GetDisjointMutError`] (https://doc.rust-lang.org/stable/std/slice/enum.GetDisjointMutError.html) - [`HashMap::get_disjoint_mut`] (https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.get_disjoint_mut) - [`HashMap::get_disjoint_unchecked_mut`] (https://doc.rust-lang.org/std/collections/hash_map/struct.HashMap.html#method.get_disjoint_unchecked_mut) - [`NonZero::count_ones`] (https://doc.rust-lang.org/stable/std/num/struct.NonZero.html#method.count_ones) - [`Vec::pop_if`] (https://doc.rust-lang.org/std/vec/struct.Vec.html#method.pop_if) - [`sync::Once::wait`] (https://doc.rust-lang.org/stable/std/sync/struct.Once.html#method.wait) - [`sync::Once::wait_force`] (https://doc.rust-lang.org/stable/std/sync/struct.Once.html#method.wait_force) - [`sync::OnceLock::wait`] (https://doc.rust-lang.org/stable/std/sync/struct.OnceLock.html#method.wait) These APIs are now stable in const contexts: - [`hint::black_box`] (https://doc.rust-lang.org/stable/std/hint/fn.black_box.html) - [`io::Cursor::get_mut`] (https://doc.rust-lang.org/stable/std/io/struct.Cursor.html#method.get_mut) - [`io::Cursor::set_position`] (https://doc.rust-lang.org/stable/std/io/struct.Cursor.html#method.set_position) - [`str::is_char_boundary`] (https://doc.rust-lang.org/stable/std/primitive.str.html#method.is_char_boundary) - [`str::split_at`] (https://doc.rust-lang.org/stable/std/primitive.str.html#method.split_at) - [`str::split_at_checked`] (https://doc.rust-lang.org/stable/std/primitive.str.html#method.split_at_checked) - [`str::split_at_mut`] (https://doc.rust-lang.org/stable/std/primitive.str.html#method.split_at_mut) - [`str::split_at_mut_checked`] (https://doc.rust-lang.org/stable/std/primitive.str.html#method.split_at_mut_checked) Cargo ----- - [When merging, replace rather than combine configuration keys that refer to a program path and its arguments.] (rust-lang/cargo#15066) - [Error if both `--package` and `--workspace` are passed but the requested package is missing.] (rust-lang/cargo#15071) This was previously silently ignored, which was considered a bug since missing packages should be reported. - [Deprecate the token argument in `cargo login` to avoid shell history leaks.] (rust-lang/cargo#15057) - [Simplify the implementation of `SourceID` comparisons.] (rust-lang/cargo#14980) This may potentially change behavior if the canonicalized URL compares differently in alternative registries. Rustdoc ----- - [Add a sans-serif font setting.] (rust-lang/rust#133636) Compatibility Notes ------------------- - [The `wasm_c_abi` future compatibility warning is now a hard error.] (rust-lang/rust#133951) Users of `wasm-bindgen` should upgrade to at least version 0.2.89, otherwise compilation will fail. - [Remove long-deprecated no-op attributes `#![no_start]` and `#![crate_id]`.] (rust-lang/rust#134300) - [The future incompatibility lint `cenum_impl_drop_cast` has been made into a hard error.] (rust-lang/rust#135964) This means it is now an error to cast a field-less enum to an integer if the enum implements `Drop`. - [SSE2 is now required for "i686" 32-bit x86 hard-float targets; disabling it causes a warning that will become a hard error eventually.] (rust-lang/rust#137037) To compile for pre-SSE2 32-bit x86, use a "i586" target instead. Internal Changes ---------------- These changes do not affect any public interfaces of Rust, but they represent significant improvements to the performance or internals of rustc and related tools. - [Build the rustc on AArch64 Linux with ThinLTO + PGO.] (rust-lang/rust#133807) The ARM 64-bit compiler (AArch64) on Linux is now optimized with ThinLTO and PGO, similar to the optimizations we have already performed for the x86-64 compiler on Linux. This should make it up to 30% faster.
Add amdgpu target to rustc and enable the LLVM target.
Fix compiling
core
with the amdgpu:The amdgpu backend makes heavy use of different address spaces. This
leads to situations, where a pointer in one addrspace needs to be casted
to a pointer in a different addrspace.
bitcast
is invalid for thiscase,
addrspacecast
needs to be used.Fix compilation failures that created bitcasts for such cases by
creating pointer casts (which creates an
addrspacecast
under the hood)instead.
MCP: rust-lang/compiler-team#823
Tracking issue: #135024
Kinda related to the original amdgpu tracking issue #51575 (though that one has been closed for a while).