Autodiff Upstreaming - rustc_codegen_llvm changes #130060

ZuseZ4 · 2024-09-07T07:37:16Z

Now that the autodiff/Enzyme backend is merged, this is an upstream PR for the rustc_codegen_llvm changes.
It also includes small changes to three files under compiler/rustc_ast, which overlap with my frontend PR (#129458).
Here I only include minimal definitions of structs and enums to be able to build this backend code.
The same goes for the minimal changes to compiler/rustc_codegen_ssa; the majority of changes there will come in another PR, once either this PR or the frontend PR gets merged.

We currently have 68 files left to merge, 19 in the frontend PR, 21 (+3 from the frontend) in this PR, and then ~30 in the middle-end.

This PR is large because it includes two of my three large files (~800 loc each). I could also first only upstream enzyme_ffi.rs, but I think people might want to see some use of these bindings in the same PR?

To already highlight the things which reviewers might want to discuss:

enzyme_ffi.rs: I do have a fallback module to make sure that we don't link rustc against Enzyme when we build rustc without autodiff support.
add_panic_msg_to_global was a pain to write and I currently can't even use it. Enzyme writes gradients into shadow memory. Pass in one float scalar? We'll allocate and return an extra float telling you how this float affected the output. Pass in a slice of floats? We'll let you allocate the vector and pass in a mutable reference to a float slice, and we'll then write the gradient into that slice. It should be at least as large as your original slice, so we check that and panic if not. Currently we panic silently, but I already generate a nicer panic message with this function; I just don't know how to print it to the user yet. I discussed this with a few rustc devs, and the best we could come up with (for now) was to look for mangled panic calls in the IR and pick one, which works surprisingly reliably. If someone knows a good way to clean this up and print the panic message, I'm all in; otherwise I can remove the code that writes the nicer panic message and keep the silent panic, since it's enough for soundness, especially since this PR is already a bit large.
SanitizeHWAddress: When differentiating C++, Enzyme can use TBAA to "understand" enums/unions, but for Rust we don't have this information. LLVM might do speculative loads which (without TBAA) confuse Enzyme, so we disable those with this attribute. This attribute is only set during the first opt run, before Enzyme differentiates code. We then remove it again once we are done with autodiff and run the opt pipeline a second time. Since enums are everywhere in Rust, support for them is crucial, but if this looks too cursed I can remove these ~100 lines and keep them in my fork for now; we can then discuss them separately to make this PR simpler.
Duplicated llvm-opt runs: Differentiating already optimized code (and being able to do additional optimizations on the fly, e.g. for GPU code) is the reason why Enzyme is so fast, so the compile time is acceptable for autodiff users: https://enzyme.mit.edu/talks/Publications/ (There are also algorithmic issues in Enzyme core which are more serious than running opt twice).
I assume that if we merge these minimal cg_ssa changes here already, I also need to fix the other backends (GCC and Cranelift) to have dummy implementations, correct?
I'm happy to split this PR up further if reviewers have recommendations on how to.
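
As a rough illustration of the shadow-memory convention described for add_panic_msg_to_global above, here is a hand-written sketch in plain Rust. It does not use Enzyme at all: d_sum_of_squares, its seed parameter, and the panic message text are made up for this example and are not what the backend generates. It only shows the caller-allocated shadow slice and the length check that guards the write.

```rust
/// Primal function: f(x) = sum of x_i^2.
fn sum_of_squares(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v).sum()
}

/// Hand-written stand-in for a reverse-mode derivative (hypothetical name).
/// `dx` is the shadow of `x`; the gradient is accumulated into it.
fn d_sum_of_squares(x: &[f64], dx: &mut [f64], seed: f64) -> f64 {
    // The soundness check: the shadow must be at least as large as the
    // primal slice, otherwise the gradient write would go out of bounds.
    assert!(
        dx.len() >= x.len(),
        "shadow slice too small: primal has {} elements, shadow has {}",
        x.len(),
        dx.len()
    );
    for (xi, dxi) in x.iter().zip(dx.iter_mut()) {
        // d/dx_i (x_i^2) = 2 * x_i, scaled by the incoming seed.
        *dxi += 2.0 * xi * seed;
    }
    sum_of_squares(x)
}
```

Calling d_sum_of_squares(&x, &mut dx, 1.0) with a zero-initialized dx of the same length leaves the gradient in dx; a shorter dx hits the assertion instead of writing out of bounds.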

For the full implementation, see: #129175

Tracking:

Tracking Issue for autodiff #124509

rustbot · 2024-09-07T07:37:24Z

r? @fee1-dead

rustbot has assigned @fee1-dead.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

rustbot · 2024-09-07T07:37:25Z

⚠️ Warning ⚠️

These commits modify submodules.

rustbot · 2024-09-07T07:37:27Z

This PR modifies config.example.toml.

If appropriate, please update CONFIG_CHANGE_HISTORY in src/bootstrap/src/utils/change_tracker.rs.

Some changes occurred in cfg and check-cfg configuration

cc @Urgau

Urgau · 2024-09-07T10:06:53Z

compiler/rustc_session/src/config/cfg.rs

@@ -176,6 +176,8 @@ pub(crate) fn default_configuration(sess: &Session) -> Cfg {
    // NOTE: These insertions should be kept in sync with
    // `CheckCfg::fill_well_known` below.

+    ins_none!(sym::autodiff_fallback);


This shouldn't be insta-stable; it should at least be gated behind a nightly compiler.

Suggested change

-    ins_none!(sym::autodiff_fallback);
+    if sess.is_nightly_build() {
+        ins_none!(sym::autodiff_fallback);
+    }

Please also follow all the steps for adding a new cfg, as defined at the top of this file (as well as in the test files):

rust/compiler/rustc_session/src/config/cfg.rs

Lines 10 to 21 in e26b02a

//! ## Adding a new cfg
//!
//! Adding a new feature requires two new symbols one for the cfg it-self
//! and the second one for the unstable feature gate, those are defined in
//! `rustc_span::symbol`.
//!
//! As well as the following points,
//!  - Add the activation logic in [`default_configuration`]
//!  - Add the cfg to [`CheckCfg::fill_well_known`] (and related files),
//!    so that the compiler can know the cfg is expected
//!  - Add the cfg in [`disallow_cfgs`] to disallow users from setting it via `--cfg`
//!  - Add the feature gating in `compiler/rustc_feature/src/builtin_attrs.rs`
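
The effect of the suggested nightly gate can be modeled outside the compiler. The sketch below is a toy: MockSession and this default_configuration are stand-ins invented for illustration (the real function takes rustc's Session and returns a Cfg), and the always-on debug_assertions entry is only there for contrast.

```rust
use std::collections::BTreeSet;

/// Stand-in for rustc's `Session` (hypothetical; models only the nightly check).
struct MockSession {
    nightly: bool,
}

impl MockSession {
    fn is_nightly_build(&self) -> bool {
        self.nightly
    }
}

/// Toy version of `default_configuration`: returns the name-only cfgs that are set.
fn default_configuration(sess: &MockSession) -> BTreeSet<&'static str> {
    let mut cfg = BTreeSet::new();
    // An unconditionally inserted cfg, for contrast (illustrative only).
    cfg.insert("debug_assertions");
    // Gated insertion: the cfg never appears on a non-nightly compiler.
    if sess.is_nightly_build() {
        cfg.insert("autodiff_fallback");
    }
    cfg
}
```

With this shape, code compiled by a stable toolchain never observes the cfg, which is the point of the gate.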

fee1-dead · 2024-09-08T03:45:33Z

r? compiler

michaelwoerister · 2024-10-07T09:21:57Z

r? compiler

davidtwco · 2024-10-07T14:24:44Z

There's very little chance of this being merged in one PR with one commit of this size. You'll need to split this up into well-commented/motivated PRs that can be landed one at a time. I haven't spent much time looking at this PR, so I don't have any suggestions on how to split this up. I'd recommend finding someone on the compiler team who is interested in these changes and who you can work with to do the reviews.

nikic

I don't have time to review this, so just one drive-by note: It looks like a decent part of the extra FFI APIs in enzyme_ffi.rs are essentially duplicates of things that we already have bindings for under slightly different names and signatures. Like we already have LLVMRustAddFunctionAttributes/LLVMRustAddCallSiteAttributes and this introduces LLVMRustAddEnumAttributeAtIndex. It also looks like the code doesn't make use of the Builder abstraction and instead calls FFI APIs directly everywhere, which is probably also where the duplication comes from.

compiler/rustc_codegen_llvm/src/llvm/enzyme_ffi.rs

oli-obk · 2025-01-01T20:07:26Z

bors · 2025-01-01T20:07:28Z

📌 Commit 4895a33 has been approved by oli-obk

It is now in the queue for this repository.

klensy · 2025-01-01T20:20:13Z

compiler/rustc_llvm/llvm-wrapper/RustWrapper.cpp

+//    pub fn LLVMRustVerifyFunction(V: &Value, action: LLVMRustVerifierFailureAction) -> Bool;
+extern "C" bool LLVMRustVerifyFunction(LLVMValueRef Fn,
+                                       LLVMRustVerifierFailureAction Action) {
+  //Function *F = unwrap<Function>(Fn);
+  return LLVMVerifyFunction(Fn, fromRust(Action));
+}


Uhh, sorry, but it looks like after the rebase this reverted back to the wrong LLVMBool/bool types. There is also stray commented-out code.

Thank you, I also formatted the C++ file.
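
For context on why the LLVMBool/bool distinction matters at the FFI boundary, here is a small sketch. It assumes that LLVM's C API defines LLVMBool as an int-sized typedef and that the Rust side declares it as a c_uint alias; the alias name LlvmBool and both helper functions are made up for this example.

```rust
use std::ffi::c_uint;
use std::mem::size_of;

/// Illustrative alias for an int-sized C boolean (assumption: mirrors LLVMBool).
type LlvmBool = c_uint;

/// Convert an int-sized C boolean into a Rust `bool` explicitly.
fn llvm_bool_to_rust(b: LlvmBool) -> bool {
    b != 0
}

/// Sizes of Rust's `bool` and the int-sized alias, in bytes.
fn abi_sizes() -> (usize, usize) {
    (size_of::<bool>(), size_of::<LlvmBool>())
}
```

Rust's bool is always one byte, while c_uint is four bytes on common targets, so a declaration that says one type on the Rust side and the other on the C++ side disagrees about the return value's width.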

ZuseZ4 · 2025-01-01T22:06:55Z

thank you to everyone for the various rounds of review!

traviscross · 2025-01-01T22:20:44Z

@bors r=oli-obk

bors · 2025-01-01T22:20:47Z

📌 Commit d753cbf has been approved by oli-obk

It is now in the queue for this repository.

bors · 2025-01-02T00:21:01Z

⌛ Testing commit d753cbf with merge 504f4f5...

bors · 2025-01-02T03:04:26Z

☀️ Test successful - checks-actions
Approved by: oli-obk
Pushing 504f4f5 to master...

rust-timer · 2025-01-02T04:20:31Z

Finished benchmarking commit (504f4f5): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	0.3%	[0.3%, 0.3%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-0.2%	[-0.3%, -0.2%]	2
All ❌✅ (primary)	-	-	0

Max RSS (memory usage)

Results (primary 0.2%, secondary -2.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	1.8%	[0.8%, 2.8%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-2.9%	[-2.9%, -2.9%]	1
Improvements ✅ (secondary)	-2.2%	[-2.2%, -2.2%]	1
All ❌✅ (primary)	0.2%	[-2.9%, 2.8%]	3

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 762.325s -> 764.563s (0.29%)
Artifact size: 325.51 MiB -> 325.55 MiB (0.01%)

Rollup merge of rust-lang#136374 - saethlin:enzyme-linkage, r=oli-obk Add link attribute for Enzyme's LLVMRust FFI Since rust-lang#133429 landed, the compiler doesn't build with `-Zcross-crate-inline-threshold=always`. I don't expect anyone else to test or fix issues with that goofy configuration, so I'm fixing it. This PR adds a link attribute just like rust-lang#118142 for all the new LLVMRust functions. They were actually added in rust-lang#130060 but weren't used until just now.

rustbot assigned fee1-dead Sep 7, 2024

rustbot added S-waiting-on-review T-bootstrap T-compiler labels Sep 7, 2024

Urgau reviewed Sep 7, 2024

jieyouxu added the F-autodiff label Sep 7, 2024

rustbot assigned michaelwoerister and unassigned fee1-dead Sep 8, 2024

ZuseZ4 mentioned this pull request Sep 12, 2024

Expose experimental LLVM features for GPU offloading rust-lang/rust-project-goals#109

Open

4 tasks

traviscross mentioned this pull request Sep 7, 2024

Tracking Issue for autodiff #124509

Open

7 tasks

ZuseZ4 force-pushed the enzyme-cg-llvm branch from e26b02a to 8726c99 Compare October 1, 2024 00:31

rustbot assigned davidtwco and unassigned michaelwoerister Oct 7, 2024

nikic reviewed Oct 7, 2024

davidtwco added S-waiting-on-author and removed S-waiting-on-review labels Oct 11, 2024

alex-semenyuk added S-waiting-on-author and removed S-waiting-on-author labels Nov 13, 2024

ZuseZ4 force-pushed the enzyme-cg-llvm branch 3 times, most recently from 11c3bae to 78297a9 Compare November 21, 2024 01:21

klensy reviewed Dec 29, 2024
