Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Autodiff Upstreaming - rustc_codegen_llvm changes #178

Open
wants to merge 4,974 commits into
base: master
Choose a base branch
from
Open

Conversation

ZuseZ4
Copy link
Member

@ZuseZ4 ZuseZ4 commented Sep 7, 2024

Now that the autodiff/Enzyme backend is merged, this is an upstream PR for the rustc_codegen_llvm changes.
It also includes small changes to three files under compiler/rustc_ast, which overlap with my frontend PR.
Here I only include minimal definitions of structs and enums to be able to build this backend code.
The same goes for minimal changes to compiler/rustc_codegen_ssa, the majority of changes there will be in another PR, once either this or the frontend gets merged.

We currently have 68 files left to merge, 19 in the frontend PR, 21 (+3 from the frontend) in this PR, and then ~30 in the middle-end.

To already highlight the things which reviewers might want to discuss:

  1. enzyme_ffi.rs: I do have a fallback module to make sure that we don't link rustc against Enzyme when we build rustc without autodiff support.

  2. add_panic_msg_to_global was a pain to write and I currently can't even use it. Enzyme writes gradients into shadow memory. Pass in one float scalar? We'll allocate and return an extra float telling you how this float affected the output. Pass in a slice of floats? We'll let you allocate the vector and pass in a mutable reference to a float slice, we'll then write the gradient into that slice. It should be at least as large as your original slice, so we check that and panic if not. Currently we panic silently, but I already generate a nicer panic message with this function. I just don't know how to print it to the user. yet. I discussed this with a few rustc devs and the best we could come up with (for now), was to look for mangled panic calls in the IR and pick one, which works surprising reliably. If someone knows a good way to clean this up and print the panic message I'm all in, otherwise I can remove the code that writes the nicer panic message and keep the silent panic, since it's enough for soundness. Especially since this PR is already a bit larger.

  3. SanitizeHWAddress: When differentiating C++, Enzyme can use TBAA to "understand" enums/unions, but for Rust we don't have this information. LLVM might to speculative loads which (without TBAA) confuse Enzyme, so we disable those with this attribute. This attribute is only set during the first opt run before Enzyme differentiates code. We then remove it again once we are done with autodiff and run the opt pipeline a second time. Since enums are everywhere in Rust, support for them is crucial, but if this looks too cursed I can remove these ~100 lines and keep them in my fork for now, we can then discuss them separately to make this PR simpler?

  4. Duplicated llvm-opt runs: Differentiating already optimized code (and being able to do additional optimizations on the fly, e.g. for GPU code) is the reason why Enzyme is so fast, so the compile time is acceptable for autodiff users: https://enzyme.mit.edu/talks/Publications/ (There are also algorithmic issues in Enzyme core which are more serious than running opt twice).

  5. I assume that if we merge these minimal cg_ssa changes here already, I also need to fix the other backends (GCC and cliff) to have dummy implementations, correct?

  6. I'm happy to split this PR up further if reviewers have recommendations on how to.

For the full implementation, see: rust-lang#129175

Tracking:

@ZuseZ4 ZuseZ4 force-pushed the master branch 3 times, most recently from 4f0389e to 012062a Compare October 28, 2024 02:32
@ZuseZ4 ZuseZ4 force-pushed the enzyme-cg-llvm branch 12 times, most recently from 469aaf0 to 79a3e2b Compare November 24, 2024 04:09
bors and others added 11 commits December 22, 2024 16:09
Asserts the maximum value that can be returned from `Vec::len`

Currently, casting `Vec<i32>` to `Vec<u32>` takes O(1) time:

```rust
// See <https://godbolt.org/z/hxq3hnYKG> for assembly output.
pub fn cast(vec: Vec<i32>) -> Vec<u32> {
    vec.into_iter().map(|e| e as _).collect()
}
```

But the generated assembly is not the same as the identity function, which prevents us from casting `Vec<Vec<i32>>` to `Vec<Vec<u32>>` within O(1) time:

```rust
// See <https://godbolt.org/z/7n48bxd9f> for assembly output.
pub fn cast(vec: Vec<Vec<i32>>) -> Vec<Vec<u32>> {
    vec.into_iter()
        .map(|e| e.into_iter().map(|e| e as _).collect())
        .collect()
}
```

This change tries to fix the problem. You can see the comparison here: <https://godbolt.org/z/jdManrKvx>.
The root user can write to files without any (write) access bits set. But this is not taken into account by `std::fs::Permissions.readonly()`.
…ewjasper

Delete `Rvalue::Len` 🎉

Everything's moved to `PtrMetadata`, so we can get rid of the `Len` variant now.

~~Depends on rust-lang#134326, so draft until that lands~~ Ready!

r? mir
… and `SyncUnsafeCell`.

Implementing `PointerLike` for `UnsafeCell` enables the possibility of
interior mutable `dyn*` values. Since this means potentially exercising
new codegen behavior, I added a test for it in `tests/ui/dyn-star/cell.rs`.

Also updated UI tests to account for the `isize` implementation changing
error messages.
`recursive_remove` is intended to be a wrapper around
`std::fs::remove_dir_all`, but which also allows the removal target to
be a non-directory entry, i.e. a file or a symlink. It also tries to
remove read-only attributes from filesystem entities on Windows for
non-dir entries, as `std::fs::remove_dir_all` handles the dir (and its
children) read-only cases.

Co-authored-by: Chris Denton <chris@chrisdenton.dev>
`aggressive_rm_rf` does not correctly handle distinction between
symlink-to-file vs symlink-to-dir on Windows.
bors and others added 25 commits December 28, 2024 05:48
…eemdev

Disable `backtrace-debuginfo.rs` on windows-gnu

This test appears still flaky cf. rust-lang#117097 on `i686-mingw` as observed in rust-lang#131244 (comment).

r? compiler (or anyone, really)
…-base, r=compiler-errors

Implement `default_overrides_default_fields` lint

Detect when a manual `Default` implementation isn't using the existing default field values and suggest using `..` instead:

```
error: `Default` impl doesn't use the declared default field values
  --> $DIR/manual-default-impl-could-be-derived.rs:14:1
   |
LL | / impl Default for A {
LL | |     fn default() -> Self {
LL | |         A {
LL | |             y: 0,
   | |                - this field has a default value
...  |
LL | | }
   | |_^
   |
   = help: use the default values in the `impl` with `Struct { mandatory_field, .. }` to avoid them diverging over time
note: the lint level is defined here
  --> $DIR/manual-default-impl-could-be-derived.rs:5:9
   |
LL | #![deny(default_overrides_default_fields)]
   |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```

r? `@compiler-errors`

This is a simpler version of rust-lang#134441, detecting the simpler case when a field with a default should have not been specified in the manual `Default::default()`, instead using `..` for it. It doesn't provide any suggestions, nor the checks for "equivalences" nor whether the value used in the imp being used would be suitable as a default field value.
…check-IBT, r=lqd

Migrate `branch-protection-check-IBT` to rmake.rs

- The Makefile version *never* ran because of Makefile syntax confusion because `ifeq ($(filter x86,$(LLVM_COMPONENTS)),x86_64)` [compares `x86` to `x86_64`, which always evaluates to false](rust-lang#126720 (comment)).
- The test would've always failed because precompiled std is not built with `-Z cf-protection=branch`, but linkers require all input object files to indicate IBT support in order to enable IBT for the executable, which is not the case for std.
- Thus, the test input file is instead changed to a `no_std` program.
- The test is currently limited to only `x86_64-unknown-linux-gnu` host, there are various other problems when the test is cross-compiled that I didn't want to fix atm, and is left as an exercise for the `-Z cf-protection` implementers.

The GNU property note was added by rust-lang#110304 in order to address rust-lang#103001.

Partially supersedes rust-lang#129156.
The rmake.rs port was initially authored by `@Rejyr` in rust-lang#126720.
This PR is co-authored with `@Oneirical` and `@Rejyr.`

r? `@bjorn3` or reroll

try-job: x86_64-mingw-1
try-job: x86_64-mingw-2
try-job: x86_64-msvc
try-job: x86_64-apple-1
try-job: x86_64-apple-2
…inks, r=lqd

Migrate `libs-through-symlink` to rmake.rs

Part of rust-lang#121876.

This PR migrates `tests/run-make/libs-through-symlink/` to use rmake.rs.

- Regression test for rust-lang#13890.
- Original fix PR is rust-lang#13903.
- Document test intent, backlink to rust-lang#13890 and fix PR rust-lang#13903.
- Fix the test logic: the `Makefile` version seems to not actually be exercising the "library search traverses symlink" logic, because the actual symlinked-to-library is present under the `$(TMPDIR)` directory tree when `bar.rs` is compiled, because the `$(RUSTC)` invocation has an implicit `-L $(TMPDIR)`. The symlink itself was actually broken, i.e. it should've been `ln -nsf $(TMPDIR)/outdir/$(NAME) $(TMPDIR)` but it used `ln -nsf outdir/$(NAME) $(TMPDIR)`. The rmake.rs version now explicitly separates the two directory trees and sets the CWD of the `bar.rs` rustc invocation so that the actual library is *not* present under its CWD tree.

I.e. it is now

```
$test_output/           # rustc foo.rs -o actual_lib_dir/libfoo.rlib
    actual_lib_dir/
        libfoo.rlib
    symlink_lib_dir/    # CWD set; rustc -L . bar.rs
        libfoo.rlib --> $test_output/actual_lib_dir/libfoo.rlib
```

Partially supersedes rust-lang#129011.
This PR is co-authored with `@Oneirical.`

r? compiler
compiletest: Only pass the post-colon value to `parse_normalize_rule`

Addresses one of the FIXMEs noted in rust-lang#134759.

I started working on the other FIXME, but it became complex enough that I wanted to split it off from this PR.

r? jieyouxu
Rollup of 5 pull requests

Successful merges:

 - rust-lang#134737 (Implement `default_overrides_default_fields` lint)
 - rust-lang#134760 (Migrate `branch-protection-check-IBT` to rmake.rs)
 - rust-lang#134829 (Migrate `libs-through-symlink` to rmake.rs)
 - rust-lang#134832 (Update `compiler-builtins` to 0.1.140)
 - rust-lang#134840 (compiletest: Only pass the post-colon value to `parse_normalize_rule`)

r? `@ghost`
`@rustbot` modify labels: rollup
This doesn't quite handle the more exotic path suffix matches that test
filters seem to accept (e.g. `library/test` can be matched with
`--exclude test`), so avoid warning on non-existent top-level test
suites for now. A proper fix will need to possibly query test `Step`s
for their exclude logic.
…, r=DianQK

Consider empty-unreachable otherwise branches in MatchBranchSimplification

Fixes rust-lang#131219
There are no tests in `tests/debuginfo` that use this prefix.
Unify fs::copy and io::copy on Linux

Currently, `fs::copy` first tries a regular file copy (via copy_file_range) and then falls back to userspace read/write copying. We should use `io::copy` instead as it tries copy_file_range, sendfile, and splice before falling back to userspace copying. This was discovered here: SUPERCILEX/fuc#40

Perf impact: `fs::copy` will now have two additional statx calls to decide which syscall to use. I wonder if we should get rid of the statx calls and only continue down the next fallback when the relevant syscalls say the FD isn't supported.
replace bootstrap-self-test feature flag with cfg(test)

This makes it in more rusty way.
…zkan

bootstrap: drop warning for top-level test suite path check due to false positives

The current top-level test suite directory does not exist warning logic doesn't quite handle the more exotic path suffix matches that test filters seem to accept (e.g. `library/test` can be matched with `--exclude test`), so avoid warning on non-existent top-level test suites for now. To avoid false positives, we probably need to query test `Step`s for their `should_run(exclude_filter)` logic.

This retains the fix for the Windows path handling (unlike rust-lang#134843).

r? `@onur-ozkan`
bootstrap: Allow `./x check compiletest`

Did you know that bootstrap didn't support `./x check compiletest`? Well, now it does!

Manually add `"compiletest"` to your `rust-analyzer.check.overrideCommand` check command to get error/warning integration when modifying compiletest.
…,clubby789,jieyouxu

compiletest: Slightly simplify the handling of debugger directive prefixes

The `cdbg-` prefix is not used by any tests in `tests/debuginfo`, and perhaps there never were any tests that used it.

Getting rid of it also lets us get rid of the code for parsing multiple prefixes at the same time, since every debugger now has exactly one prefix.
…r=bjorn3

Document virality of `feature(rustc_private)`

Closes rust-lang#134825.

r? `@bjorn3`
Added a codegen test for optimization with const arrays

Closes rust-lang#107208
compiler & tools dependencies:
     Locking 8 packages to latest compatible versions
    Updating anyhow v1.0.94 -> v1.0.95
    Updating glob v0.3.1 -> v0.3.2
    Updating quote v1.0.37 -> v1.0.38
    Updating rustversion v1.0.18 -> v1.0.19
    Updating serde v1.0.216 -> v1.0.217
    Updating serde_derive v1.0.216 -> v1.0.217
    Updating syn v2.0.90 -> v2.0.93
    Updating unicase v2.8.0 -> v2.8.1
note: pass `--verbose` to see 36 unchanged dependencies behind latest

library dependencies:
     Locking 0 packages to latest compatible versions
note: pass `--verbose` to see 5 unchanged dependencies behind latest

rustbook dependencies:
     Locking 7 packages to latest compatible versions
    Updating anyhow v1.0.94 -> v1.0.95
    Updating cc v1.2.5 -> v1.2.6
    Updating quote v1.0.37 -> v1.0.38
    Updating serde v1.0.216 -> v1.0.217
    Updating serde_derive v1.0.216 -> v1.0.217
    Updating syn v2.0.90 -> v2.0.93
    Updating unicase v2.8.0 -> v2.8.1
Rollup of 3 pull requests

Successful merges:

 - rust-lang#134849 (compiletest: Slightly simplify the handling of debugger directive prefixes)
 - rust-lang#134850 (Document virality of `feature(rustc_private)`)
 - rust-lang#134852 (Added a codegen test for optimization with const arrays)

r? `@ghost`
`@rustbot` modify labels: rollup
Weekly `cargo update`

Automation to keep dependencies in `Cargo.lock` current.

The following is the output from `cargo update`:

```txt

compiler & tools dependencies:
     Locking 8 packages to latest compatible versions
    Updating anyhow v1.0.94 -> v1.0.95
    Updating glob v0.3.1 -> v0.3.2
    Updating quote v1.0.37 -> v1.0.38
    Updating rustversion v1.0.18 -> v1.0.19
    Updating serde v1.0.216 -> v1.0.217
    Updating serde_derive v1.0.216 -> v1.0.217
    Updating syn v2.0.90 -> v2.0.93
    Updating unicase v2.8.0 -> v2.8.1
note: pass `--verbose` to see 36 unchanged dependencies behind latest

library dependencies:
     Locking 0 packages to latest compatible versions
note: pass `--verbose` to see 5 unchanged dependencies behind latest

rustbook dependencies:
     Locking 7 packages to latest compatible versions
    Updating anyhow v1.0.94 -> v1.0.95
    Updating cc v1.2.5 -> v1.2.6
    Updating quote v1.0.37 -> v1.0.38
    Updating serde v1.0.216 -> v1.0.217
    Updating serde_derive v1.0.216 -> v1.0.217
    Updating syn v2.0.90 -> v2.0.93
    Updating unicase v2.8.0 -> v2.8.1
```
Avoid ICE in borrowck

Provide a fallback in `best_blame_constraint` when `find_constraint_paths_between_regions` doesn't have a result. This code is due a rework to avoid the letf-over `unwrap()`, but avoids the ICE caused by the repro.

Fix rust-lang#133252.
@ZuseZ4 ZuseZ4 force-pushed the enzyme-cg-llvm branch 2 times, most recently from 642a1f7 to c8d716a Compare December 29, 2024 11:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.