@ZuseZ4
Member

@ZuseZ4 ZuseZ4 commented Aug 23, 2024

This is an upstream PR for the autodiff rustc_builtin_macro that is part of the autodiff feature.

For the full implementation, see: #129175

Content:
It contains a new #[autodiff(<args>)] rustc_builtin_macro, as well as a #[rustc_autodiff] builtin attribute.
The autodiff macro is applied to a function f and expands to a second function df (the name is given by the user).
It adds a dummy body to df to make sure it type-checks. The body will later be replaced by Enzyme at the LLVM-IR level,
so we don't really care about its content. Most of the changes (roughly 700 of the 1.2k changed lines) are in compiler/rustc_builtin_macros/src/autodiff.rs, which expands the macro. Nothing except expansion is implemented for now.
I have a fallback implementation for the relevant functions in case rustc is built without autodiff support. The default for now will be off, although we want to flip it to on for nightly later (once everything has landed). For the sake of CI, I have flipped the defaults; I'll revert this before merging.

Dummy function Body:
The first line is an inline_asm nop to make inlining less likely (I have additional checks to prevent this in the middle end of rustc). If f gets inlined too early, we can't pass it to Enzyme and thus can't differentiate it.
If df gets inlined too early, the call site will just compute this dummy code instead of the derivatives, which is a correctness issue. The following black_box lines make sure that none of the input arguments gets optimized away before we replace the body.
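As a rough illustration of the placeholder idea described above, here is a minimal sketch in ordinary Rust rather than the compiler-internal AST construction. The function names `f` and `df_placeholder` and the concrete body of `f` are assumptions made for this example, and the `nop` mnemonic is assumed to be valid on the target (it is on x86_64 and aarch64). The sketch only shows how `#[inline(never)]`, an inline-asm no-op, and `black_box` keep the function and its arguments alive; it computes no derivatives.

```rust
use std::arch::asm;
use std::hint::black_box;

/// Stand-in for the user's primal function (hypothetical body).
#[inline(never)]
pub fn f(x: &[f64], y: f64) -> f64 {
    x.iter().map(|v| v * v).sum::<f64>() * y
}

/// Placeholder with the same signature shape as a generated `df`.
/// It type-checks and keeps every argument "used", nothing more.
#[inline(never)]
pub fn df_placeholder(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
    // A single no-op instruction, intended to make inlining less attractive.
    unsafe { asm!("nop", options(nomem, nostack)) };
    // Keep the call to the primal function visible to the optimizer.
    black_box(f(x, y));
    // Keep the shadow and seed arguments from being optimized away.
    black_box((dx, dret));
    // Return something of the right type; here simply the primal value.
    black_box(f(x, y))
}
```

Calling `df_placeholder` on this sketch returns the same value as `f` and leaves `dx` untouched, which is exactly why an early-inlined placeholder would be a correctness problem.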

Motivation:
The user-facing autodiff macro can verify the user input. It then writes the verified values as args to the rustc_attribute, so from there on I know that these values are sensible. A rustc_attribute also turned out to be a nice way to attach this information to the corresponding function and carry it through to the backend.
This is also just an experiment; I expect to adjust the user-facing autodiff macro based on user feedback to improve usability.
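To make the "verify once, then trust the attribute" split more concrete, here is a small hypothetical sketch of the kind of check a frontend could run on the macro arguments. None of these types or the `validate` function come from the PR; the `Forward` mode and the rule "one activity per input plus one for the return value" are assumptions read off the single example in this description.

```rust
/// Hypothetical mode names; only `Reverse` appears in the PR description.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Mode { Forward, Reverse }

/// Activity annotations as used in the example expansion.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Activity { Const, Active, Duplicated }

/// The validated form that would be carried on the internal attribute.
#[derive(Debug, PartialEq)]
pub struct AutodiffArgs {
    pub name: String,
    pub mode: Mode,
    pub inputs: Vec<Activity>,
    pub ret: Activity,
}

/// Checks `[name, mode, activity...]` against the number of function inputs.
pub fn validate(args: &[&str], n_inputs: usize) -> Result<AutodiffArgs, String> {
    let (name, rest) = args.split_first().ok_or("missing derivative name")?;
    let (mode, acts) = rest.split_first().ok_or("missing mode")?;
    let mode = match *mode {
        "Forward" => Mode::Forward,
        "Reverse" => Mode::Reverse,
        other => return Err(format!("unknown mode `{other}`")),
    };
    // Assumed rule: one activity per input, plus one for the return value.
    if acts.len() != n_inputs + 1 {
        return Err(format!("expected {} activities, found {}", n_inputs + 1, acts.len()));
    }
    let mut parsed = Vec::with_capacity(acts.len());
    for a in acts {
        parsed.push(match *a {
            "Const" => Activity::Const,
            "Active" => Activity::Active,
            "Duplicated" => Activity::Duplicated,
            other => return Err(format!("unknown activity `{other}`")),
        });
    }
    // The last activity describes the return value.
    let ret = parsed.pop().expect("length checked above");
    Ok(AutodiffArgs { name: name.to_string(), mode, inputs: parsed, ret })
}
```

With this shape, later stages only ever see an `AutodiffArgs` value and do not need to repeat the checks.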

As a simple example of what this will do, we can see this expansion:
From:

#[autodiff(df, Reverse, Duplicated, Const, Active)]
pub fn f1(x: &[f64], y: f64) -> f64 {
    unimplemented!()
}

to

#[rustc_autodiff]
#[inline(never)]
pub fn f1(x: &[f64], y: f64) -> f64 {
    ::core::panicking::panic("not implemented")
}
#[rustc_autodiff(Reverse, Duplicated, Const, Active,)]
#[inline(never)]
pub fn df(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
    unsafe { asm!("NOP"); };
    ::core::hint::black_box(f1(x, y));
    ::core::hint::black_box((dx, dret));
    ::core::hint::black_box(f1(x, y))
}
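For intuition about what the placeholder body is eventually meant to be replaced with, the following hand-written sketch shows a reverse-mode derivative for one concrete choice of f1. Everything here is an assumption for illustration: the body of `f1`, the name `df_by_hand`, and the reading of the annotations (the `Duplicated` shadow `dx` accumulates the gradient with respect to `x`, `Const` means no gradient for `y`, and `dret` seeds the `Active` return value). It is not output of the macro or of Enzyme.

```rust
/// Hypothetical primal function: f1(x, y) = y * sum(x_i^2).
pub fn f1(x: &[f64], y: f64) -> f64 {
    y * x.iter().map(|v| v * v).sum::<f64>()
}

/// Hand-written reverse-mode derivative with the same signature shape as `df`.
pub fn df_by_hand(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
    // d f1 / d x_i = 2 * y * x_i, scaled by the seed `dret` and
    // accumulated into the shadow buffer (assumed `+=` semantics).
    for (xi, dxi) in x.iter().zip(dx.iter_mut()) {
        *dxi += dret * 2.0 * y * xi;
    }
    // `y` is treated as constant, so no gradient is produced for it.
    f1(x, y)
}
```

For x = [1.0, 2.0], y = 3.0 and dret = 1.0 this fills dx with [6.0, 12.0] and returns 15.0, whereas the dummy body in the expansion above would leave dx unchanged.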

I will add a few more tests once I have figured out why rustc rebuilds every time I touch a test.

Tracking:

try-job: dist-x86_64-msvc

@rustbot
Collaborator

rustbot commented Aug 23, 2024

r? @pnkfelix

rustbot has assigned @pnkfelix.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 23, 2024
@rust-log-analyzer

This comment has been minimized.

@traviscross traviscross added the F-autodiff `#![feature(autodiff)]` label Aug 23, 2024
@ZuseZ4 ZuseZ4 changed the title implement a working autodiff frontend Autodiff Upstreaming - enzyme frontend Aug 23, 2024
@bjorn3 bjorn3 mentioned this pull request Aug 24, 2024
7 tasks
@ZuseZ4 ZuseZ4 marked this pull request as ready for review August 24, 2024 20:25
@rust-log-analyzer

This comment has been minimized.

@bors
Collaborator

bors commented Aug 25, 2024

☔ The latest upstream changes (presumably #129563) made this pull request unmergeable. Please resolve the merge conflicts.

@jieyouxu jieyouxu self-assigned this Aug 30, 2024
Member

@jieyouxu jieyouxu left a comment

Thanks for the work on autodiff! I have some feedback and questions. As you may have gathered, I am not very knowledgeable about autodiff. If I ask some questions for more context / explanation, it might be good to encode them as comments in the impl itself for more context. So that if someone else (or even yourself) comes back later to try to change this impl, they are better equipped to figure out what this is doing.

EDIT: please ignore panic! -> bug! suggestions as that might not be available yet in macro expansion here.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 30, 2024
@ZuseZ4 ZuseZ4 force-pushed the enzyme-frontend branch 3 times, most recently from 9d8ec28 to a989f28 Compare September 3, 2024 02:18
@rust-log-analyzer

This comment has been minimized.

@jieyouxu
Member

jieyouxu commented Oct 14, 2024

> That doesn't seem great, since it fails in the post-optimization test suite. If this PR really did not change anything that is actually used, that might signify some miscompilation in PGO, or something sinister like that.

This is what I'm concerned about too, because AFAICT this PR really only touches the front-end things, and the code paths are gated behind unstable feature gates. Nothing is sticking out to me as something that can explode compile times.

Comment on lines +387 to +408
let blackbox_path = ecx.std_path(&[sym::hint, sym::black_box]);
let noop = ast::InlineAsm {
    asm_macro: ast::AsmMacro::Asm,
    template: vec![ast::InlineAsmTemplatePiece::String("NOP".into())],
    template_strs: Box::new([]),
    operands: vec![],
    clobber_abis: vec![],
    options: ast::InlineAsmOptions::PURE | ast::InlineAsmOptions::NOMEM,
    line_spans: vec![],
};
let noop_expr = ecx.expr_asm(span, P(noop));
let unsf = ast::BlockCheckMode::Unsafe(ast::UnsafeSource::CompilerGenerated);
let unsf_block = ast::Block {
    stmts: thin_vec![ecx.stmt_semi(noop_expr)],
    id: ast::DUMMY_NODE_ID,
    tokens: None,
    rules: unsf,
    span,
    could_be_bare_literal: false,
};
let unsf_expr = ecx.expr_block(P(unsf_block));
let blackbox_call_expr = ecx.expr_path(ecx.path(span, blackbox_path));
Member

@jieyouxu jieyouxu Oct 14, 2024

In theory, if this part is wrong, then we may generate a test that never terminates due to UB. I'm not sure exactly which test (suite) the job gets stuck on. It seems unlikely to be the crashes tests, because they don't enable the autodiff feature gates?

@jieyouxu
Member

jieyouxu commented Oct 14, 2024

> That doesn't seem great, since it fails in the post-optimization test suite. If this PR really did not change anything that is actually used, that might signify some miscompilation in PGO, or something sinister like that.

Do you have any "insider" infra info on which test suite this job gets stuck on? And do you know if any of the CI jobs have enzyme available (esp. dist-x86_64-msvc)?

@Kobzol
Member

Kobzol commented Oct 14, 2024

Hmm, I only have the same info as you, the log of the CI run. From that it's pretty clear where it is wrong - it's Testing stage0 compiletest suite=crashes mode=crashes, and specifically the test tests\crashes\23707.rs, which does not seem to terminate.

@jieyouxu
Member

jieyouxu commented Oct 14, 2024

That's the funny part. That crashes test looks like a trait system thing, which this PR doesn't really touch. Worst part is I can't repro the failure locally (it passes just fine).

@rust-timer

This comment was marked as resolved.

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Oct 14, 2024
@Kobzol
Member

Kobzol commented Oct 14, 2024

Might be interesting to see if this also happens on Linux. You'd need to disable DIST_TRY_BUILD in jobs.yml and then do a normal try build.

@jieyouxu
Member

Doing that over in #131685.

@jieyouxu
Member

Didn't fail in try job. So what if we try again. Surely changing nothing will lead to a different outcome.

@bors r+

@bors
Collaborator

bors commented Oct 15, 2024

💡 This pull request was already approved, no need to approve it again.

@bors
Collaborator

bors commented Oct 15, 2024

📌 Commit 7c37d2d has been approved by jieyouxu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 15, 2024
@bors
Collaborator

bors commented Oct 15, 2024

⌛ Testing commit 7c37d2d with merge 785c830...

@bors
Collaborator

bors commented Oct 15, 2024

☀️ Test successful - checks-actions
Approved by: jieyouxu
Pushing 785c830 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Oct 15, 2024
@bors bors merged commit 785c830 into rust-lang:master Oct 15, 2024
@rustbot rustbot added this to the 1.84.0 milestone Oct 15, 2024
@jieyouxu
Member

jieyouxu commented Oct 15, 2024

...? ok bors whatever u say

@rust-timer
Collaborator

Finished benchmarking commit (785c830): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -0.5% | [-0.8%, -0.2%] | 2 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary 1.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 4.0% | [2.2%, 6.6%] | 5 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -3.1% | [-5.6%, -1.9%] | 3 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary 0.5%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | 0.5% | [0.5%, 0.5%] | 1 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | 0.5% | [0.5%, 0.5%] | 1 |

Bootstrap: 781.98s -> 782.278s (0.04%)
Artifact size: 332.66 MiB -> 332.62 MiB (-0.01%)

Labels

F-autodiff `#![feature(autodiff)]` merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
