Differentiation functions with Box<dyn Trait> args fails #193

Open
motabbara opened this issue Jan 18, 2025 · 6 comments

@motabbara

Please see https://fwd.gymni.ch/eTJnUQ

Fails with "Attempting to call an indirect active function whose runtime value is inactive".

#![feature(autodiff)]
use std::autodiff::autodiff;
use std::fmt;

#[derive(Debug)]
struct Foo {
    pub test: f64
}

pub trait Cool: fmt::Debug {
    fn gen(&self) -> f64;
}

impl Cool for Foo {
    fn gen(&self) -> f64 {
        self.test * self.test
    }
}


#[autodiff(dsquare, Reverse, Duplicated, Duplicated)]
pub fn square(num: &Foo, result: &mut f64) {
    *result = num.gen()
}

#[autodiff(dsquare2, Reverse, Duplicated, Duplicated)]
pub fn square2(num: &Box<dyn Cool>, result: &mut f64) {
    *result = num.gen()
}

Incidentally, generic functions fail to differentiate even without the Box, e.g.:

#[autodiff(dsquare3, Reverse, Duplicated, Duplicated)]
pub fn square3<U: Cool>(num: &U, result: &mut f64) {
    *result = num.gen()
}
@motabbara
Author

@ZuseZ4, any recommendations on where in the codebase to look to examine calling trait methods through Box? Happy to try something myself with some pointers.

@ZuseZ4
Member

ZuseZ4 commented Jan 20, 2025

I'm currently traveling, but I'll be back at my laptop on the 23rd; then I can look more closely at the runtime inactivity. In the meantime, if you have a local build, can you run cargo expand (with your local toolchain selected via +<toolchain>) and post the autodiff macro expansions? Otherwise there might be flags to get the output from the explorer.

Support for generics should be easy to add; we had it in an earlier implementation. You need to adjust the frontend to not error on generics, and adjust the autodiff function body to call the generic primal function. I will look up the two locations in my frontend PR that you'd need to modify for that.
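
Until the frontend accepts generics, one possible workaround is to differentiate a concrete wrapper that instantiates the generic function. The sketch below is an assumption and has not been checked against the toolchain discussed in this thread: the wrapper name square3_foo is hypothetical, the autodiff attribute is left as a comment so the snippet builds on a stock compiler, and the trait method is called eval rather than gen because gen is a reserved keyword in the 2024 edition.

```rust
pub trait Cool {
    fn eval(&self) -> f64;
}

pub struct Foo {
    pub test: f64,
}

impl Cool for Foo {
    fn eval(&self) -> f64 {
        self.test * self.test
    }
}

// Generic primal function, as in the report (currently rejected by the frontend
// when the autodiff attribute is applied to it directly).
pub fn square3<U: Cool>(num: &U, result: &mut f64) {
    *result = num.eval();
}

// Hypothetical concrete wrapper. On an Enzyme-enabled toolchain the attribute
// would go here instead of on the generic function:
// #[autodiff(dsquare3_foo, Reverse, Duplicated, Duplicated)]
pub fn square3_foo(num: &Foo, result: &mut f64) {
    // Instantiate the generic with a concrete type so no generics
    // remain in the signature that the macro sees.
    square3::<Foo>(num, result)
}
```

Calling square3_foo(&Foo { test: 3.0 }, &mut r) writes the square of the field into r; whether the derivative generated for such a wrapper is correct would still need to be verified with the actual autodiff build.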

@motabbara
Author

Here it is:

#![feature(prelude_import)]
#![feature(autodiff)]
#[prelude_import]
use std::prelude::rust_2021::*;
#[macro_use]
extern crate std;
use std::autodiff::autodiff;
use std::fmt;
struct Foo {
    pub test: f64,
}
#[automatically_derived]
impl ::core::fmt::Debug for Foo {
    #[inline]
    fn fmt(&self, f: &mut ::core::fmt::Formatter) -> ::core::fmt::Result {
        ::core::fmt::Formatter::debug_struct_field1_finish(f, "Foo", "test", &&self.test)
    }
}
pub trait Cool: fmt::Debug {
    fn gen(&self) -> f64;
}
impl Cool for Foo {
    fn gen(&self) -> f64 {
        self.test * self.test
    }
}
#[rustc_autodiff]
#[inline(never)]
pub fn square(num: &Foo, result: &mut f64) {
    *result = num.gen();
}
#[rustc_autodiff(Reverse, Duplicated, Duplicated, None)]
#[inline(never)]
pub fn dsquare(num: &Foo, dnum: &mut Foo, result: &mut f64, dresult: &mut f64) {
    unsafe {
        asm!("NOP", options(pure, nomem));
    };
    ::core::hint::black_box(square(num, result));
    ::core::hint::black_box((dnum, dresult));
}
#[rustc_autodiff]
#[inline(never)]
pub fn square2(num: &Box<dyn Cool>, result: &mut f64) {
    *result = num.gen();
}
#[rustc_autodiff(Reverse, Duplicated, Duplicated, None)]
#[inline(never)]
pub fn dsquare2(
    num: &Box<dyn Cool>,
    dnum: &mut Box<dyn Cool>,
    result: &mut f64,
    dresult: &mut f64,
) {
    unsafe {
        asm!("NOP", options(pure, nomem));
    };
    ::core::hint::black_box(square2(num, result));
    ::core::hint::black_box((dnum, dresult));
}
fn main() {
    for i in 0..5 {
        let mut d_foo = Foo { test: 0.0 };
        let f = Foo { test: i as f64 };
        let mut c = 0.0;
        let mut d_c = 1.0;
        let r = dsquare(&f, &mut d_foo, &mut c, &mut d_c);
        {
            ::std::io::_print(format_args!("d_foo {0:?}\n", d_foo));
        };
        let mut d_foo: Box<dyn Cool> = Box::new(Foo { test: 0.0 });
        let f: Box<dyn Cool> = Box::new(Foo { test: i as f64 });
        let r = dsquare2(&f, &mut d_foo, &mut c, &mut d_c);
        {
            ::std::io::_print(format_args!("d_foo {0:?}\n", d_foo));
        };
    }
}

@ZuseZ4
Member

ZuseZ4 commented Feb 6, 2025

@wsmoses Any suggestions?

@KMJ-007

KMJ-007 commented Apr 4, 2025

I’ve been looking into this Box<dyn Trait> differentiation failure - it’s an interesting one. The error - "Attempting to call an indirect active function whose runtime value is inactive" - suggests something’s off with how autodiff handles dynamic dispatch. For square with a concrete Foo, the macro can resolve num.gen() statically - no problem there. But with square2 and Box<dyn Cool>, it’s all runtime vtable lookups - maybe that’s where the macro or runtime loses track of the computation graph.

Looking at the expanded code - dsquare2 calls square2 via black_box, but I’m wondering if the trait object’s indirection breaks the "active" state tracking needed for gradients. Is this a limitation in the frontend where the macro processes dyn Trait? Or is it lower down - maybe in the backend with how rustc_autodiff interacts with Rust/LLVM for these cases?

@ZuseZ4 - Any suggestions on where in the codebase to start investigating trait object support? I’m guessing either the macro expansion logic - or wherever runtime activity is managed - but I’m not sure where to focus. If there’s a simple test I could run - like bypassing dynamic dispatch to narrow it down - I’d be happy to try it with some guidance. Also, you noted generics needing frontend tweaks - would trait objects follow a similar fix?

I’d like to dig deeper into this. Thanks for any pointers you can share!
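
One way to run the "bypass dynamic dispatch" experiment mentioned above is to replace the trait object with an enum, so the call is resolved by a match instead of a vtable lookup. This is only a sketch under assumptions: AnyCool, square_enum and central_difference are hypothetical names, the method is called eval instead of gen (gen is reserved in the 2024 edition), and the autodiff attribute is shown as a comment so the code builds without the Enzyme toolchain. The finite-difference helper gives a reference value to compare a generated gradient against.

```rust
pub trait Cool {
    fn eval(&self) -> f64;
}

#[derive(Debug)]
pub struct Foo {
    pub test: f64,
}

impl Cool for Foo {
    fn eval(&self) -> f64 {
        self.test * self.test
    }
}

// Hypothetical closed set of implementors: dispatch happens through a match,
// so the compiled code contains no indirect (vtable) call.
#[derive(Debug)]
pub enum AnyCool {
    Foo(Foo),
}

impl Cool for AnyCool {
    fn eval(&self) -> f64 {
        match self {
            AnyCool::Foo(f) => f.eval(),
        }
    }
}

// On an Enzyme-enabled toolchain the attribute would be applied here:
// #[autodiff(dsquare_enum, Reverse, Duplicated, Duplicated)]
pub fn square_enum(num: &AnyCool, result: &mut f64) {
    *result = num.eval();
}

// Central finite difference of square_enum with respect to the `test` field,
// usable as a reference value for the derivative.
pub fn central_difference(x: f64, h: f64) -> f64 {
    let mut hi = 0.0;
    let mut lo = 0.0;
    square_enum(&AnyCool::Foo(Foo { test: x + h }), &mut hi);
    square_enum(&AnyCool::Foo(Foo { test: x - h }), &mut lo);
    (hi - lo) / (2.0 * h)
}
```

If the enum version differentiates while the Box<dyn Cool> version still fails, that would point at the indirect call rather than at the trait itself.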

@ZuseZ4
Member

ZuseZ4 commented Apr 7, 2025

@KMJ-007 The error message comes from Enzyme:

➜  enzyme git:(a35f4f7) rg "Attempting to call an indirect active"                                         
Enzyme/AdjointGenerator.h
4938:            "Attempting to call an indirect active function "
5279:              "Attempting to call an indirect active function "

but I don't see an explanation on the website (enzyme.mit.edu), or in the code next to it.
You can generally verify if something is caused by Enzyme by generating an LLVM reproducer as described here: https://enzyme.mit.edu/rust/debug_backend.html#reporting-backend-crashes

I just remembered that a lot of people were confused about this in the past, and I did find a few issues from users (including me, lol) asking about it. If you learn anything about it, please make a PR against github.com/EnzymeAD/rustbook to update our docs here, either under chapter 4 or 12. Even if we don't find a full solution right away, we'd want to spare the next person who spends time on it from starting from zero.

To figure out how to use it, I'd recommend starting by lowering a reproducer (like the one in the first post) to LLVM-IR, and reproducing the error through opt first. Once you've managed that, you can try to manually rewrite the LLVM-IR to include the virtualreverse thing, and see if that fixes anything. Maybe EnzymeAD/Enzyme also has test cases using it, which could help with understanding its usage. If you manage to get anything to work (or are stuck), just ping me, and we can go backwards from there, trying to generate the right code from Rust to automate what you did by hand.
If you notice that you can't find some needed Rust code in the LLVM-IR, you can try wrapping Rust variables in std::hint::black_box(); this way Rust and LLVM shouldn't optimize them away, and you can use them when manually experimenting with the LLVM-IR. You can also use extern "Rust" (or "C") if you want to see how a declaration gets lowered to LLVM-IR (or you can copy the __enzyme_autodiff declarations, which should already exist in the module).

EnzymeAD/Enzyme#316
EnzymeAD/Enzyme#1455
EnzymeAD/Enzyme#929
EnzymeAD/Enzyme#891
EnzymeAD/Enzyme#737
EnzymeAD/Enzyme#2178
(Not sure if related:) https://enzyme.mit.edu/julia/stable/faq/#Runtime-Activity
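
The black_box suggestion above could look roughly like this when preparing a reproducer for manual LLVM-IR experiments. It is a minimal sketch, not the workflow from the Enzyme docs: run_repro is a hypothetical helper, the method is named eval instead of gen (reserved in the 2024 edition), and the autodiff attribute is omitted so the file also builds with a stock compiler. The IR itself can be emitted with rustc's --emit=llvm-ir option.

```rust
use std::hint::black_box;

pub trait Cool {
    fn eval(&self) -> f64;
}

pub struct Foo {
    pub test: f64,
}

impl Cool for Foo {
    fn eval(&self) -> f64 {
        self.test * self.test
    }
}

// inline(never) keeps the function as a separate definition in the emitted IR,
// so the indirect call through the vtable stays easy to locate.
#[inline(never)]
pub fn square2(num: &Box<dyn Cool>, result: &mut f64) {
    *result = num.eval();
}

// Hypothetical driver: black_box hides the concrete type behind the Box from
// the optimizer, which should discourage devirtualizing or folding the call.
pub fn run_repro(x: f64) -> f64 {
    let f: Box<dyn Cool> = Box::new(Foo { test: x });
    let f = black_box(f);
    let mut out = 0.0;
    square2(&f, &mut out);
    // Also keep the result observable so the computation is not removed.
    black_box(out)
}
```

black_box is only a best-effort hint to the optimizer, so it is worth confirming in the emitted IR that the vtable call is actually present before rewriting anything by hand.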
