Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Report exception issue on nested codegen #619

Merged
merged 1 commit into from
Aug 26, 2024
Merged

Conversation

wsmoses
Copy link
Contributor

@wsmoses wsmoses commented Aug 26, 2024

No description provided.

Copy link
Member

@maleadt maleadt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That shouldn't be happening, and isn't a correct fix I think.

The problem with gc_alloc is that it gets introduced by optimization passes after linking the runtime, because optimization happens late. In deferred compilation mode (toplevel=false -> libraries=false && optimize=false), we don't link libraries or optimize, so the only library link step should happen at a point after which no new runtime calls can be introduced. Well, with the exception of gc_alloc, but definitely not report_exception.

Adding another exception is not safe, and will violate other assumptions in the compiler driver. For example, the kernel state rewrite pass often changes the arguments to the runtime functions, after they have been linked. Enzyme shouldn't be using deferred compilation mode in a way that doesn't match how it's been designed here (the other known error is linking optimized IR, which I've already mentioned before).

@wsmoses
Copy link
Contributor Author

wsmoses commented Aug 26, 2024

recopying from slack to avoid the slackhole in the future:

so either it feels like there's then a fundemntal phase ordering problem, or I very much misunderstand something
so my impression of how things work is

  1. CUDA.jl emits cuda LLVM, running linking and other things in entirety [tho we manage to disable optimization]
  2. Enzyme gets this IR, at this point the defn of report exception is already there as far as i can tell
    we then do a pass rewriting the code [more specifically adding a new fn for the derivative]
    we do need to call report exception, since there could be an instruction we dont know how to handle, and we can't do a jl_throw on a cuda device

its this point at which we hit the issue above
[which was working successfully across cuda.jl/enzyme.jl tests until the new gpucompiler came out with the error message -- so idk how we ought catch this in the future since idk if gpucompiler.jl runs the cuda+_enzyme extension tests in its downstream ci]
i dont really care how we manage to emit exceptions from enzyme, so far as we have a way to do so
which my understanding of the phase ordering rn is that enzyme runs fully after cuda, including all the runtime linking, etc

@maleadt
Copy link
Member

maleadt commented Aug 26, 2024

From discussion on Slack: Enzyme is incorrectly using deferred compilation mode (not driving it within a single compilation job, which it expects), however, this affects user' code so let's merge this hack until there's a better fix in Enzyme.

@maleadt maleadt merged commit 6ed7594 into JuliaGPU:master Aug 26, 2024
1 check passed
@wsmoses wsmoses deleted the rtexc branch August 26, 2024 07:01
@maleadt maleadt mentioned this pull request Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants