Don't let dynamo inline inside of NF4 constructors or __torch_dispatch__ #544
This fixes a few known "gotchas" for `NF4Tensor` when getting it to work with compile. Enumerating:

(1) The `NF4Tensor()` constructor needs to be wrapped in a function annotated with `torch._dynamo.allow_in_graph` (a sketch of this pattern follows below). We want to make this work more automatically soon; see the tracking issue: pytorch/pytorch#114389.
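For (1), here is a minimal sketch of the wrapper pattern, not the exact code in this PR: the helper name `build_nf4` is invented for illustration, and it assumes torchao's `NF4Tensor.from_tensor` classmethod with `block_size` / `scaler_block_size` arguments.

```python
import torch
from torchao.dtypes.nf4tensor import NF4Tensor

# Hypothetical wrapper. allow_in_graph tells dynamo to record this call
# as a single opaque node in the graph instead of inlining the subclass
# constructor's Python logic.
@torch._dynamo.allow_in_graph
def build_nf4(tensor, block_size: int = 64, scaler_block_size: int = 256):
    return NF4Tensor.from_tensor(tensor, block_size, scaler_block_size)
```

Call sites then go through `build_nf4(weight)` instead of constructing the subclass directly, so the constructor never shows up as a frame dynamo wants to trace into.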
(2) `NF4Tensor.__new__` and `NF4Tensor.__torch_dispatch__` both need to be marked with `torch._dynamo.disable` (second sketch below).
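For (2), a sketch of where the `torch._dynamo.disable` markers go, shown on a deliberately stripped-down, hypothetical wrapper subclass rather than the real `NF4Tensor` (which also carries quantization state):

```python
import torch

class WrapperTensor(torch.Tensor):
    # Hypothetical minimal wrapper subclass, just to show where the
    # torch._dynamo.disable markers go; not torchao's NF4Tensor.

    @torch._dynamo.disable
    def __new__(cls, elem):
        # Dynamo must not install a frame here, so tracing is disabled.
        return torch.Tensor._make_wrapper_subclass(
            cls, elem.shape, dtype=elem.dtype, device=elem.device
        )

    def __init__(self, elem):
        self.elem = elem

    @classmethod
    @torch._dynamo.disable
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # Unwrap positional args, run the op on the inner tensors, and
        # rewrap tensor outputs. (Tensor kwargs are ignored for brevity;
        # a real subclass would unwrap those too.)
        unwrapped = [a.elem if isinstance(a, cls) else a for a in args]
        out = func(*unwrapped, **kwargs)
        return cls(out) if isinstance(out, torch.Tensor) else out
```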
There isn't really a great fix for (2) today. One thing to note, though, is that as long as your model doesn't have a million graph breaks, you are unlikely to run into problems with (2). In the torchtune repro, there were a lot of deeply nested graph breaks in the FSDP2 code that caused dynamo to try to, e.g., install itself directly onto the `NF4Tensor.__new__` frame, which causes problems.
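Relatedly, a quick way to gauge whether a model is in "a million graph breaks" territory is `torch._dynamo.explain`, which reports every graph break dynamo hits. A small sketch with a toy function (the names here are placeholders):

```python
import torch

def report_graph_breaks(fn, *args):
    # Runs fn once under dynamo and summarizes the graph breaks
    # encountered while tracing it.
    explanation = torch._dynamo.explain(fn)(*args)
    print(f"graph breaks: {explanation.graph_break_count}")
    for reason in explanation.break_reasons:
        print(reason)

# Toy function that forces a graph break via print().
def toy(x):
    y = x.sin()
    print("side effect")  # dynamo graph-breaks on builtin print
    return y.cos()

report_graph_breaks(toy, torch.randn(4))
```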