This PR supports the use case where the 0th dimension of `input` can be dynamic. I fixed a couple of things in this PR:

1. Currently we remove the symints from the `args` by filtering on the tensor type. This is problematic because the FX graph actually expects symints to be part of its inputs, so this will likely hit edge cases where the order/number of args differ from what FX expects.
2. We currently cache each graph by all input shapes. I took a profile, and this can be expensive (~1 ms for a 16-layer decoder). Instead, PyTorch actually passes us the dynamic dimensions as inputs, so we can cache on those instead. If there is no dynamic dimension, we cache on an empty tuple, which is cheap.
3. With `dynamic=False` and `mark_dynamic`, we can no longer easily tell whether the current graph will be dynamic or not. Since, with 2. above, the caching is very cheap, I do the `graph_var` lookup for every graph instead of only for dynamic graphs.
4. We cannot run the partitioner when there are symints as inputs, because the partitioner will put the symint and its return into a separate graph, which messes up the dynamic-dimension lookup. For now I just skip the partitioner when there is a symint input.