[testing] Enable generating cached native code by default #2350

Draft · wants to merge 3 commits into base: main
Conversation

alexbaden (Contributor)

Checking for functional correctness. We need to decide whether this is something we're willing to do, or whether we want PyTorch to pass the cached-native-code flag for AOT Inductor. I'd also like to wait to land this until they have done some testing between systems, since there is some concern about native-code portability (though I believe the SPIR-V is also bundled as part of the native code package).
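To illustrate the portability concern, here is a minimal sketch of a cache entry that bundles portable SPIR-V alongside the device-specific native binary, so a consumer on a different architecture can fall back to the SPIR-V. The `CachedKernel` type and its fields are hypothetical, not Triton's actual cache format.

```python
# Hypothetical sketch (not Triton's real cache format): keep SPIR-V next to
# the cached native code so a mismatched device can fall back to it.
from dataclasses import dataclass

@dataclass
class CachedKernel:
    name: str
    device_arch: str      # architecture the native code was compiled for
    native_binary: bytes  # device-specific code; may not be portable
    spirv: bytes          # portable fallback representation

    def binary_for(self, current_arch: str) -> tuple[bytes, bool]:
        """Return (payload, is_native): the native binary when the
        architectures match, otherwise the SPIR-V fallback."""
        if current_arch == self.device_arch:
            return self.native_binary, True
        return self.spirv, False

kernel = CachedKernel("add_kernel", "pvc", b"NATIVE", b"\x03\x02#\x07")
payload, is_native = kernel.binary_for("other_arch")
assert not is_native and payload == kernel.spirv
```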

Also, we will likely need better support for detecting the device architecture so that we can pass the correct `-device` flag to ocloc.
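One way to sketch that detection: map the runtime-reported GPU name to an ocloc `-device` argument before building the compile command. The name-to-architecture table below is illustrative only (it is not a complete or authoritative list of ocloc device names), and the helper functions are hypothetical.

```python
# Hypothetical sketch: derive the ocloc "-device" argument from a
# runtime-reported GPU name. The mapping table is illustrative, not a
# complete or authoritative list of ocloc device identifiers.
_DEVICE_MAP = {
    "Intel(R) Data Center GPU Max": "pvc",
    "Intel(R) Arc(TM) A770 Graphics": "dg2",
}

def ocloc_device_flag(gpu_name: str) -> list[str]:
    """Return ["-device", <arch>] when we recognize the GPU, else []."""
    for prefix, arch in _DEVICE_MAP.items():
        if gpu_name.startswith(prefix):
            return ["-device", arch]
    return []  # unknown device: let ocloc use its default behavior

def ocloc_compile_cmd(spv_path: str, gpu_name: str) -> list[str]:
    # Build (but do not run) an ocloc command line for a SPIR-V input.
    return ["ocloc", "compile", "-file", spv_path, "-spirv_input",
            *ocloc_device_flag(gpu_name)]

cmd = ocloc_compile_cmd("kernel.spv", "Intel(R) Data Center GPU Max 1550")
assert cmd[-2:] == ["-device", "pvc"]
```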

Closes #1792

This is a bit of a hack, but I did not want to pollute `build_flags` with something that is not a real build flag, and there is no other way to pass parameters to `loadBinary`. We could probably duplicate `loadBinary`; I am considering that too, but for now the hack should let the tests run.
Development

Successfully merging this pull request may close these issues.

[Research][PyTorch 2.6] Save compiled triton kernel as device binary code