torch.compile based model optimizer #6377
Conversation
Force-pushed 79d604d to 66f0159
@bnellnm I'd like to ask: I didn't see any pattern-matching related code in the PR. Did I miss it?
I have a small question. The operator has already been registered with torch in torch_binding.cpp above, so why does it need to be registered again here?
This version of silu_mul_quant includes allocating the output/tmp buffers. The function names here don't really mean anything since we will be going by the torch op name (but probably should be changed to avoid confusion).
We don't use pytorch's pattern matching routines. The code in fusion.py searches for and combines fusable operations.
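For context, a minimal sketch of what a buffer-allocating wrapper registered under its own torch op name could look like, assuming a hypothetical `fused_ops` namespace and vLLM's existing `_C` kernels; this is illustrative, not the PR's actual code.

```python
import torch
from torch.library import Library

# Illustrative sketch only: a Python-level silu_mul_quant that allocates its
# own output/tmp buffers and then dispatches to already-registered C++ kernels.
# The "fused_ops" namespace and exact signatures are assumptions.

def silu_mul_quant_impl(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    d = x.shape[-1] // 2
    tmp = torch.empty(x.shape[:-1] + (d, ), dtype=x.dtype, device=x.device)
    out = torch.empty(x.shape[:-1] + (d, ),
                      dtype=torch.float8_e4m3fn,
                      device=x.device)
    # Assumed in-place C++ ops (output-first convention); adjust if the real
    # kernel signatures differ.
    torch.ops._C.silu_and_mul(tmp, x)
    torch.ops._C.static_scaled_fp8_quant(out, tmp, scale)
    return out

# Registering the wrapper as its own torch op lets later passes match on the
# torch op name rather than on the Python function name.
fused_lib = Library("fused_ops", "FRAGMENT")
fused_lib.define("silu_mul_quant(Tensor x, Tensor scale) -> Tensor")
fused_lib.impl("silu_mul_quant", silu_mul_quant_impl, "CUDA")
```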
I think this should be

    if isinstance(n, float):
        ty = "float"
    elif isinstance(n, int):
        ty = "int"
    elif n.type is not None:
        ...

so that the proper C++/Python type is returned.
I think the signature of this function should remain as it was before so that we could do a proper job of generating a meta function at some point in the future. The compose stuff doesn't really work for non-unary meta functions.
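As a point of reference, a hand-written meta ("fake") implementation for a non-unary op might look like the sketch below; the op name, namespace, and output dtype are assumptions for illustration (it reuses the hypothetical `fused_ops::silu_mul_quant` schema from the earlier sketch), not the functions from this PR.

```python
import torch
from torch.library import Library

# Hypothetical example: an explicit meta function for a two-argument op,
# written by hand rather than composed from unary shape functions.
# Only shapes, dtypes, and devices matter on the "Meta" dispatch key.

def silu_mul_quant_meta(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    d = x.shape[-1] // 2
    return torch.empty(x.shape[:-1] + (d, ),
                       dtype=torch.float8_e4m3fn,
                       device=x.device)

# Assumes the "fused_ops::silu_mul_quant" schema has already been defined
# (see the earlier registration sketch).
fused_lib = Library("fused_ops", "FRAGMENT")
fused_lib.impl("silu_mul_quant", silu_mul_quant_meta, "Meta")
```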
Force-pushed 9a9af98 to 962d9bb
Force-pushed e2fa8b8 to 355436f
@bnellnm Is this feature still ongoing?
It is, although this PR will probably be superseded by others.
Closing as superseded.
This PR implements the first cut of the optimizer described here: [RFC] A Graph Optimization System in vLLM using torch.compile
Issue: #6378
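To give a rough picture of the kind of pass fusion.py is described as performing (searching for and combining fusable operations), here is a minimal, hypothetical sketch over an FX graph; the `fused_ops` op handles and their `(input, scale)` signatures are stand-ins, not the operators this PR actually rewrites.

```python
import torch
import torch.fx as fx

# Hypothetical sketch of a fusion pass in the spirit of fusion.py: find a
# quant node fed directly by a silu_mul node and splice in one fused op.
# The ops fused_ops.silu_mul, fused_ops.quant and fused_ops.silu_mul_quant
# are illustrative stand-ins with assumed (input, scale) signatures.

def fuse_silu_mul_quant(gm: fx.GraphModule) -> fx.GraphModule:
    graph = gm.graph
    for quant_node in list(graph.nodes):
        if (quant_node.op != "call_function"
                or quant_node.target is not torch.ops.fused_ops.quant.default):
            continue
        act_node, scale = quant_node.args
        if not (isinstance(act_node, fx.Node)
                and act_node.op == "call_function"
                and act_node.target is torch.ops.fused_ops.silu_mul.default
                and len(act_node.users) == 1):
            continue  # only fuse when the activation feeds the quant alone
        with graph.inserting_before(quant_node):
            fused = graph.call_function(
                torch.ops.fused_ops.silu_mul_quant.default,
                (act_node.args[0], scale))
        quant_node.replace_all_uses_with(fused)
        graph.erase_node(quant_node)
        graph.erase_node(act_node)
    graph.lint()
    gm.recompile()
    return gm
```

A pass like this could be run on the graphs produced by torch.compile, for example from a custom backend or one of Inductor's custom pass hooks, though the exact integration point used by this PR is described in the RFC rather than shown here.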