MHA fusion cleanup #2481
Signed-off-by: Ganesan Ramalingam <grama@microsoft.com>
❌ 9 Tests Failed
Pull Request Overview
This PR cleans up the MHA (Multi-Head Attention) fusion rules by simplifying pattern variations and improving scale attribute handling. The main improvements are the removal of redundant rule variations, proper scale attribute handling through the new `AttrVar` pattern, and the addition of common subexpression elimination.
Key changes:
- Eliminates redundant MHA fusion rule variations by removing the `transpose_4d` and `pre_scale_q` parameters
- Introduces the `AttrVar` pattern with `can_match_none` support for optional attributes like scale
- Fixes scale attribute handling across MHA fusion patterns
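
To illustrate the idea behind an attribute pattern variable that tolerates a missing attribute, here is a minimal, self-contained sketch. It is not the onnxscript implementation: the class `AttrVarSketch`, its `match` method, and the dictionary-based node attributes are all hypothetical names chosen for illustration. Only the notion of a `can_match_none` flag comes from the PR description.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class AttrVarSketch:
    """Hypothetical stand-in for an attribute pattern variable (not onnxscript's API)."""

    name: str
    can_match_none: bool = False

    def match(self, attrs: Dict[str, Any], key: str, bindings: Dict[str, Any]) -> bool:
        """Bind the attribute `key` of a node to this variable's name.

        If the attribute is absent, the match succeeds only when
        can_match_none is True, and the variable is bound to None.
        """
        value = attrs.get(key)
        if value is None and not self.can_match_none:
            return False
        bindings[self.name] = value
        return True


# A node that sets scale explicitly, and one that omits it.
with_scale = {"scale": 0.125}
without_scale: Dict[str, Any] = {}

optional_var = AttrVarSketch("scale", can_match_none=True)
strict_var = AttrVarSketch("scale")

explicit_bindings: Dict[str, Any] = {}
explicit_ok = optional_var.match(with_scale, "scale", explicit_bindings)

missing_bindings: Dict[str, Any] = {}
missing_ok = optional_var.match(without_scale, "scale", missing_bindings)

strict_bindings: Dict[str, Any] = {}
strict_ok = strict_var.match(without_scale, "scale", strict_bindings)
```

With a flag like this, a single rule can cover both the case where scale is given and the case where it is omitted. A rewrite step can then check whether the bound value is `None` and, if so, leave the attribute unset on the fused node.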
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
onnxscript/rewriter/pattern.py | Exports new AttrVar pattern for use in fusion rules |
onnxscript/rewriter/_pattern_ir.py | Implements AttrVar pattern with can_match_none support for optional attributes |
onnxscript/rewriter/ort_fusions/mha.py | Removes redundant rule variations and simplifies MHA pattern generation |
onnxscript/rewriter/ort_fusions/mha_bias.py | Updates to use AttrVar for scale attribute handling |
onnxscript/rewriter/ort_fusions/attention.py | Extracts scale from matched nodes and uses named outputs |
onnxscript/rewriter/ort_fusions/_core.py | Adds common subexpression elimination pass |
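
The common subexpression elimination mentioned for `_core.py` can be pictured with a toy example. The sketch below is an assumption-laden illustration, not the pass used in onnxscript: the tuple-based node representation and the function `eliminate_common_subexpressions` are hypothetical. It assumes nodes are listed in topological order, each node has one output, and graph outputs do not need renaming.

```python
from typing import Dict, List, Tuple

# Toy node: (output_name, op_type, input_names, attributes as sorted key/value pairs)
Node = Tuple[str, str, Tuple[str, ...], Tuple[Tuple[str, object], ...]]


def eliminate_common_subexpressions(nodes: List[Node]) -> List[Node]:
    """Drop nodes that recompute a value already produced by an earlier node."""
    seen: Dict[tuple, str] = {}     # (op, inputs, attrs) -> name of first output
    renamed: Dict[str, str] = {}    # removed output name -> surviving output name
    result: List[Node] = []
    for out, op, inputs, attrs in nodes:
        # Redirect inputs that referred to an already-removed duplicate.
        inputs = tuple(renamed.get(name, name) for name in inputs)
        key = (op, inputs, attrs)
        if key in seen:
            renamed[out] = seen[key]
        else:
            seen[key] = out
            result.append((out, op, inputs, attrs))
    return result


perm = (("perm", (0, 2, 1, 3)),)
graph: List[Node] = [
    ("t1", "Transpose", ("k",), perm),
    ("t2", "Transpose", ("k",), perm),   # duplicate of t1
    ("a", "MatMul", ("q", "t1"), ()),
    ("b", "MatMul", ("q", "t2"), ()),    # duplicate of a once t2 is merged into t1
]

deduplicated = eliminate_common_subexpressions(graph)
```

In this toy graph the two `Transpose` nodes collapse into one, which in turn makes the two `MatMul` nodes identical. A plausible benefit for fusion is that a pattern expecting one shared value is more likely to match once such duplicates are gone.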
…cript into rama/mha-cleanup