-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bug: (libxsmm) fix the little matmul use case #3956
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3956 +/- ##
=======================================
Coverage 91.31% 91.31%
=======================================
Files 468 468
Lines 58144 58144
Branches 5582 5580 -2
=======================================
+ Hits 53094 53096 +2
+ Misses 3623 3622 -1
+ Partials 1427 1426 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how to review this as (apart from the multiple issues), I have no way of verifying (easily) the above.
There are two aspects:
- Maybe it is time to invest in an "experiments repo"; I can see this being easily called in from a
main()
stub that provides the inputs. Moreover, I have a machine here that can run those if needed. - I can't pretend that I understand the assembly code's performance, just by reading it on a PR, although it is still easily parsable by a human
Based on these take my "Approval" at face value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good to me
Fully agree on your idea to enhance the testing framework. Do you think this "validation" step should be performed in a separate repo or in a subdirectory of xdsl/tests ? |
I don't think there is a perfect solution, but we've had success with a separate repo when using a setup that has quite hefty dependencies. I think the case here is much simpler, although not without overhead (syncing the repos, etc.), but such is life :-) |
The use case contained 4 bugs according to computing a matmul:
1- Bad order of vfmadd parameters (AT&T syntax instead of Intel syntax)
2- vfmadd231pd instead of vfmadd231ps (double precision instead of single precision)
3- Incosistency between B dimensions (4x2) and C dimensions (4x4)
4- (performance bug, avoidable dependency) Load all B columns in the same register