[Tuning] rel-4.5 fdb update for develop #1196
Is this from 4.5 tuning?
I think so, and the source branch name confirms it.
In performance testing right now...
fdb duplicate IDs for gfx906/60
Duplicate IDs for gfx906/60/HIP
It seems like the root cause of #1133 (comment) is not yet fixed.
Error messages
Examples of duplicates
256-28-28-1x1-64-56-56-8-0x0-2x2-1x1-0-NCHW-BF16-B=
128-28-28-1x1-512-28-28-8-0x0-1x1-1x1-0-NCHW-FP32-F=
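A check for this kind of duplication could be sketched as below. This is a minimal sketch, not the project's own tooling: it assumes each fdb record is a single text line of the form `key=id:values;id:values;...`, which is an assumption about the file layout. The function names and the payloads in `sample` are made up for illustration; only the two keys come from the examples above.

```python
from collections import Counter


def duplicate_ids(record_line):
    """Return (key, sorted list of IDs that occur more than once) for one record.

    Assumed layout (not confirmed by this thread): "key=id:values;id:values;..."
    """
    key, _, payload = record_line.strip().partition("=")
    # Keep only the part before the first ":" of every non-empty entry.
    ids = [entry.partition(":")[0] for entry in payload.split(";") if entry]
    counts = Counter(ids)
    return key, sorted(i for i, n in counts.items() if n > 1)


def scan(lines):
    """Map each record key to its duplicated IDs; clean records are omitted."""
    report = {}
    for line in lines:
        if "=" not in line:
            continue  # skip blank lines or anything that is not a record
        key, dups = duplicate_ids(line)
        if dups:
            report[key] = dups
    return report


# Hypothetical payloads; "algoA"/"algoB" and the values are placeholders.
sample = [
    "128-28-28-1x1-512-28-28-8-0x0-1x1-1x1-0-NCHW-FP32-F="
    "algoA:s1,0.1,0;algoB:s2,0.2,0;algoA:s3,0.3,0",
    "256-28-28-1x1-64-56-56-8-0x0-2x2-1x1-0-NCHW-BF16-B=algoA:s1,0.1,0",
]

if __name__ == "__main__":
    for key, dups in scan(sample).items():
        print(f"{key}: duplicate IDs {dups}")
```

Running `scan` over the lines of an fdb file before merging would flag every record that repeats an ID, provided the assumed layout holds.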
Missing fdb records for gfx906/60/HIP
gfx906_60.HIP.missing.fp32.txt
I suspect that FP32 and FP16 configs are required (not sure about BF16, though)
Performance testing results
🟢 These are good! ROCm 4.0, gfx906/60. Performance gain, based on sum of all times/directions:
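The thread does not spell out how the gain is computed from the summed times. One plausible reading, given here only as a hedged sketch, is the relative reduction in total time between the baseline and the tuned fdb. The function name `summed_gain` and the timings are hypothetical, not measurements from this PR.

```python
def summed_gain(baseline_ms, candidate_ms):
    """Percentage reduction in total time, candidate vs. baseline.

    Assumed formula (not stated in the thread):
        gain = (sum(baseline) - sum(candidate)) / sum(baseline) * 100
    A positive value means the candidate is faster overall.
    """
    base_total = sum(baseline_ms)
    cand_total = sum(candidate_ms)
    if base_total <= 0:
        raise ValueError("baseline total time must be positive")
    return (base_total - cand_total) / base_total * 100.0


if __name__ == "__main__":
    # Made-up timings for three configs, all directions already summed per config.
    baseline = [1.0, 2.0, 1.0]
    candidate = [0.5, 2.0, 0.5]
    print(f"gain: {summed_gain(baseline, candidate):.1f}%")
```

Summing before dividing weights slow configs more heavily than averaging per-config ratios would, which matches the "sum of all times" wording.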
Verdict
I tested this only on gfx906/60. Assuming other GPUs perform similarly, I think this can be merged provided that "Duplicate ID" issues are resolved.
Thanks Artem, I'll start looking into the source of the duplication.
fdb entries no longer contain duplicate algorithms.
Please re-review. fdb entries no longer contain duplicate algorithms.
Testing on MI100 is ongoing.
@atamazov @JehandadKhan
MI100 testing results (ROCm 4.3.1)
🟡 Looks acceptable. Develop vs cderb/fdb_tuning_dev-4.5, tested with MI100 & ROCm 4.3.1. Performance gain, based on sum of all times/directions:
LGTM!
No description provided.