-
Notifications
You must be signed in to change notification settings - Fork 666
[ROCm] more hipify v2 fixes #4854
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ Deploy Preview for pytorch-fbgemm-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Hi @jeffdaily could you resolve the branch conflicts? Otherwise I think the PR looks good for landing |
@q10 done. |
#include "kernels/fp8_rowwise_grouped_kernel_manifest.h" | ||
#include "kernels/fp8_rowwise_grouped_heuristic.hpp" | ||
#include "fbgemm_gpu/quantize/tuning_cache.hpp" | ||
#include "fbgemm_gpu/quantize/tuning_cache_hip.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffdaily This is breaking internal builds
fatal error: 'fbgemm_gpu/quantize/tuning_cache_hip.hpp' file not found
26 | #include "fbgemm_gpu/quantize/tuning_cache_hip.hpp"
I think the internal hipification doesn't require updating the filepath.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I build locally this file does exist. It is created by the hipify step during cmake. Does your internal build run the same hipify step that cmake does? The point here is that the tuning_cache.hpp file is hipified to a new file with _hip suffix and the contents of that hipified file are correct instead of needing the clunky manual #ifdefs that were in the original header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffdaily I think I understand the internal situation now.
The internal build system only hipifies files with .cu or .cuh file extension, and since fbgemm_gpu/quantize/tuning_cache.hpp is just a regular C++ file with no CUDA syntax, it does not hipify it. This is probably why we had to rely on the ifdefs in the first place.
So there appears to be two possible solutions:
- Keep it as .hpp file, along with the ifdefs.
- Rename the file with .cuh extension, and update the sources that #include this file accordingly.
I'm currently verifying the second solution with the internal CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. How did that internal test go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffdaily Option 2 appears to work with the internal CI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jeffdaily hmm, so even though option 2 works with the internal CI, it breaks in OSS, bc the renamed file, tuning_cache.cuh, isn't HIPified - see build logs in https://github.com/pytorch/FBGEMM/actions/runs/17969709031/job/51109221018?pr=4921
This means we would have to revert back to using #ifdef USE_ROCM at least within that file...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried renaming the file fbgemm_gpu/experimental/gen_ai/src/quantize/common/include/fbgemm_gpu/quantize/tuning_cache.hpp to tuning_cache.cuh, updated all #include statements, and my local build succeeded. In your log above I see that for the CK source file you're including tuning_cache.cuh
, but the file should be named tuning_cache_hip.cuh
after hipify runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed my change 148c5d7.
Summary: X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: Pull Request resolved: pytorch#4921 X-link: facebookresearch/FBGEMM#1898 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Reviewed By: atalman Differential Revision: D82186865 Pulled By: q10
Summary: X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Summary: Pull Request resolved: pytorch#4947 X-link: facebookresearch/FBGEMM#1969 Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs. Pull Request resolved: pytorch#4854 Differential Revision: D83519493 Pulled By: q10
Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces. Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.