jeffdaily
Contributor

Before the PyTorch hipify v2 PR lands, additional fixes are needed for the experimental gen_ai HIP sources. The fbgemm_gpu *.hip sources do not undergo additional hipify steps, and they were written to assume PyTorch's hipify v1 interfaces. Some small changes are necessary to make the sources work with either the hipify v1 or v2 torch APIs.

netlify bot commented Sep 10, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

Name Link
🔨 Latest commit 148c5d7
🔍 Latest deploy log https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68dae1964ab705000893d7f1
😎 Deploy Preview https://deploy-preview-4854--pytorch-fbgemm-docs.netlify.app

@meta-cla meta-cla bot added the cla signed label Sep 10, 2025
@facebook-github-bot
Contributor

@q10 has imported this pull request. If you are a Meta employee, you can view this in D82186865.

@facebook-github-bot
Contributor

@atalman has imported this pull request. If you are a Meta employee, you can view this in D82186865.

@q10
Contributor

q10 commented Sep 11, 2025

Hi @jeffdaily could you resolve the branch conflicts? Otherwise I think the PR looks good for landing

@jeffdaily
Contributor Author

@q10 done.

@facebook-github-bot
Contributor

@q10 has imported this pull request. If you are a Meta employee, you can view this in D82186865.

#include "kernels/fp8_rowwise_grouped_kernel_manifest.h"
#include "kernels/fp8_rowwise_grouped_heuristic.hpp"
#include "fbgemm_gpu/quantize/tuning_cache.hpp"
#include "fbgemm_gpu/quantize/tuning_cache_hip.hpp"
Contributor

@q10 q10 Sep 12, 2025

@jeffdaily This is breaking internal builds:

fatal error: 'fbgemm_gpu/quantize/tuning_cache_hip.hpp' file not found
   26 | #include "fbgemm_gpu/quantize/tuning_cache_hip.hpp"

I think the internal hipification doesn't require updating the filepath.

Contributor Author

When I build locally, this file does exist; it is created by the hipify step during cmake. Does your internal build run the same hipify step that cmake does? The point here is that tuning_cache.hpp is hipified to a new file with a _hip suffix, and the contents of that hipified file are correct without the clunky manual #ifdefs that were in the original header.

Contributor

@jeffdaily I think I understand the internal situation now.

The internal build system only hipifies files with a .cu or .cuh extension, and since fbgemm_gpu/quantize/tuning_cache.hpp is just a regular C++ file with no CUDA syntax, it is not hipified. This is probably why we had to rely on the ifdefs in the first place.

So there appear to be two possible solutions:

  1. Keep it as a .hpp file, along with the ifdefs.
  2. Rename the file to use the .cuh extension, and update the sources that #include this file accordingly.

I'm currently verifying the second solution with the internal CI.
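
For readers following the thread, the mismatch described above can be modeled with a small C++ sketch. It is a simplified illustration, not the actual hipify implementation: `hipified_path` and the extension sets passed to it are hypothetical, and the idea that the cmake-driven pass also covers `.hpp` files is an assumption inferred from the earlier comment that `tuning_cache.hpp` becomes `tuning_cache_hip.hpp` in a local build.

```cpp
#include <filesystem>
#include <set>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not part of hipify or FBGEMM): returns the path a
// build would see for `src` after a hipify pass that only rewrites files
// whose extension is listed in `hipified_exts`.
std::string hipified_path(const fs::path& src,
                          const std::set<std::string>& hipified_exts) {
  const std::string ext = src.extension().string();
  if (hipified_exts.count(ext) == 0) {
    // Extension not covered by this pass: the file keeps its original name.
    return src.generic_string();
  }
  // Covered: model the "_hip" suffix described in this thread.
  const fs::path out =
      src.parent_path() / (src.stem().string() + "_hip" + ext);
  return out.generic_string();
}
```

With an extension set of only `.cu` and `.cuh`, `hipified_path("fbgemm_gpu/quantize/tuning_cache.hpp", ...)` returns the path unchanged, so an include of `tuning_cache_hip.hpp` has nothing to resolve to, which is consistent with the reported "file not found" error. Renaming the header to `.cuh` moves it into the covered set under this model.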

Contributor Author

Thanks. How did that internal test go?

Contributor

@jeffdaily Option 2 appears to work with the internal CI

Contributor

@jeffdaily hmm, so even though option 2 works with the internal CI, it breaks in OSS, because the renamed file, tuning_cache.cuh, isn't HIPified - see build logs in https://github.com/pytorch/FBGEMM/actions/runs/17969709031/job/51109221018?pr=4921

This means we would have to go back to using #ifdef USE_ROCM, at least within that file...

Contributor Author

I tried renaming the file fbgemm_gpu/experimental/gen_ai/src/quantize/common/include/fbgemm_gpu/quantize/tuning_cache.hpp to tuning_cache.cuh, updated all #include statements, and my local build succeeded. In your log above I see that for the CK source file you're including tuning_cache.cuh, but the file should be named tuning_cache_hip.cuh after hipify runs.
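
Because the hand-written `*.hip` sources are not themselves hipified, their includes have to name the post-hipify file explicitly. A minimal sketch of a check for that follows. `stale_includes` is a hypothetical helper, not part of FBGEMM or hipify, and it only inspects quoted `#include` lines.

```cpp
#include <regex>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical lint helper: returns every quoted #include path in `source`
// that names a header listed in `hipified_headers` by its pre-hipify name.
std::vector<std::string> stale_includes(
    const std::string& source,
    const std::set<std::string>& hipified_headers) {
  // Matches lines of the form: #include "some/path.h"
  static const std::regex include_re(R"re(^\s*#\s*include\s*"([^"]+)")re");
  std::vector<std::string> stale;
  std::istringstream lines(source);
  std::string line;
  while (std::getline(lines, line)) {
    std::smatch match;
    if (std::regex_search(line, match, include_re) &&
        hipified_headers.count(match[1].str()) > 0) {
      stale.push_back(match[1].str());
    }
  }
  return stale;
}
```

Run over a source that still includes `fbgemm_gpu/quantize/tuning_cache.cuh`, with that path in `hipified_headers`, the helper reports the include; a source that already includes `tuning_cache_hip.cuh` yields an empty result.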

Contributor Author

I pushed my change 148c5d7.

q10 pushed a commit to q10/FBGEMM that referenced this pull request Sep 23, 2025
Summary:
X-link: facebookresearch/FBGEMM#1898

Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources.  The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces.  Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.

Pull Request resolved: pytorch#4854

Reviewed By: atalman

Differential Revision: D82186865

Pulled By: q10
q10 pushed a commit to q10/FBGEMM that referenced this pull request Sep 23, 2025
Summary:
Pull Request resolved: pytorch#4921

X-link: facebookresearch/FBGEMM#1898

Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources.  The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces.  Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.

Pull Request resolved: pytorch#4854

Reviewed By: atalman

Differential Revision: D82186865

Pulled By: q10
q10 pushed a commit to q10/FBGEMM that referenced this pull request Sep 24, 2025
Summary:
Pull Request resolved: pytorch#4921

X-link: facebookresearch/FBGEMM#1898

Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources.  The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces.  Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.

Pull Request resolved: pytorch#4854

Reviewed By: atalman

Differential Revision: D82186865

Pulled By: q10
q10 pushed a commit to q10/FBGEMM that referenced this pull request Sep 29, 2025
Summary:
Pull Request resolved: pytorch#4921

X-link: facebookresearch/FBGEMM#1898

Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources.  The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces.  Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.

Pull Request resolved: pytorch#4854

Reviewed By: atalman

Differential Revision: D82186865

Pulled By: q10
q10 pushed a commit to q10/FBGEMM that referenced this pull request Sep 30, 2025
Summary:
X-link: facebookresearch/FBGEMM#1969

Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources.  The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces.  Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.

Pull Request resolved: pytorch#4854

Differential Revision: D83519493

Pulled By: q10
q10 pushed a commit to q10/FBGEMM that referenced this pull request Sep 30, 2025
Summary:
Pull Request resolved: pytorch#4947

X-link: facebookresearch/FBGEMM#1969

Prior to the pytorch hipify v2 PR is landed, additional fixes are needed for the experimental gen_ai HIP sources.  The fbgemm_gpu *.hip sources do not undergo additional hipify steps and they were written to assume pytorch's hipify v1 interfaces.  Some small changes are necessary to make the sources more flexible to either hipify v1 or v2 torch APIs.

Pull Request resolved: pytorch#4854

Differential Revision: D83519493

Pulled By: q10
@facebook-github-bot
Contributor

@q10 merged this pull request in 072e323.
