[ROCm] Enable Tiled layout extension and minor changes to setup #3

Merged

lcskrishna
Collaborator

No description provided.

Owner

@petrex petrex left a comment

@lcskrishna I am seeing an environment issue with the latest PyTorch Docker image. Did I miss anything?

To be more specific, the following bf16 types are not converted.
("__nv_bfloat16", ("__hip_bfloat16", CONV_TYPE, API_RUNTIME)),
("__nv_bfloat162", ("__hip_bfloat162", CONV_TYPE, API_RUNTIME)),
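
To illustrate why the missing entries matter, here is a minimal sketch of table-driven identifier conversion. It is a simplified stand-in, not the actual conversion tool: `convert_identifiers` and `BF16_MAPPINGS` are hypothetical names, and `CONV_TYPE` / `API_RUNTIME` are placeholder strings because their real values are not shown in this thread.

```python
import re

# Placeholder tags standing in for the constants used in the entries quoted
# above; their real values are not shown in this thread (assumption).
CONV_TYPE = "CONV_TYPE"
API_RUNTIME = "API_RUNTIME"

# The two quoted entries, in the same (cuda_name, (hip_name, kind, api)) shape.
BF16_MAPPINGS = dict([
    ("__nv_bfloat16", ("__hip_bfloat16", CONV_TYPE, API_RUNTIME)),
    ("__nv_bfloat162", ("__hip_bfloat162", CONV_TYPE, API_RUNTIME)),
])


def convert_identifiers(source: str, mappings: dict) -> str:
    """Replace each whole-word CUDA identifier with its HIP counterpart."""
    if not mappings:
        # Nothing to convert: the source passes through untouched.
        return source
    # Longest names first so the alternation never settles on a shorter prefix.
    names = sorted(mappings, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, names)) + r")\b")
    return pattern.sub(lambda m: mappings[m.group(0)][0], source)


line = "__nv_bfloat162 scale2 = __bfloat162bfloat162(pSZ[0]);"
print(convert_identifiers(line, BF16_MAPPINGS))
```

With an empty table the line comes back unchanged, which matches the symptom described here: without these entries the `__nv_bfloat16*` types stay in the source and the ROCm build fails to compile.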

__nv_bfloat162 scale2 = __bfloat162bfloat162(pSZ[0]);
__nv_bfloat162 zero2 = __bfloat162bfloat162(pSZ[1]);
if (scales_and_zeros) {
const auto&sz = *scales_and_zeros;
Owner

@lcskrishna This change fixes compilation for me. Please refer to my scratch space: https://github.com/petrex/ao/tree/rocm_tensor_tile

Collaborator Author

Thanks @petrex for taking a look. I will update it accordingly.

Owner

@petrex petrex left a comment

I am also thinking about a host-side micro-arch check, due to subtle differences in the ISAs. Maybe later.

@lcskrishna
Collaborator Author

> @lcskrishna I am seeing an environment issue with the latest PyTorch Docker image. Did I miss anything?
>
> To be more specific, the following bf16 types are not converted. ("__nv_bfloat16", ("__hip_bfloat16", CONV_TYPE, API_RUNTIME)), ("__nv_bfloat162", ("__hip_bfloat162", CONV_TYPE, API_RUNTIME)),

Ah, I forgot to mention: use PyTorch nightly. This seems to work after PT 2.5.
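
A quick way to guard against an older environment is a version check before building. The sketch below is an assumption-laden example: `torch_version_at_least` is a hypothetical helper, the version strings are made up for illustration, and it treats 2.5 as the minimum even though it is not stated here whether 2.5.0 itself is sufficient.

```python
import re


def torch_version_at_least(version: str, minimum=(2, 5)) -> bool:
    """Return True if a PyTorch-style version string is at or above `minimum`."""
    # Only major.minor is compared; dev/nightly and local suffixes are ignored.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


# Hypothetical version strings, for illustration only.
print(torch_version_at_least("2.6.0.dev20241024+rocm6.2"))  # True
print(torch_version_at_least("2.4.1+rocm6.1"))  # False
```

In a real environment the argument would be `torch.__version__`; raise `minimum` if 2.5.0 itself turns out not to include the needed mappings.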

@lcskrishna
Collaborator Author

> I am also thinking about a host-side micro-arch check, due to subtle differences in the ISAs. Maybe later.

Yes, let's separate that out and do it later.
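
For when that follow-up happens, a host-side check could look roughly like the sketch below. Everything in it is an assumption: the allow-list contents, the helper names, and the `name:feature:feature` shape of the arch string. On ROCm builds of PyTorch the string can probably be read from `torch.cuda.get_device_properties(i).gcnArchName`, but that should be verified against the installed version.

```python
# Hypothetical allow-list; which architectures actually matter is not
# decided in this thread.
SUPPORTED_ARCHS = frozenset({"gfx90a", "gfx942"})


def base_arch(arch_name: str) -> str:
    """Strip feature suffixes such as ':sramecc+:xnack-' from an arch string."""
    return arch_name.split(":", 1)[0].strip()


def is_supported_arch(arch_name: str, supported=SUPPORTED_ARCHS) -> bool:
    """Check the base architecture name against the allow-list."""
    return base_arch(arch_name) in supported


# Example arch strings, for illustration only.
print(is_supported_arch("gfx942:sramecc+:xnack-"))  # True
print(is_supported_arch("gfx1100"))  # False
```

Keeping the allow-list as plain data makes it easy to extend once the ISA differences are sorted out.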

@petrex
Owner

petrex commented Oct 24, 2024

Could you share the Docker image you are using? And maybe do a little cleanup/lint on the code before merge. Thanks.

@petrex petrex merged commit c86880e into petrex:rocm_enablement_staging Oct 29, 2024
5 of 17 checks passed
2 participants