find rocm components individually #2567
9011d62 to abaaa4d
retest Ubuntu-GPU-multi please |
retest Ubuntu-CPU please |
retest Ubuntu-GPU-single please |
LGTM
retest Ubuntu-GPU-single please |
retest Ubuntu-GPU-single please |
retest Ubuntu-GPU-single please |
I'm not able to reproduce the CI build failure for Ubuntu-GPU-single on my local machine. Any ideas what the issue might be?
retest Ubuntu-GPU-single please |
I think it is landing on a node where the ROCm driver is not stable, which causes the tests to fail. If you can add the logs from your local run here, we can disregard the CI failure and merge this change.
retest Ubuntu-GPU-single please |
retest Ubuntu-GPU-single please |
It's because one of the nodes has a driver issue. We already marked it offline, and I'm trying to retest it on another node.
Here's a log of my test run on MI100. There are only 2 test failures, one of which is a test that failed to allocate enough memory on the device. The errors I was getting on the CI are not present in the local run.
retest Ubuntu-GPU-single please |
Yes, that shouldn't be a problem, since I can see gpu-pycpp is green.
Now that the PR is merged, this change needs to be backported to the r2.14-rocm-enhanced-spack branch. Also, another branch called r2.16-rocm-enhanced-spack should be created with this change and f4f4e86.
Sounds good, I'll try that out. |
Required for Spack, since all the ROCm components are located in different paths.
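
For readers unfamiliar with the idea, here is a minimal sketch of what "finding components individually" could look like. It is not the code from this PR. The component list, the `<COMPONENT>_PATH` variable naming, and the `/opt/rocm` fallback are assumptions made for illustration only. Each component gets its own lookup, and the shared root is used only when no component-specific path is set.

```python
from typing import Dict, Mapping

# Hypothetical component list; the set actually handled by the PR is not shown in this thread.
COMPONENTS = ("hip", "rocblas", "miopen")


def component_env_var(name: str) -> str:
    """Return the assumed per-component variable name, e.g. 'hip' -> 'HIP_PATH'."""
    return f"{name.upper()}_PATH"


def find_component(name: str, env: Mapping[str, str],
                   default_root: str = "/opt/rocm") -> str:
    """Resolve one component's install prefix.

    Order: component-specific variable, then the shared ROCM_PATH,
    then the assumed default root.
    """
    specific = env.get(component_env_var(name))
    if specific:
        return specific
    return env.get("ROCM_PATH", default_root)


def find_all(env: Mapping[str, str]) -> Dict[str, str]:
    """Resolve every component independently, so each may live in its own prefix."""
    return {name: find_component(name, env) for name in COMPONENTS}
```

In a Spack-style layout each package has its own prefix, so something like `find_all(os.environ)` would return a different path per component, while a monolithic install would return the same root for all of them.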