[SYCL][E2E] Use ROCM_PATH from the system environment #18468

Open
wants to merge 3 commits into
base: sycl

Contributor

@npmiller npmiller commented May 14, 2025

For ROCm installations that aren't just in /opt/rocm, clang uses ROCM_PATH at runtime to find the ROCm device libraries, and this is done even in regular SYCL compilation. So this is needed for lit tests in certain configurations.

This was accidentally removed in #17692. It doesn't cause any issues in the CI, because the CI has ROCm installed in the standard /opt/rocm, but it does cause issues on local setups.

Also remove --rocm-path on Windows; this should be covered by ROCM_PATH.
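
As a rough illustration of the forwarding this patch restores, here is a minimal sketch. It assumes lit runs each test with its own restricted environment rather than the full shell environment, so a variable only reaches `clang` if the config forwards it. `forward_system_env` is a hypothetical stand-in written for this example, not lit's API, and the paths are made up.

```python
import os


def forward_system_env(test_env, names, system_env=None):
    """Copy each named variable from the system environment into test_env.

    Hypothetical helper for illustration only. Variables that are unset or
    empty in the system environment are skipped, so test_env is unchanged
    for them.
    """
    if system_env is None:
        system_env = os.environ
    if isinstance(names, str):
        names = [names]
    for name in names:
        value = system_env.get(name)
        if value:
            test_env[name] = value
    return test_env


# Stand-in for the restricted environment handed to each test.
test_env = {"PATH": "/usr/bin"}

# ROCm installed outside /opt/rocm (example path, not taken from the PR).
forward_system_env(test_env, "ROCM_PATH", {"ROCM_PATH": "/home/user/rocm"})
print(test_env)
```

On a machine where ROCM_PATH is not set, the call leaves the test environment untouched, which matches the CI case where ROCm lives in the default /opt/rocm.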

@npmiller npmiller requested a review from a team as a code owner May 14, 2025 14:56
@npmiller npmiller requested a review from steffenlarsen May 14, 2025 14:56
@npmiller npmiller temporarily deployed to WindowsCILock May 14, 2025 14:56 — with GitHub Actions Inactive
@npmiller npmiller requested a review from sarnex May 14, 2025 14:56
@@ -541,6 +541,10 @@ def open_check_file(file_name):
else:
    config.substitutions.append(("%hip_options", ""))

# Add ROCM_PATH from system environment, this is used by clang to find ROCm
# libraries in non-standard installation locations.
llvm_config.with_system_environment("ROCM_PATH")

Contributor

@sarnex sarnex May 14, 2025

Should we instead update the CMake code? When tests use %hip_options on Windows, --rocm-path is also passed based on that CMake variable. If needed, we could make this happen on Linux too.

Contributor

Actually, I see now that we already check that env var to set the CMake variable, so maybe the problem is just that we don't pass --rocm-path on Linux?

Contributor Author

This is needed for every single test, not just when %hip_options is passed.

There are two different things going on. The %hip_options are used specifically for interop tests that use HIP code alongside SYCL code: they link the HIP runtime libraries and include the HIP runtime headers.

But for all SYCL tests, when compiling for AMD, clang uses some bitcode libraries provided by ROCm, so it also needs to find the ROCm installation.

This patch fixes that second use case. We could potentially use --rocm-path instead, but it would have to be added to the regular %{build} substitution, not just to %hip_options. That may be slightly cleaner, but using ROCM_PATH is a bit easier and more flexible.
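
To make the trade-off concrete, here is a small sketch of the two options. It is illustrative only: `with_flag`, `with_env`, the base command string, and the path are hypothetical placeholders, not the real %{build} substitution or anything from the test suite's config.

```python
def with_flag(base_cmd, rocm_path):
    """Option A: append --rocm-path to every compile command.

    Returns the command line and the extra environment it needs (none).
    """
    return f"{base_cmd} --rocm-path={rocm_path}", {}


def with_env(base_cmd, rocm_path):
    """Option B: leave the command unchanged and export ROCM_PATH instead."""
    return base_cmd, {"ROCM_PATH": rocm_path}


# Placeholder for whatever the regular build substitution expands to.
base_cmd = "clang++ -fsycl test.cpp"
# Example non-standard install location (made up for this sketch).
rocm_path = "/home/user/rocm"

for option in (with_flag, with_env):
    cmd, extra_env = option(base_cmd, rocm_path)
    print(option.__name__, "->", cmd, extra_env)
```

With option A the location is visible in every command line, so the substitution itself has to change. With option B the commands stay identical and the location comes from the environment.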

Contributor

Got it, thanks. I didn't know the ROCm path was needed when compiling for AMD without using HIP.

I would prefer we minimize environment variable usage in tests, so I'd rather remove --rocm-path from %hip_options on Windows and always add it to the %{build} substitution on all OSes when compiling for AMD.

@aelovikov-intel Thoughts?

Contributor

IIUC, we can use this and remove all uses of --rocm-path. The user's environment is supposed to have ROCM_PATH (we copy it from there, after all), so they won't need to add extra env variables when reproducing outside LIT.

Contributor

Yeah, it seems like ROCM_PATH must always exist and point to the ROCm install. I thought it was optional. In that case I'm fine with using the env var.

Contributor Author

I did have a look at the CMake, and on top of that the HIP CMake is annoying, so it's not straightforward to get the ROCm path from it. Using the environment variable is actually a lot easier.

I don't have a setup with Windows + ROCm. I can remove the --rocm-path there, but it's untested, and I'm not sure if ROCM_PATH works the same on Windows. That's the only place that uses --rocm-path as far as I can tell.

Contributor

Let's try removing it on Windows; if it fails, just add it back. I vaguely remember the ROCm detection on Windows in upstream clang being basically non-existent, so we might need it there.

This may be covered by ROCM_PATH

Contributor

@sarnex sarnex left a comment

Patch LGTM, and if CI fails on Windows, LGTM to add --rocm-path back without another review.
