Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SWDEV-492880] Add ROCm path for hipRTC #3391

Merged
merged 9 commits into from
Jan 21, 2025

Conversation

huanrwan-amd
Copy link
Contributor

This fix is for #3314 .

As in the issue: "hipRTC/Comgr automatically added -I /opt/rocm/include to kernel compilations. However, this behavior is being removed, as hipRTC has no guarantees about a system ROCm installation during kernel compilation/execution."

src/comgr.cpp Outdated
@@ -990,7 +990,7 @@ void BuildHip(const std::string& name,
return StartsWith(s, "--std=") || StartsWith(s, "-std=");
}))
opts.push_back("-std=c++17");

opts.push_back("-I/opt/rocm/include");
Copy link
Contributor Author

@huanrwan-amd huanrwan-amd Nov 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open for suggestions: For dealing with no symbolic link to the rocm directory cases

Copy link
Contributor

@cderb cderb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll approve because this will be correct for the case with the symbolic link present and doesn't present complications otherwise.
In the case where rocm is installed somewhere else, this change would not be sufficient. That path might be part of the CMAKE_PREFIX_PATH.

@huanrwan-amd
Copy link
Contributor Author

huanrwan-amd commented Nov 18, 2024

I'll approve because this will be correct for the case with the symbolic link present and doesn't present complications otherwise. In the case where rocm is installed somewhere else, this change would not be sufficient. That path might be part of the CMAKE_PREFIX_PATH.

Thanks for the input. For my understanding, the CMAKE_PREFIX_PATH is for building the MIOpendriver binary and *.so. This function BuildHip() is to using hipRTC for an online compilation.
One possibility here is to check ROCM_PATH and ROCM_HOME env:

        const char* rocm_path = std::getenv("ROCM_PATH");
        if(rocm_path == nullptr)
        {
            rocm_path = "/opt/rocm";
        }
        const char* rocm_home = std::getenv("ROCM_HOME");
        if(rocm_home == nullptr)
        {
            rocm_home = rocm_path;
        }
        opts.push_back(std::string("-I") + rocm_home + "/include");

I see people using both ROCM_PATH and ROCM_HOME at: cupy/cupy#4493, But ROCM_PATH is recommended. Any suggestions?

@lamb-j
Copy link

lamb-j commented Nov 19, 2024

I've seen ROCM_PATH used as well, not familiar with ROCM_HOME

@huanrwan-amd
Copy link
Contributor Author

Hi @lamb-j and @cderb updated as we discussed. Using ROCM_PATH for a user specified ROCm path and log the build options for debugging.

Copy link
Contributor

@cderb cderb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@BrianHarrisonAMD
Copy link
Contributor

Greetings @huanrwan-amd

Heads up, there was a formatting error on these changes.
I fixed it to allow CI to run, but here is the script I run in case you need it in the future:

#!/bin/bash
find . -iname '*.h' \
    -o -iname '*.hpp' \
    -o -iname '*.cpp' \
    -o -iname '*.h.in' \
    -o -iname '*.hpp.in' \
    -o -iname '*.cpp.in' \
    -o -iname '*.cl' \
| grep -v -E '(build/)|(install/)' \
| xargs -n 1 -P $(nproc) -I{} -t clang-format-12 -style=file {} -i 2>/dev/null

src/comgr.cpp Outdated
@@ -991,6 +991,19 @@ void BuildHip(const std::string& name,
}))
opts.push_back("-std=c++17");

const char* rocm_path = std::getenv("ROCM_PATH");
Copy link
Contributor

@BrianHarrisonAMD BrianHarrisonAMD Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an error on this line for the checker, I think we need to use the MIOpen Macros instead.

Ex:

// at top of file
MIOPEN_DECLARE_ENV_VAR_STR(MIOPEN_CUSTOM_CACHE_DIR)

// etc.

// usage example
const auto& custom = env::value(MIOPEN_CUSTOM_CACHE_DIR);
if(!custom.empty())
{
    p = ExpandUser(custom);
}

It will also be a string type, and you can avoid some of the below manual conversions to std::string.

Copy link
Contributor

@CAHEK7 CAHEK7 Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the most important part, it caches the results, that all the consequence compilations with get that path much faster.

Also it supports default values, so path.empty() check is not required - it either returns user-provided path or default /opt/rocm and that result is cached.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @CAHEK7 , thank you very much for providing comments and instructions. There would be some cases where ROCM_PATH is not there at all. I checked with a clean installation of ROCm6.2.4. No ROCM_PATH...
So I think give a default value here is useful here to use hipRTC out of box, although we can capture this built options strings through logs printed below and add ROCM_PATH manually.

@CAHEK7
Copy link
Contributor

CAHEK7 commented Nov 21, 2024

Greetings @huanrwan-amd

Heads up, there was a formatting error on these changes. I fixed it to allow CI to run, but here is the script I run in case you need it in the future:

#!/bin/bash
find . -iname '*.h' \
    -o -iname '*.hpp' \
    -o -iname '*.cpp' \
    -o -iname '*.h.in' \
    -o -iname '*.hpp.in' \
    -o -iname '*.cpp.in' \
    -o -iname '*.cl' \
| grep -v -E '(build/)|(install/)' \
| xargs -n 1 -P $(nproc) -I{} -t clang-format-12 -style=file {} -i 2>/dev/null

It's also here https://github.com/ROCm/MIOpen/wiki/How-to-format-code

@BrianHarrisonAMD
Copy link
Contributor

@huanrwan-amd Looks like some new formatting errors on this PR.
Likely need to run the clang script.

@huanrwan-amd
Copy link
Contributor Author

@huanrwan-amd Looks like some new formatting errors on this PR. Likely need to run the clang script.

Thank you for the reminder, ran scripts and pushed the changes.

@BrianHarrisonAMD
Copy link
Contributor

Once we get a clean CI run we can merge this.

@BrianHarrisonAMD
Copy link
Contributor

Getting some testing failures on this PR:
testing_failures.txt

Going to restart CI just in case it was a random blip, but they seemed odd.

@huanrwan-amd
Copy link
Contributor Author

Getting some testing failures on this PR: testing_failures.txt

Going to restart CI just in case it was a random blip, but they seemed odd.

Thanks @BrianHarrisonAMD, it is noticed that it also failed on 1 test below:
2025.01.14 Jenkins failed logs.txt
Let review after the restarted CI.

@huanrwan-amd huanrwan-amd force-pushed the huanrwan/hipRTC_COMGR_path branch from a084735 to 6579a76 Compare January 20, 2025 17:53
@huanrwan-amd huanrwan-amd force-pushed the huanrwan/hipRTC_COMGR_path branch from 5361688 to 1eb257a Compare January 20, 2025 20:19
@BrianHarrisonAMD BrianHarrisonAMD merged commit ec8ec1d into develop Jan 21, 2025
29 of 146 checks passed
@BrianHarrisonAMD BrianHarrisonAMD deleted the huanrwan/hipRTC_COMGR_path branch January 21, 2025 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants