Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CUDA][HIP] Support CUID in new driver #122859

Merged
merged 1 commit into from
Jan 15, 2025
Merged

Conversation

yxsamliu
Copy link
Collaborator

CUID is needed by CUDA/HIP for supporting accessing static device variables in host function.

Currently CUID is only supported by the old driver for CUDA/HIP. The new driver does not support it, which causes CUDA/HIP programs using static device variables in host functions to fail with the new driver for CUDA/HIP.

This patch refactors the CUID support in the old driver so that CUID is supported by both the old and the new drivers for CUDA/HIP.

@yxsamliu yxsamliu requested review from Artem-B and jhuber6 January 14, 2025 05:13
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jan 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-driver

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

CUID is needed by CUDA/HIP for supporting accessing static device variables in host function.

Currently CUID is only supported by the old driver for CUDA/HIP. The new driver does not support it, which causes CUDA/HIP programs using static device variables in host functions to fail with the new driver for CUDA/HIP.

This patch refactors the CUID support in the old driver so that CUID is supported by both the old and the new drivers for CUDA/HIP.


Full diff: https://github.com/llvm/llvm-project/pull/122859.diff

3 Files Affected:

  • (modified) clang/include/clang/Driver/Driver.h (+29)
  • (modified) clang/lib/Driver/Driver.cpp (+64-47)
  • (modified) clang/test/Driver/hip-cuid.hip (+11)
diff --git a/clang/include/clang/Driver/Driver.h b/clang/include/clang/Driver/Driver.h
index 80bce574a3b647..8c3b00e8a2da79 100644
--- a/clang/include/clang/Driver/Driver.h
+++ b/clang/include/clang/Driver/Driver.h
@@ -72,6 +72,29 @@ enum ModuleHeaderMode {
   HeaderMode_System
 };
 
+/// Options for specifying CUID used by CUDA/HIP for uniquely identifying
+/// compilation units.
+class CUIDOptions {
+public:
+  enum class Kind { Hash, Random, Fixed, None, Invalid };
+
+  CUIDOptions() = default;
+  CUIDOptions(const CUIDOptions &) = default;
+  CUIDOptions(llvm::opt::DerivedArgList &Args, const Driver &D);
+
+  // Get the CUID for an input string
+  std::string getCUID(StringRef InputFile,
+                      llvm::opt::DerivedArgList &Args) const;
+
+  bool isEnabled() const {
+    return UseCUID != Kind::None && UseCUID != Kind::Invalid;
+  }
+
+private:
+  Kind UseCUID = Kind::None;
+  StringRef FixedCUID;
+};
+
 /// Driver - Encapsulate logic for constructing compilation processes
 /// from a set of gcc-driver-like command line arguments.
 class Driver {
@@ -119,6 +142,9 @@ class Driver {
   /// LTO mode selected via -f(no-offload-)?lto(=.*)? options.
   LTOKind OffloadLTOMode;
 
+  /// Options for CUID
+  CUIDOptions CUIDOpts;
+
 public:
   enum OpenMPRuntimeKind {
     /// An unknown OpenMP runtime. We can't generate effective OpenMP code
@@ -728,6 +754,9 @@ class Driver {
   /// Get the specific kind of offload LTO being performed.
   LTOKind getOffloadLTOMode() const { return OffloadLTOMode; }
 
+  /// Get the CUID option.
+  const CUIDOptions &getCUIDOpts() const { return CUIDOpts; }
+
 private:
 
   /// Tries to load options from configuration files.
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 57fa7c1110a68e..fb99cd513e01c0 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -197,6 +197,51 @@ std::string Driver::GetResourcesPath(StringRef BinaryPath) {
   return std::string(P);
 }
 
+CUIDOptions::CUIDOptions(llvm::opt::DerivedArgList &Args, const Driver &D)
+    : UseCUID(Kind::Hash) {
+  if (Arg *A = Args.getLastArg(options::OPT_fuse_cuid_EQ)) {
+    StringRef UseCUIDStr = A->getValue();
+    UseCUID = llvm::StringSwitch<Kind>(UseCUIDStr)
+                  .Case("hash", Kind::Hash)
+                  .Case("random", Kind::Random)
+                  .Case("none", Kind::None)
+                  .Default(Kind::Invalid);
+    if (UseCUID == Kind::Invalid) {
+      D.Diag(clang::diag::err_drv_invalid_value)
+          << A->getAsString(Args) << UseCUIDStr;
+    }
+  }
+
+  FixedCUID = Args.getLastArgValue(options::OPT_cuid_EQ);
+  if (!FixedCUID.empty())
+    UseCUID = Kind::Fixed;
+}
+
+std::string CUIDOptions::getCUID(StringRef InputFile,
+                                 llvm::opt::DerivedArgList &Args) const {
+  std::string CUID = FixedCUID.str();
+  if (CUID.empty()) {
+    if (UseCUID == Kind::Random)
+      CUID = llvm::utohexstr(llvm::sys::Process::GetRandomNumber(),
+                             /*LowerCase=*/true);
+    else if (UseCUID == Kind::Hash) {
+      llvm::MD5 Hasher;
+      llvm::MD5::MD5Result Hash;
+      SmallString<256> RealPath;
+      llvm::sys::fs::real_path(InputFile, RealPath,
+                               /*expand_tilde=*/true);
+      Hasher.update(RealPath);
+      for (auto *A : Args) {
+        if (A->getOption().matches(options::OPT_INPUT))
+          continue;
+        Hasher.update(A->getAsString(Args));
+      }
+      Hasher.final(Hash);
+      CUID = llvm::utohexstr(Hash.low(), /*LowerCase=*/true);
+    }
+  }
+  return CUID;
+}
 Driver::Driver(StringRef ClangExecutable, StringRef TargetTriple,
                DiagnosticsEngine &Diags, std::string Title,
                IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS)
@@ -875,6 +920,9 @@ void Driver::CreateOffloadingDeviceToolChains(Compilation &C,
     C.addOffloadDeviceToolChain(HIPTC, OFK);
   }
 
+  if (IsCuda || IsHIP)
+    CUIDOpts = CUIDOptions(C.getArgs(), *this);
+
   //
   // OpenMP
   //
@@ -3156,19 +3204,15 @@ class OffloadingActionBuilder final {
     /// Default GPU architecture if there's no one specified.
     OffloadArch DefaultOffloadArch = OffloadArch::UNKNOWN;
 
-    /// Method to generate compilation unit ID specified by option
-    /// '-fuse-cuid='.
-    enum UseCUIDKind { CUID_Hash, CUID_Random, CUID_None, CUID_Invalid };
-    UseCUIDKind UseCUID = CUID_Hash;
-
-    /// Compilation unit ID specified by option '-cuid='.
-    StringRef FixedCUID;
+    /// Compilation unit ID specified by option '-fuse-cuid=' or'-cuid='.
+    const CUIDOptions &CUIDOpts;
 
   public:
     CudaActionBuilderBase(Compilation &C, DerivedArgList &Args,
                           const Driver::InputList &Inputs,
                           Action::OffloadKind OFKind)
-        : DeviceActionBuilder(C, Args, Inputs, OFKind) {
+        : DeviceActionBuilder(C, Args, Inputs, OFKind),
+          CUIDOpts(C.getDriver().getCUIDOpts()) {
 
       CompileDeviceOnly = C.getDriver().offloadDeviceOnly();
       Relocatable = Args.hasFlag(options::OPT_fgpu_rdc,
@@ -3199,28 +3243,8 @@ class OffloadingActionBuilder final {
         // Set the flag to true, so that the builder acts on the current input.
         IsActive = true;
 
-        std::string CUID = FixedCUID.str();
-        if (CUID.empty()) {
-          if (UseCUID == CUID_Random)
-            CUID = llvm::utohexstr(llvm::sys::Process::GetRandomNumber(),
-                                   /*LowerCase=*/true);
-          else if (UseCUID == CUID_Hash) {
-            llvm::MD5 Hasher;
-            llvm::MD5::MD5Result Hash;
-            SmallString<256> RealPath;
-            llvm::sys::fs::real_path(IA->getInputArg().getValue(), RealPath,
-                                     /*expand_tilde=*/true);
-            Hasher.update(RealPath);
-            for (auto *A : Args) {
-              if (A->getOption().matches(options::OPT_INPUT))
-                continue;
-              Hasher.update(A->getAsString(Args));
-            }
-            Hasher.final(Hash);
-            CUID = llvm::utohexstr(Hash.low(), /*LowerCase=*/true);
-          }
-        }
-        IA->setId(CUID);
+        if (CUIDOpts.isEnabled())
+          IA->setId(CUIDOpts.getCUID(IA->getInputArg().getValue(), Args));
 
         if (CompileHostOnly)
           return ABRT_Success;
@@ -3346,21 +3370,6 @@ class OffloadingActionBuilder final {
       CompileHostOnly = C.getDriver().offloadHostOnly();
       EmitLLVM = Args.getLastArg(options::OPT_emit_llvm);
       EmitAsm = Args.getLastArg(options::OPT_S);
-      FixedCUID = Args.getLastArgValue(options::OPT_cuid_EQ);
-      if (Arg *A = Args.getLastArg(options::OPT_fuse_cuid_EQ)) {
-        StringRef UseCUIDStr = A->getValue();
-        UseCUID = llvm::StringSwitch<UseCUIDKind>(UseCUIDStr)
-                      .Case("hash", CUID_Hash)
-                      .Case("random", CUID_Random)
-                      .Case("none", CUID_None)
-                      .Default(CUID_Invalid);
-        if (UseCUID == CUID_Invalid) {
-          C.getDriver().Diag(diag::err_drv_invalid_value)
-              << A->getAsString(Args) << UseCUIDStr;
-          C.setContainsError();
-          return true;
-        }
-      }
 
       // --offload and --offload-arch options are mutually exclusive.
       if (Args.hasArgNoClaim(options::OPT_offload_EQ) &&
@@ -4360,6 +4369,10 @@ void Driver::BuildActions(Compilation &C, DerivedArgList &Args,
     // Build the pipeline for this file.
     Action *Current = C.MakeAction<InputAction>(*InputArg, InputType);
 
+    if (CUIDOpts.isEnabled() && types::isSrcFile(InputType))
+      cast<InputAction>(Current)->setId(
+          CUIDOpts.getCUID(InputArg->getValue(), Args));
+
     // Use the current host action in any of the offloading actions, if
     // required.
     if (!UseNewOffloadingDriver)
@@ -4805,8 +4818,12 @@ Action *Driver::BuildOffloadingActions(Compilation &C,
         TCAndArchs.push_back(std::make_pair(TC, Arch));
     }
 
-    for (unsigned I = 0, E = TCAndArchs.size(); I != E; ++I)
-      DeviceActions.push_back(C.MakeAction<InputAction>(*InputArg, InputType));
+    for (unsigned I = 0, E = TCAndArchs.size(); I != E; ++I) {
+      auto *IA = C.MakeAction<InputAction>(*InputArg, InputType);
+      if (CUIDOpts.isEnabled())
+        IA->setId(CUIDOpts.getCUID(IA->getInputArg().getValue(), Args));
+      DeviceActions.push_back(IA);
+    }
 
     if (DeviceActions.empty())
       return HostAction;
diff --git a/clang/test/Driver/hip-cuid.hip b/clang/test/Driver/hip-cuid.hip
index 2e38c59ccf5ef1..48264950b22344 100644
--- a/clang/test/Driver/hip-cuid.hip
+++ b/clang/test/Driver/hip-cuid.hip
@@ -80,6 +80,17 @@
 // RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
 // RUN: 2>&1 | FileCheck -check-prefixes=DEVICE %s
 
+// Check cuid is supported by the new driver.
+// RUN: %clang -### -x hip \
+// RUN:   --target=x86_64-unknown-linux-gnu \
+// RUN:   --no-offload-new-driver \
+// RUN:   --offload-arch=gfx900 \
+// RUN:   --offload-arch=gfx906 \
+// RUN:   -c -nogpuinc -nogpulib --offload-new-driver \
+// RUN:   %S/Inputs/hip_multiple_inputs/a.cu \
+// RUN:   %S/Inputs/hip_multiple_inputs/b.hip \
+// RUN: 2>&1 | FileCheck -check-prefixes=COMMON,HEX %s
+
 // INVALID: invalid value 'invalid' in '-fuse-cuid=invalid'
 
 // COMMON: "-cc1"{{.*}} "-triple" "amdgcn-amd-amdhsa"

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the new driver just got some entropy from the current source file's inode entry? I'm not opposed, since I guess it would make sense to give the user a way to override it.

@yxsamliu
Copy link
Collaborator Author

I thought the new driver just got some entropy from the current source file's inode entry? I'm not opposed, since I guess it would make sense to give the user a way to override it.

Sorry I did not quite get your question. Do you suggest that we need some option to override the CUID behavior? Currently clang already has options to disable it or set it to a fixed value.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 14, 2025

I thought the new driver just got some entropy from the current source file's inode entry? I'm not opposed, since I guess it would make sense to give the user a way to override it.

Sorry I did not quite get your question. Do you suggest that we need some option to override the CUID behavior? Currently clang already has options to disable it or set it to a fixed value.

I just mean that we do 'handle' static variables currently in the new driver without CUID, it just makes a random string based off the file's inode value. The CUID would then be relevant here if the user needed to override that for some reason. (Maybe some utterly bizarre parallel compilation on a remote server or compiling the same file twice with a different macro set?)

@yxsamliu
Copy link
Collaborator Author

I thought the new driver just got some entropy from the current source file's inode entry? I'm not opposed, since I guess it would make sense to give the user a way to override it.

Sorry I did not quite get your question. Do you suggest that we need some option to override the CUID behavior? Currently clang already has options to disable it or set it to a fixed value.

I just mean that we do 'handle' static variables currently in the new driver without CUID, it just makes a random string based off the file's inode value. The CUID would then be relevant here if the user needed to override that for some reason. (Maybe some utterly bizarre parallel compilation on a remote server or compiling the same file twice with a different macro set?)

Some HIP apps (e.g. rccl) compiles the same file with different options to generate different object files. inode itself is not sufficient to identify a TU. A combination of file ID with compile options is needed to identify a TU. For CUDA/HIP, currently static device var needs to be externalized with a postfix of CUID to be able to be accessed in host code. This mechanism requires CUID to be set.

I would suggest we use the same mechanism to support static device variable in the old and new driver.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 14, 2025

I thought the new driver just got some entropy from the current source file's inode entry? I'm not opposed, since I guess it would make sense to give the user a way to override it.

Sorry I did not quite get your question. Do you suggest that we need some option to override the CUID behavior? Currently clang already has options to disable it or set it to a fixed value.

I just mean that we do 'handle' static variables currently in the new driver without CUID, it just makes a random string based off the file's inode value. The CUID would then be relevant here if the user needed to override that for some reason. (Maybe some utterly bizarre parallel compilation on a remote server or compiling the same file twice with a different macro set?)

Some HIP apps (e.g. rccl) compiles the same file with different options to generate different object files. inode itself is not sufficient to identify a TU. A combination of file ID with compile options is needed to identify a TU. For CUDA/HIP, currently static device var needs to be externalized with a postfix of CUID to be able to be accessed in host code. This mechanism requires CUID to be set.

I would suggest we use the same mechanism to support static device variable in the old and new driver.

Yeah, that's fine, but we should just default to the inode method if the argument isn't present I think. That way it works without needing this extra argument passed to both the compile phases.

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we get a new test showing CUDA? CUDA uses the new driver by default so cuid stuff will be missing there.

@yxsamliu
Copy link
Collaborator Author

Yeah, that's fine, but we should just default to the inode method if the argument isn't present I think. That way it works without needing this extra argument passed to both the compile phases.

That would break rccl. We should use CUID's that work in general, including apps that compile the same file with different options.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 14, 2025

That would break rccl. We should use CUID's that work in general, including apps that compile the same file with different options.

Shouldn't be broken if the driver handles the CUID, I just mean that it would work in cc1 if not passed.

@yxsamliu
Copy link
Collaborator Author

Can we get a new test showing CUDA? CUDA uses the new driver by default so cuid stuff will be missing there.

Added.

Copy link
Member

@Artem-B Artem-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in principle, but I'll defer to @jhuber6 as the new driver owner.

clang/lib/Driver/Driver.cpp Outdated Show resolved Hide resolved
clang/lib/Driver/Driver.cpp Outdated Show resolved Hide resolved
CUID is needed by CUDA/HIP for supporting accessing static device variables
in host function.

Currently CUID is only supported by the old driver for CUDA/HIP.
The new clang driver does not support it, which causes CUDA/HIP programs
using static device variables in host functions to fail with
the new driver for CUDA/HIP.

This patch refactors the CUID support in the old driver so that CUID
is supported by both the old and the new drivers for CUDA/HIP.
@yxsamliu yxsamliu merged commit 8ac35bd into llvm:main Jan 15, 2025
8 checks passed
paulhuggett pushed a commit to paulhuggett/llvm-project that referenced this pull request Jan 16, 2025
CUID is needed by CUDA/HIP for supporting accessing static device
variables in host function.

Currently CUID is only supported by the old driver for CUDA/HIP. The new
driver does not support it, which causes CUDA/HIP programs using static
device variables in host functions to fail with the new driver for
CUDA/HIP.

This patch refactors the CUID support in the old driver so that CUID is
supported by both the old and the new drivers for CUDA/HIP.
DKLoehr pushed a commit to DKLoehr/llvm-project that referenced this pull request Jan 17, 2025
CUID is needed by CUDA/HIP for supporting accessing static device
variables in host function.

Currently CUID is only supported by the old driver for CUDA/HIP. The new
driver does not support it, which causes CUDA/HIP programs using static
device variables in host functions to fail with the new driver for
CUDA/HIP.

This patch refactors the CUID support in the old driver so that CUID is
supported by both the old and the new drivers for CUDA/HIP.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants