This repository has been archived by the owner on Apr 23, 2021. It is now read-only.

[ROCm] Adding pass to generate the HSACO binary blob from the GPU kernel function #181

Open
wants to merge 1 commit into
base: master
7 changes: 7 additions & 0 deletions CMakeLists.txt
@@ -44,6 +44,13 @@ endif()

set(MLIR_CUDA_RUNNER_ENABLED 0 CACHE BOOL "Enable building the mlir CUDA runner")

# Build the ROCM conversions if the AMDGPU backend is available
if ("AMDGPU" IN_LIST LLVM_TARGETS_TO_BUILD)
set(MLIR_ROCM_CONVERSIONS_ENABLED 1)
else()
set(MLIR_ROCM_CONVERSIONS_ENABLED 0)
endif()
Contributor

This isn't correct I believe: at the moment you have a dependency on the environment (lld, ROCm runtime bitcode libraries, etc.)

As mentioned elsewhere: there should be a step detecting the environment (and we should probably put this behind an opt-in CMake flag by the way) and generating a rocm_config.h that can be included in the code.

Contributor Author

I have addressed this in the CMakeLists.txt file in the GPUToROCM dir

Contributor

Thanks! This is much better :)


include_directories( "include")
include_directories( ${MLIR_INCLUDE_DIR})

1 change: 1 addition & 0 deletions include/mlir/CMakeLists.txt
@@ -2,3 +2,4 @@ add_subdirectory(Analysis)
add_subdirectory(Dialect)
add_subdirectory(EDSC)
add_subdirectory(Transforms)
add_subdirectory(Conversion/GPUToROCM)
33 changes: 33 additions & 0 deletions include/mlir/Conversion/GPUToROCM/CMakeLists.txt
@@ -0,0 +1,33 @@
if(MLIR_ROCM_CONVERSIONS_ENABLED)

# Check whether the ROCm installation dir exists
set(ROCM_INSTALL_DIR "/opt/rocm" CACHE STRING "ROCm installation directory")
if (EXISTS ${ROCM_INSTALL_DIR})
message("-- ROCm Install Dir - ${ROCM_INSTALL_DIR}")
else()
message(SEND_ERROR "-- NOT FOUND : ROCm Install Dir - ${ROCM_INSTALL_DIR}")
Contributor

We shouldn't issue an error when this will be the normal behavior (a complete config on a machine without ROCM).

We should differentiate when the user is setting the value and when it is the default one. When the user provides the path to ROCM and we don't find it, we should just use FATAL_ERROR, otherwise a normal STATUS should be enough.

Can you also set the variables to empty in the else branch?

Finally, if ROCM is correctly detected, it seems like we should set a variable so that we can distinguish on it in the code and in lit tests as well: basically we have two levels that we should distinguish:

  1. AMDGPU backend is built-in: we can run the test pass.
  2. We have 1) and ROCM is installed/configured on the system. In this case we can run another test that exercises the linker invocation as well.

In lit you have config.run_rocm_tests at the moment, which is only indicating 1) above. We should have another flag for 2), and a macro in the code as well ideally.
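
The two levels described above could be surfaced to C++ code roughly as follows. This is a minimal sketch and not part of the PR: the macro names `MLIR_ROCM_BACKEND_ENABLED` and `MLIR_ROCM_TOOLCHAIN_FOUND`, the `RocmSupport` enum, and the helper functions are all hypothetical, and the fallback `#define`s only stand in for values that a generated config header would provide.

```cpp
// Hypothetical macros (assumption): a generated config header would set these.
// Level 1: the AMDGPU backend is built into LLVM.
#ifndef MLIR_ROCM_BACKEND_ENABLED
#define MLIR_ROCM_BACKEND_ENABLED 1
#endif
// Level 2: a ROCm install (device libraries and linker) was detected.
#ifndef MLIR_ROCM_TOOLCHAIN_FOUND
#define MLIR_ROCM_TOOLCHAIN_FOUND 0
#endif

// Hypothetical classification of what the build can exercise.
enum class RocmSupport { None, BackendOnly, FullToolchain };

// Level 2 only counts when level 1 also holds.
constexpr RocmSupport classifyRocmSupport(bool backendEnabled,
                                          bool toolchainFound) {
  return !backendEnabled ? RocmSupport::None
         : toolchainFound ? RocmSupport::FullToolchain
                          : RocmSupport::BackendOnly;
}

// The test-mode pass only needs the backend.
constexpr bool canRunTestPass(RocmSupport s) {
  return s != RocmSupport::None;
}

// Tests that invoke the linker need the full toolchain.
constexpr bool canRunLinkerTests(RocmSupport s) {
  return s == RocmSupport::FullToolchain;
}

// Support level of this particular build, derived from the macros above.
constexpr RocmSupport kRocmSupport = classifyRocmSupport(
    MLIR_ROCM_BACKEND_ENABLED != 0, MLIR_ROCM_TOOLCHAIN_FOUND != 0);
```

A lit configuration could expose the same two booleans as separate features, so that linker-dependent tests are skipped on machines without ROCm instead of failing.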

endif()

# Check whether the ROCm device library dir exists
set(ROCM_DEVICE_LIB_DIR ${ROCM_INSTALL_DIR}/lib)
if (EXISTS ${ROCM_DEVICE_LIB_DIR})
message("-- ROCm Device Library Dir - ${ROCM_DEVICE_LIB_DIR}")
else ()
message(SEND_ERROR "-- NOT FOUND : ROCm Device Library Dir - ${ROCM_DEVICE_LIB_DIR}")
endif()

# Check whether the ROCm HCC linker exists
set(ROCM_HCC_LINKER ${ROCM_INSTALL_DIR}/hcc/bin/ld.lld)
if (EXISTS ${ROCM_HCC_LINKER})
message("-- ROCm HCC Linker - ${ROCM_HCC_LINKER}")
else ()
message(SEND_ERROR "-- NOT FOUND : ROCm HCC Linker - ${ROCM_HCC_LINKER}")
endif()

# Generate the ROCm Configuration header file
configure_file(
"${CMAKE_CURRENT_SOURCE_DIR}/ROCMConfig.h.in"
"${CMAKE_CURRENT_BINARY_DIR}/ROCMConfig.h"
)

endif()
92 changes: 92 additions & 0 deletions include/mlir/Conversion/GPUToROCM/GPUToROCMPass.h
@@ -0,0 +1,92 @@
//===- GPUToROCMPass.h - MLIR ROCm runtime support --------------*- C++ -*-===//
//
// Copyright 2019 The MLIR Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// =============================================================================
#ifndef MLIR_CONVERSION_GPUTOROCM_GPUTOROCMPASS_H_
#define MLIR_CONVERSION_GPUTOROCM_GPUTOROCMPASS_H_

#include <functional>
#include <memory>
#include <string>
#include <vector>

#include "mlir/Conversion/GPUToROCM/ROCMConfig.h"

namespace mlir {

namespace rocm {

/// string constants used by the ROCM backend
static constexpr const char *kHSACOAnnotation = "amdgpu.hsaco";
static constexpr const char *kHSACOGetterAnnotation = "amdgpu.hsacogetter";
static constexpr const char *kHSACOGetterSuffix = "_hsaco";
static constexpr const char *kHSACOStorageSuffix = "_hsaco_cst";
Contributor

I didn't find a use for kHSACOGetterSuffix and kHSACOStorageSuffix?

Also, are all these definitions intended to be defined in the public header, or are they more of an internal detail of the pass (and could be in the implementation file)?


/// enum to represent the AMD GPU versions supported by the ROCM backend
enum class AMDGPUVersion { GFX900 };

/// enum to represent the HSA Code Object versions supported by the ROCM backend
enum class HSACOVersion { V3 };

/// Configurable parameters for generating the HSACO blobs from GPU Kernels
struct HSACOGeneratorConfig {

/// Constructor - sets the default values for the configurable parameters
HSACOGeneratorConfig(bool isTestMode)
: testMode(isTestMode), amdgpuVersion(AMDGPUVersion::GFX900),
hsacoVersion(HSACOVersion::V3), rocdlDir(ROCM_DEVICE_LIB_DIR),
linkerPath(ROCM_HCC_LINKER) {}

/// testMode == true skips the HSACO generation process and simply returns
/// the string "HSACO" as the HSACO blob
bool testMode;

/// the AMDGPU version for which to generate the HSACO
AMDGPUVersion amdgpuVersion;

/// the code object version for the generated HSACO
HSACOVersion hsacoVersion;

/// the directory containing the ROCDL bitcode libraries
std::string rocdlDir;

/// the path to the ld.lld linker to use when generating the HSACO
std::string linkerPath;
};

} // namespace rocm

// unique pointer to the HSA Code Object (which is stored as char vector)
using OwnedHSACO = std::unique_ptr<std::vector<char>>;
joker-eph marked this conversation as resolved.

class ModuleOp;
template <typename T>
class OpPassBase;

/// Creates a pass to convert kernel functions into HSA Code Object blobs.
///
/// This transformation takes the body of each function that is annotated with
/// the amdgpu_kernel calling convention, copies it to a new LLVM module,
/// compiles the module with help of the AMDGPU backend to GCN ISA, and then
/// invokes lld to produce a binary blob in HSA Code Object format. Such blob
/// is then attached as a string attribute named 'amdgpu.hsaco' to the kernel
/// function. After the transformation, the body of the kernel function is
/// removed (i.e., it is turned into a declaration).
std::unique_ptr<OpPassBase<ModuleOp>> createConvertGPUKernelToHSACOPass(
rocm::HSACOGeneratorConfig hsacoGeneratorConfig);

} // namespace mlir

#endif // MLIR_CONVERSION_GPUTOROCM_GPUTOROCMPASS_H_
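
A minimal sketch of how a caller might build the configuration declared in this header. It is an illustration under stated assumptions, not code from the PR: the struct is repeated in a `sketch` namespace so the snippet compiles on its own, the two macro values are placeholders for whatever the generated `ROCMConfig.h` would contain, and `makeConfigForInstall` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Placeholder values standing in for the generated ROCMConfig.h (assumption).
#define ROCM_DEVICE_LIB_DIR "/opt/rocm/lib"
#define ROCM_HCC_LINKER "/opt/rocm/hcc/bin/ld.lld"

namespace sketch {

enum class AMDGPUVersion { GFX900 };
enum class HSACOVersion { V3 };

// Standalone copy of the struct from the header above.
struct HSACOGeneratorConfig {
  HSACOGeneratorConfig(bool isTestMode)
      : testMode(isTestMode), amdgpuVersion(AMDGPUVersion::GFX900),
        hsacoVersion(HSACOVersion::V3), rocdlDir(ROCM_DEVICE_LIB_DIR),
        linkerPath(ROCM_HCC_LINKER) {}

  bool testMode;
  AMDGPUVersion amdgpuVersion;
  HSACOVersion hsacoVersion;
  std::string rocdlDir;
  std::string linkerPath;
};

// Hypothetical helper: start from the defaults, then point the config at a
// ROCm install in a non-default location. The sub-paths mirror the ones used
// in the CMake file of this PR.
inline HSACOGeneratorConfig makeConfigForInstall(const std::string &rocmRoot) {
  assert(!rocmRoot.empty() && "ROCm root must not be empty");
  HSACOGeneratorConfig config(/*isTestMode=*/false);
  config.rocdlDir = rocmRoot + "/lib";
  config.linkerPath = rocmRoot + "/hcc/bin/ld.lld";
  return config;
}

} // namespace sketch
```

In the PR itself, the resulting object would be passed to `createConvertGPUKernelToHSACOPass`. Constructing the config with `isTestMode` set to `true` leaves the paths at their defaults, since the test mode skips the linker invocation.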
30 changes: 30 additions & 0 deletions include/mlir/Conversion/GPUToROCM/ROCMConfig.h.in
@@ -0,0 +1,30 @@
//===- ROCMConfig.h - ROCm Configuration Header -----------------*- C++ -*-===//
//
// Copyright 2019 The MLIR Authors.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// =============================================================================
#ifndef MLIR_CONVERSION_GPUTOROCM_ROCMCONFIG_H_
#define MLIR_CONVERSION_GPUTOROCM_ROCMCONFIG_H_

/// The code to generate the HSACO binary blobs (corresponding to GPU kernels)
/// assumes the presense of ROCm libraries/utilities. The location of these
Contributor

Suggested change
- /// assumes the presense of ROCm libraries/utilities. The location of these
+ /// assumes the presence of ROCm libraries/utilities. The location of these

/// tools is configured via cmake

/// Path to the ROCm Device Library dir in the ROCM install
#cmakedefine ROCM_DEVICE_LIB_DIR "@ROCM_DEVICE_LIB_DIR@"

/// Path to the HCC Linker in the ROCM install
#cmakedefine ROCM_HCC_LINKER "@ROCM_HCC_LINKER@"

#endif // MLIR_CONVERSION_GPUTOROCM_ROCMCONFIG_H_
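
One property of `#cmakedefine` is worth keeping in mind here: when the CMake variable is unset, empty, or otherwise false, `configure_file` emits a commented-out `#undef` instead of a `#define`, so any code that references the macro unconditionally stops compiling. The sketch below shows one possible way to guard against that. It is an assumption-laden illustration, not code from the PR: the empty-string fallbacks and the helpers `isConfigured` and `hasRocmToolchain` are hypothetical.

```cpp
// Stand-ins for the generated ROCMConfig.h (assumption): if CMake left a
// macro undefined, fall back to an empty string so that code referencing
// the macro still compiles.
#ifndef ROCM_DEVICE_LIB_DIR
#define ROCM_DEVICE_LIB_DIR ""
#endif
#ifndef ROCM_HCC_LINKER
#define ROCM_HCC_LINKER ""
#endif

// Hypothetical helper: a path counts as configured when it is non-empty.
constexpr bool isConfigured(const char *path) {
  return path != nullptr && path[0] != '\0';
}

// Hypothetical helper: both the device libraries and the linker are needed
// before an HSACO blob can actually be produced.
constexpr bool hasRocmToolchain(const char *deviceLibDir, const char *linker) {
  return isConfigured(deviceLibDir) && isConfigured(linker);
}

// True only when both paths were filled in at configure time.
constexpr bool kRocmToolchainConfigured =
    hasRocmToolchain(ROCM_DEVICE_LIB_DIR, ROCM_HCC_LINKER);
```

A constant like `kRocmToolchainConfigured` could then decide whether the pass attempts the real linker invocation or reports that ROCm was not found when the project was configured.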
1 change: 1 addition & 0 deletions lib/Conversion/CMakeLists.txt
@@ -1,5 +1,6 @@
add_subdirectory(GPUToCUDA)
add_subdirectory(GPUToNVVM)
add_subdirectory(GPUToROCM)
add_subdirectory(GPUToROCDL)
add_subdirectory(GPUToSPIRV)
add_subdirectory(LoopsToGPU)
15 changes: 15 additions & 0 deletions lib/Conversion/GPUToROCM/CMakeLists.txt
@@ -0,0 +1,15 @@
if(MLIR_ROCM_CONVERSIONS_ENABLED)
llvm_map_components_to_libnames(amdgpu "AMDGPU")

add_llvm_library(MLIRGPUtoROCMTransforms
ConvertKernelFuncToHSACO.cpp
)
target_link_libraries(MLIRGPUtoROCMTransforms
MLIRGPU
MLIRLLVMIR
MLIRROCDLIR
MLIRPass
MLIRTargetROCDLIR
${amdgpu}
)
endif()