Commit 058d9ba

yxsamliu authored and mschwaig committed
backport 7e28234
Reland "[HIP] Support compressing device binary"

Original PR: llvm#67162

The commit was reverted due to UB detected by the sanitizer:

https://lab.llvm.org/buildbot/#/builders/238/builds/5955

clang/lib/Driver/OffloadBundler.cpp:1012:25: runtime error: load of misaligned address 0xaaaae2d90e7c for type 'const uint64_t' (aka 'const unsigned long'), which requires 8 byte alignment

It was fixed by using memcpy instead of dereferencing an int* cast from an unaligned char*.

Co-Authored-By: Martin Schwaighofer <mschwaig@users.noreply.github.com> (only did the backport)
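
The fix described in the commit message can be illustrated with a minimal sketch. The helper name `loadU64` is hypothetical and is not taken from `OffloadBundler.cpp`; it only shows the general memcpy pattern for reading a 64-bit value from a byte pointer that may not be 8-byte aligned.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the code from the commit): reads a uint64_t from
// a byte pointer that is not guaranteed to be 8-byte aligned.
inline std::uint64_t loadU64(const char *Ptr) {
  std::uint64_t Value;
  // memcpy copies the object representation byte by byte, so it is
  // well-defined for any source alignment.
  std::memcpy(&Value, Ptr, sizeof(Value));
  return Value;
}

// The kind of load the sanitizer reports is a direct dereference, which is
// undefined behaviour when Ptr is not suitably aligned:
//   std::uint64_t Value = *reinterpret_cast<const std::uint64_t *>(Ptr);
```

The address in the sanitizer report ends in `0xc`, so it is 4-byte aligned but not 8-byte aligned. Calling `loadU64(Buf + 4)` on an 8-byte-aligned buffer `Buf` exercises the same situation safely.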
1 parent af27734 commit 058d9ba

16 files changed (+659, -60 lines changed)

Diff for: clang/docs/ClangOffloadBundler.rst

+27 lines
@@ -498,3 +498,30 @@ target by comparing bundle ID's. Two bundle ID's are considered compatible if:
 Verbose printing of matched/unmatched comparisons between bundle entry id of
 a device binary from HDA and bundle entry ID of a given target processor
 (see :ref:`compatibility-bundle-entry-id`).
+
+Compression and Decompression
+=============================
+
+``clang-offload-bundler`` provides features to compress and decompress the full
+bundle, leveraging inherent redundancies within the bundle entries. Use the
+`-compress` command-line option to enable this compression capability.
+
+The compressed offload bundle begins with a header followed by the compressed binary data:
+
+- **Magic Number (4 bytes)**:
+  This is a unique identifier to distinguish compressed offload bundles. The value is the string 'CCOB' (Compressed Clang Offload Bundle).
+
+- **Version Number (16-bit unsigned int)**:
+  This denotes the version of the compressed offload bundle format. The current version is `1`.
+
+- **Compression Method (16-bit unsigned int)**:
+  This field indicates the compression method used. The value corresponds to either `zlib` or `zstd`, represented as a 16-bit unsigned integer cast from the LLVM compression enumeration.
+
+- **Uncompressed Binary Size (32-bit unsigned int)**:
+  This is the size (in bytes) of the binary data before it was compressed.
+
+- **Hash (64-bit unsigned int)**:
+  This is a 64-bit truncated MD5 hash of the uncompressed binary data. It serves for verification and caching purposes.
+
+- **Compressed Data**:
+  The actual compressed binary data follows the header. Its size can be inferred from the total size of the file minus the header size.

Diff for: clang/include/clang/Driver/OffloadBundler.h

+37 lines
@@ -19,18 +19,23 @@
 
 #include "llvm/Support/Error.h"
 #include "llvm/TargetParser/Triple.h"
+#include <llvm/Support/MemoryBuffer.h>
 #include <string>
 #include <vector>
 
 namespace clang {
 
 class OffloadBundlerConfig {
 public:
+  OffloadBundlerConfig();
+
   bool AllowNoHost = false;
   bool AllowMissingBundles = false;
   bool CheckInputArchive = false;
   bool PrintExternalCommands = false;
   bool HipOpenmpCompatible = false;
+  bool Compress = false;
+  bool Verbose = false;
 
   unsigned BundleAlignment = 1;
   unsigned HostInputIndex = ~0u;
@@ -82,6 +87,38 @@ struct OffloadTargetInfo {
   std::string str() const;
 };
 
+// CompressedOffloadBundle represents the format for the compressed offload
+// bundles.
+//
+// The format is as follows:
+// - Magic Number (4 bytes) - A constant "CCOB".
+// - Version (2 bytes)
+// - Compression Method (2 bytes) - Uses the values from
+// llvm::compression::Format.
+// - Uncompressed Size (4 bytes).
+// - Truncated MD5 Hash (8 bytes).
+// - Compressed Data (variable length).
+
+class CompressedOffloadBundle {
+private:
+  static inline const size_t MagicSize = 4;
+  static inline const size_t VersionFieldSize = sizeof(uint16_t);
+  static inline const size_t MethodFieldSize = sizeof(uint16_t);
+  static inline const size_t SizeFieldSize = sizeof(uint32_t);
+  static inline const size_t HashFieldSize = 8;
+  static inline const size_t HeaderSize = MagicSize + VersionFieldSize +
+                                          MethodFieldSize + SizeFieldSize +
+                                          HashFieldSize;
+  static inline const llvm::StringRef MagicNumber = "CCOB";
+  static inline const uint16_t Version = 1;
+
+public:
+  static llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
+  compress(const llvm::MemoryBuffer &Input, bool Verbose = false);
+  static llvm::Expected<std::unique_ptr<llvm::MemoryBuffer>>
+  decompress(const llvm::MemoryBuffer &Input, bool Verbose = false);
+};
+
 } // namespace clang
 
 #endif // LLVM_CLANG_DRIVER_OFFLOADBUNDLER_H

Diff for: clang/include/clang/Driver/Options.td

+5 lines
@@ -984,6 +984,11 @@ def fconvergent_functions : Flag<["-"], "fconvergent-functions">, Group<f_Group>
 def gpu_use_aux_triple_only : Flag<["--"], "gpu-use-aux-triple-only">,
   InternalDriverOpt, HelpText<"Prepare '-aux-triple' only without populating "
                               "'-aux-target-cpu' and '-aux-target-feature'.">;
+
+def offload_compress : Flag<["--"], "offload-compress">,
+  HelpText<"Compress offload device binaries (HIP only)">;
+def no_offload_compress : Flag<["--"], "no-offload-compress">;
+
 def cuda_include_ptx_EQ : Joined<["--"], "cuda-include-ptx=">, Flags<[NoXarchOption]>,
   HelpText<"Include PTX for the following GPU architecture (e.g. sm_35) or 'all'. May be specified more than once.">;
 def no_cuda_include_ptx_EQ : Joined<["--"], "no-cuda-include-ptx=">, Flags<[NoXarchOption]>,
