-
Notifications
You must be signed in to change notification settings - Fork 12.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RISCV] CodeGen of RVE and ilp32e/lp64e ABIs #76777
Conversation
@llvm/pr-subscribers-clang-driver @llvm/pr-subscribers-clang Author: Wang Pengcheng (wangpc-pp) ChangesThis commit includes the necessary changes to clang and LLVM to support The differences between
The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
If To be compatible with the implementation of ilp32e in GCC, we don't use FastCC is also supported on RVE, while GHC isn't since there is only one Differential Revision: https://reviews.llvm.org/D70401 Patch is 422.14 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76777.diff 45 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0c8fec691bf3c9..aa8d27f5c17551 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -971,6 +971,8 @@ RISC-V Support
^^^^^^^^^^^^^^
- Unaligned memory accesses can be toggled by ``-m[no-]unaligned-access`` or the
aliases ``-m[no-]strict-align``.
+- CodeGen of RV32E/RV64E are supported experimentally.
+- CodeGen of ilp32e/lp64e are supported experimentally.
- Default ABI with F but without D was changed to ilp32f for RV32 and to lp64f
for RV64.
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 6bc57a83a2d5ae..8db989df04c87b 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -154,7 +154,7 @@ void RISCVTargetInfo::getTargetDefines(const LangOptions &Opts,
else
Builder.defineMacro("__riscv_float_abi_soft");
- if (ABIName == "ilp32e")
+ if (ABIName == "ilp32e" || ABIName == "lp64e")
Builder.defineMacro("__riscv_abi_rve");
Builder.defineMacro("__riscv_arch_test");
@@ -215,6 +215,13 @@ void RISCVTargetInfo::getTargetDefines(const LangOptions &Opts,
Builder.defineMacro("__riscv_misaligned_fast");
else
Builder.defineMacro("__riscv_misaligned_avoid");
+
+ if (ISAInfo->hasExtension("e")) {
+ if (Is64Bit)
+ Builder.defineMacro("__riscv_64e");
+ else
+ Builder.defineMacro("__riscv_32e");
+ }
}
static constexpr Builtin::Info BuiltinInfo[] = {
@@ -386,6 +393,11 @@ bool RISCVTargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
if (llvm::is_contained(Features, "+experimental"))
HasExperimental = true;
+ if (ABI == "ilp32e" && ISAInfo->hasExtension("d")) {
+ Diags.Report(diag::err_invalid_feature_combination)
+ << "ILP32E must not be used with the D ISA extension";
+ return false;
+ }
return true;
}
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index f98c88cd45f831..bfbdafb682c851 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -132,6 +132,12 @@ class LLVM_LIBRARY_VISIBILITY RISCV32TargetInfo : public RISCVTargetInfo {
}
bool setABI(const std::string &Name) override {
+ if (Name == "ilp32e") {
+ ABI = Name;
+ resetDataLayout("e-m:e-p:32:32-i64:64-n32-S32");
+ return true;
+ }
+
if (Name == "ilp32" || Name == "ilp32f" || Name == "ilp32d") {
ABI = Name;
return true;
@@ -156,6 +162,12 @@ class LLVM_LIBRARY_VISIBILITY RISCV64TargetInfo : public RISCVTargetInfo {
}
bool setABI(const std::string &Name) override {
+ if (Name == "lp64e") {
+ ABI = Name;
+ resetDataLayout("e-m:e-p:64:64-i64:64-i128:128-n32:64-S64");
+ return true;
+ }
+
if (Name == "lp64" || Name == "lp64f" || Name == "lp64d") {
ABI = Name;
return true;
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index d78f2594a23764..4e999ddf9fe027 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -229,7 +229,8 @@ createTargetCodeGenInfo(CodeGenModule &CGM) {
ABIFLen = 32;
else if (ABIStr.ends_with("d"))
ABIFLen = 64;
- return createRISCVTargetCodeGenInfo(CGM, XLen, ABIFLen);
+ bool EABI = ABIStr.endswith("e");
+ return createRISCVTargetCodeGenInfo(CGM, XLen, ABIFLen, EABI);
}
case llvm::Triple::systemz: {
diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h
index 0c0781a2d5ab9d..7682f197041c74 100644
--- a/clang/lib/CodeGen/TargetInfo.h
+++ b/clang/lib/CodeGen/TargetInfo.h
@@ -496,7 +496,8 @@ createPPC64_SVR4_TargetCodeGenInfo(CodeGenModule &CGM, PPC64_SVR4_ABIKind Kind,
bool SoftFloatABI);
std::unique_ptr<TargetCodeGenInfo>
-createRISCVTargetCodeGenInfo(CodeGenModule &CGM, unsigned XLen, unsigned FLen);
+createRISCVTargetCodeGenInfo(CodeGenModule &CGM, unsigned XLen, unsigned FLen,
+ bool EABI);
std::unique_ptr<TargetCodeGenInfo>
createCommonSPIRTargetCodeGenInfo(CodeGenModule &CGM);
diff --git a/clang/lib/CodeGen/Targets/RISCV.cpp b/clang/lib/CodeGen/Targets/RISCV.cpp
index 1e1d249b37ac06..0851d1993d0c0f 100644
--- a/clang/lib/CodeGen/Targets/RISCV.cpp
+++ b/clang/lib/CodeGen/Targets/RISCV.cpp
@@ -25,8 +25,9 @@ class RISCVABIInfo : public DefaultABIInfo {
// ISA might have a wider FLen than the selected ABI (e.g. an RV32IF target
// with soft float ABI has FLen==0).
unsigned FLen;
- static const int NumArgGPRs = 8;
- static const int NumArgFPRs = 8;
+ const int NumArgGPRs;
+ const int NumArgFPRs;
+ const bool EABI;
bool detectFPCCEligibleStructHelper(QualType Ty, CharUnits CurOff,
llvm::Type *&Field1Ty,
CharUnits &Field1Off,
@@ -34,8 +35,10 @@ class RISCVABIInfo : public DefaultABIInfo {
CharUnits &Field2Off) const;
public:
- RISCVABIInfo(CodeGen::CodeGenTypes &CGT, unsigned XLen, unsigned FLen)
- : DefaultABIInfo(CGT), XLen(XLen), FLen(FLen) {}
+ RISCVABIInfo(CodeGen::CodeGenTypes &CGT, unsigned XLen, unsigned FLen,
+ bool EABI)
+ : DefaultABIInfo(CGT), XLen(XLen), FLen(FLen), NumArgGPRs(EABI ? 6 : 8),
+ NumArgFPRs(FLen != 0 ? 8 : 0), EABI(EABI) {}
// DefaultABIInfo's classifyReturnType and classifyArgumentType are
// non-virtual, but computeInfo is virtual, so we overload it.
@@ -86,7 +89,7 @@ void RISCVABIInfo::computeInfo(CGFunctionInfo &FI) const {
}
int ArgGPRsLeft = IsRetIndirect ? NumArgGPRs - 1 : NumArgGPRs;
- int ArgFPRsLeft = FLen ? NumArgFPRs : 0;
+ int ArgFPRsLeft = NumArgFPRs;
int NumFixedArgs = FI.getNumRequiredArgs();
int ArgNum = 0;
@@ -396,9 +399,12 @@ ABIArgInfo RISCVABIInfo::classifyArgumentType(QualType Ty, bool IsFixed,
// Determine the number of GPRs needed to pass the current argument
// according to the ABI. 2*XLen-aligned varargs are passed in "aligned"
// register pairs, so may consume 3 registers.
+ // TODO: To be compatible with GCC's behaviors, we don't align registers
+ // currently if we are using ILP32E calling convention. This behavior may be
+ // changed when RV32E/ILP32E is ratified.
int NeededArgGPRs = 1;
if (!IsFixed && NeededAlign == 2 * XLen)
- NeededArgGPRs = 2 + (ArgGPRsLeft % 2);
+ NeededArgGPRs = 2 + (EABI && XLen == 32 ? 0 : (ArgGPRsLeft % 2));
else if (Size > XLen && Size <= 2 * XLen)
NeededArgGPRs = 2;
@@ -480,6 +486,13 @@ Address RISCVABIInfo::EmitVAArg(CodeGenFunction &CGF, Address VAListAddr,
auto TInfo = getContext().getTypeInfoInChars(Ty);
+ // TODO: To be compatible with GCC's behaviors, we force arguments with
+ // 2×XLEN-bit alignment and size at most 2×XLEN bits like `long long`,
+ // `unsigned long long` and `double` to have 4-byte alignment. This
+ // behavior may be changed when RV32E/ILP32E is ratified.
+ if (EABI && XLen == 32)
+ TInfo.Align = std::min(TInfo.Align, CharUnits::fromQuantity(4));
+
// Arguments bigger than 2*Xlen bytes are passed indirectly.
bool IsIndirect = TInfo.Width > 2 * SlotSize;
@@ -499,8 +512,9 @@ namespace {
class RISCVTargetCodeGenInfo : public TargetCodeGenInfo {
public:
RISCVTargetCodeGenInfo(CodeGen::CodeGenTypes &CGT, unsigned XLen,
- unsigned FLen)
- : TargetCodeGenInfo(std::make_unique<RISCVABIInfo>(CGT, XLen, FLen)) {}
+ unsigned FLen, bool EABI)
+ : TargetCodeGenInfo(
+ std::make_unique<RISCVABIInfo>(CGT, XLen, FLen, EABI)) {}
void setTargetAttributes(const Decl *D, llvm::GlobalValue *GV,
CodeGen::CodeGenModule &CGM) const override {
@@ -526,6 +540,7 @@ class RISCVTargetCodeGenInfo : public TargetCodeGenInfo {
std::unique_ptr<TargetCodeGenInfo>
CodeGen::createRISCVTargetCodeGenInfo(CodeGenModule &CGM, unsigned XLen,
- unsigned FLen) {
- return std::make_unique<RISCVTargetCodeGenInfo>(CGM.getTypes(), XLen, FLen);
+ unsigned FLen, bool EABI) {
+ return std::make_unique<RISCVTargetCodeGenInfo>(CGM.getTypes(), XLen, FLen,
+ EABI);
}
diff --git a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
index 0717e3b813e1e2..76fedc576b8656 100644
--- a/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/RISCV.cpp
@@ -210,6 +210,7 @@ StringRef riscv::getRISCVABI(const ArgList &Args, const llvm::Triple &Triple) {
// rv32e -> ilp32e
// rv32* -> ilp32
// rv64g | rv64*d -> lp64d
+ // rv64e -> lp64e
// rv64* -> lp64
StringRef Arch = getRISCVArch(Args, Triple);
@@ -285,6 +286,7 @@ StringRef riscv::getRISCVArch(const llvm::opt::ArgList &Args,
// 3. Choose a default based on `-mabi=`
//
// ilp32e -> rv32e
+ // lp64e -> rv64e
// ilp32 | ilp32f | ilp32d -> rv32imafdc
// lp64 | lp64f | lp64d -> rv64imafdc
if (const Arg *A = Args.getLastArg(options::OPT_mabi_EQ)) {
@@ -292,6 +294,8 @@ StringRef riscv::getRISCVArch(const llvm::opt::ArgList &Args,
if (MABI.equals_insensitive("ilp32e"))
return "rv32e";
+ else if (MABI.starts_with_insensitive("lp64e"))
+ return "rv64e";
else if (MABI.starts_with_insensitive("ilp32"))
return "rv32imafdc";
else if (MABI.starts_with_insensitive("lp64")) {
diff --git a/clang/test/CodeGen/RISCV/riscv32-abi.c b/clang/test/CodeGen/RISCV/riscv32-abi.c
index 040ae500fc60e4..3645e15ccefcdc 100644
--- a/clang/test/CodeGen/RISCV/riscv32-abi.c
+++ b/clang/test/CodeGen/RISCV/riscv32-abi.c
@@ -5,6 +5,8 @@
// RUN: | FileCheck -check-prefixes=ILP32-ILP32F-ILP32D,ILP32F-ILP32D,ILP32-ILP32F,ILP32F %s
// RUN: %clang_cc1 -triple riscv32 -target-feature +f -target-feature +d -target-abi ilp32d -emit-llvm %s -o - \
// RUN: | FileCheck -check-prefixes=ILP32-ILP32F-ILP32D,ILP32F-ILP32D,ILP32D %s
+// RUN: %clang_cc1 -triple riscv32 -emit-llvm -target-abi ilp32e %s -o - \
+// RUN: | FileCheck -check-prefixes=ILP32-ILP32F-ILP32D,ILP32-ILP32F,ILP32,ILP32E %s
#include <stddef.h>
#include <stdint.h>
@@ -2064,4 +2066,5 @@ union float16_u f_ret_float16_u(void) {
}
//// NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+// ILP32E: {{.*}}
// ILP32F: {{.*}}
diff --git a/clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c b/clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c
new file mode 100644
index 00000000000000..86444112bde297
--- /dev/null
+++ b/clang/test/CodeGen/RISCV/riscv32-ilp32e-error.c
@@ -0,0 +1,4 @@
+// RUN: not %clang_cc1 -triple riscv32 -target-feature +d -emit-llvm -target-abi ilp32e %s 2>&1 \
+// RUN: | FileCheck -check-prefix=ILP32E-WITH-FD %s
+
+// ILP32E-WITH-FD: error: invalid feature combination: ILP32E must not be used with the D ISA extension
diff --git a/clang/test/CodeGen/RISCV/riscv32-vararg.c b/clang/test/CodeGen/RISCV/riscv32-vararg.c
index 02b1ed38e26556..1c4e41f2f54c8f 100644
--- a/clang/test/CodeGen/RISCV/riscv32-vararg.c
+++ b/clang/test/CodeGen/RISCV/riscv32-vararg.c
@@ -1,9 +1,11 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
// RUN: %clang_cc1 -triple riscv32 -emit-llvm %s -o - | FileCheck %s
// RUN: %clang_cc1 -triple riscv32 -target-feature +f -target-abi ilp32f -emit-llvm %s -o - \
-// RUN: | FileCheck %s
+// RUN: | FileCheck %s -check-prefixes=CHECK,CHECK-ILP32F
// RUN: %clang_cc1 -triple riscv32 -target-feature +d -target-feature +f -target-abi ilp32d -emit-llvm %s -o - \
-// RUN: | FileCheck %s
+// RUN: | FileCheck %s -check-prefixes=CHECK,CHECK-ILP32D
+// RUN: %clang_cc1 -triple riscv32 -target-abi ilp32e -emit-llvm %s -o - \
+// RUN: | FileCheck %s -check-prefixes=CHECK,CHECK-ILP32E
#include <stddef.h>
#include <stdint.h>
@@ -102,24 +104,60 @@ int f_va_1(char *fmt, ...) {
// used to pass varargs with 2x xlen alignment and 2x xlen size. Ensure the
// correct offsets are used.
-// CHECK-LABEL: define dso_local double @f_va_2
-// CHECK-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
-// CHECK-NEXT: entry:
-// CHECK-NEXT: [[FMT_ADDR:%.*]] = alloca ptr, align 4
-// CHECK-NEXT: [[VA:%.*]] = alloca ptr, align 4
-// CHECK-NEXT: [[V:%.*]] = alloca double, align 8
-// CHECK-NEXT: store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
-// CHECK-NEXT: call void @llvm.va_start(ptr [[VA]])
-// CHECK-NEXT: [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
-// CHECK-NEXT: [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
-// CHECK-NEXT: [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
-// CHECK-NEXT: store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
-// CHECK-NEXT: store double [[TMP1]], ptr [[V]], align 8
-// CHECK-NEXT: call void @llvm.va_end(ptr [[VA]])
-// CHECK-NEXT: [[TMP2:%.*]] = load double, ptr [[V]], align 8
-// CHECK-NEXT: ret double [[TMP2]]
+// CHECK-ILP32F-LABEL: define dso_local double @f_va_2
+// CHECK-ILP32F-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32F-NEXT: entry:
+// CHECK-ILP32F-NEXT: [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT: [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT: [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32F-NEXT: store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32F-NEXT: call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32F-NEXT: [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
+// CHECK-ILP32F-NEXT: [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
+// CHECK-ILP32F-NEXT: [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
+// CHECK-ILP32F-NEXT: store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT: [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
+// CHECK-ILP32F-NEXT: store double [[TMP1]], ptr [[V]], align 8
+// CHECK-ILP32F-NEXT: call void @llvm.va_end(ptr [[VA]])
+// CHECK-ILP32F-NEXT: [[TMP2:%.*]] = load double, ptr [[V]], align 8
+// CHECK-ILP32F-NEXT: ret double [[TMP2]]
+//
+// CHECK-ILP32D-LABEL: define dso_local double @f_va_2
+// CHECK-ILP32D-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32D-NEXT: entry:
+// CHECK-ILP32D-NEXT: [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32D-NEXT: [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32D-NEXT: [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32D-NEXT: store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32D-NEXT: call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32D-NEXT: [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32D-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
+// CHECK-ILP32D-NEXT: [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
+// CHECK-ILP32D-NEXT: [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
+// CHECK-ILP32D-NEXT: store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32D-NEXT: [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
+// CHECK-ILP32D-NEXT: store double [[TMP1]], ptr [[V]], align 8
+// CHECK-ILP32D-NEXT: call void @llvm.va_end(ptr [[VA]])
+// CHECK-ILP32D-NEXT: [[TMP2:%.*]] = load double, ptr [[V]], align 8
+// CHECK-ILP32D-NEXT: ret double [[TMP2]]
+//
+// CHECK-ILP32E-LABEL: define dso_local double @f_va_2
+// CHECK-ILP32E-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32E-NEXT: entry:
+// CHECK-ILP32E-NEXT: [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32E-NEXT: [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32E-NEXT: [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32E-NEXT: store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32E-NEXT: call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32E-NEXT: [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32E-NEXT: [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 8
+// CHECK-ILP32E-NEXT: store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32E-NEXT: [[TMP0:%.*]] = load double, ptr [[ARGP_CUR]], align 4
+// CHECK-ILP32E-NEXT: store double [[TMP0]], ptr [[V]], align 8
+// CHECK-ILP32E-NEXT: call void @llvm.va_end(ptr [[VA]])
+// CHECK-ILP32E-NEXT: [[TMP1:%.*]] = load double, ptr [[V]], align 8
+// CHECK-ILP32E-NEXT: ret double [[TMP1]]
//
double f_va_2(char *fmt, ...) {
__builtin_va_list va;
@@ -133,40 +171,106 @@ double f_va_2(char *fmt, ...) {
// Two "aligned" register pairs.
-// CHECK-LABEL: define dso_local double @f_va_3
-// CHECK-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
-// CHECK-NEXT: entry:
-// CHECK-NEXT: [[FMT_ADDR:%.*]] = alloca ptr, align 4
-// CHECK-NEXT: [[VA:%.*]] = alloca ptr, align 4
-// CHECK-NEXT: [[V:%.*]] = alloca double, align 8
-// CHECK-NEXT: [[W:%.*]] = alloca i32, align 4
-// CHECK-NEXT: [[X:%.*]] = alloca double, align 8
-// CHECK-NEXT: store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
-// CHECK-NEXT: call void @llvm.va_start(ptr [[VA]])
-// CHECK-NEXT: [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
-// CHECK-NEXT: [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
-// CHECK-NEXT: [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
-// CHECK-NEXT: store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
-// CHECK-NEXT: store double [[TMP1]], ptr [[V]], align 8
-// CHECK-NEXT: [[ARGP_CUR1:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT: [[ARGP_NEXT2:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR1]], i32 4
-// CHECK-NEXT: store ptr [[ARGP_NEXT2]], ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP2:%.*]] = load i32, ptr [[ARGP_CUR1]], align 4
-// CHECK-NEXT: store i32 [[TMP2]], ptr [[W]], align 4
-// CHECK-NEXT: [[ARGP_CUR3:%.*]] = load ptr, ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR3]], i32 7
-// CHECK-NEXT: [[ARGP_CUR3_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP3]], i32 -8)
-// CHECK-NEXT: [[ARGP_NEXT4:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR3_ALIGNED]], i32 8
-// CHECK-NEXT: store ptr [[ARGP_NEXT4]], ptr [[VA]], align 4
-// CHECK-NEXT: [[TMP4:%.*]] = load double, ptr [[ARGP_CUR3_ALIGNED]], align 8
-// CHECK-NEXT: store double [[TMP4]], ptr [[X]], align 8
-// CHECK-NEXT: call void @llvm.va_end(ptr [[VA]])
-// CHECK-NEXT: [[TMP5:%.*]] = load double, ptr [[V]], align 8
-// CHECK-NEXT: [[TMP6:%.*]] = load double, ptr [[X]], align 8
-// CHECK-NEXT: [[ADD:%.*]] = fadd double [[TMP5]], [[TMP6]]
-// CHECK-NEXT: ret double [[ADD]]
+// CHECK-ILP32F-LABEL: define dso_local double @f_va_3
+// CHECK-ILP32F-SAME: (ptr noundef [[FMT:%.*]], ...) #[[ATTR0]] {
+// CHECK-ILP32F-NEXT: entry:
+// CHECK-ILP32F-NEXT: [[FMT_ADDR:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT: [[VA:%.*]] = alloca ptr, align 4
+// CHECK-ILP32F-NEXT: [[V:%.*]] = alloca double, align 8
+// CHECK-ILP32F-NEXT: [[W:%.*]] = alloca i32, align 4
+// CHECK-ILP32F-NEXT: [[X:%.*]] = alloca double, align 8
+// CHECK-ILP32F-NEXT: store ptr [[FMT]], ptr [[FMT_ADDR]], align 4
+// CHECK-ILP32F-NEXT: call void @llvm.va_start(ptr [[VA]])
+// CHECK-ILP32F-NEXT: [[ARGP_CUR:%.*]] = load ptr, ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT: [[TMP0:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR]], i32 7
+// CHECK-ILP32F-NEXT: [[ARGP_CUR_ALIGNED:%.*]] = call ptr @llvm.ptrmask.p0.i32(ptr [[TMP0]], i32 -8)
+// CHECK-ILP32F-NEXT: [[ARGP_NEXT:%.*]] = getelementptr inbounds i8, ptr [[ARGP_CUR_ALIGNED]], i32 8
+// CHECK-ILP32F-NEXT: store ptr [[ARGP_NEXT]], ptr [[VA]], align 4
+// CHECK-ILP32F-NEXT: [[TMP1:%.*]] = load double, ptr [[ARGP_CUR_ALIGNED]], align 8
+// CHECK-ILP32F-NEXT: store double [[...
[truncated]
|
What was the last bit of discussion on the phabricator review? I can no longer access it. |
It was basically just waiting for final approval. The agreement was that it is good enough to be merged in as experimental even with unfinalized ABI. I really hope we can get this in before end of this month to be able to use it in rustc. |
The conclusion from the previous review was this was OK to merge. I think I held it up by not responding to a ping (apologies). I've had another scan through and don't see a problem with merging this and considering it experimental once Craig's review comments are addressed. For the psABI, there's a gap here I think as GCC and LLVM need to deviate from what is currently written. I think riscv-non-isa/riscv-elf-psabi-doc#257 tried to partially address this. I wonder if we need a tracking issue in that repo to try to get this properly documented via that PR or otherwise (what do you think @jrtc27 @kito-cheng?). |
Hmmm, RISC-V ISA is growth after riscv-non-isa/riscv-elf-psabi-doc#257 again, I mean...we have zfinx and zdinx, which is also valid combination with rv32e/rv64e, so we may need to revise ilp32e ABI again on the psABI side, but my intention is not to block this PR. Anyway, I think we could track that on psABI repo, and we...already has an old issue to track that: riscv-non-isa/riscv-elf-psabi-doc#269 |
@wangpc-pp did you have interested on helping psABI side? it would be great if you can help since I suspect I don't have enough bandwidth to deal with that soon. |
Yes, I'm glad to. I think what we need to do is to fix some Zdinx issues? :-) And, I think I have to explain the background here: So, yeah, I will try my best to fix the ABI issue, but I think we still need the help of the RISC-V community. I will contact with T-head guys to see if they are interested in this. |
@nemanjai I'm curious if you have an interest / need to support RVE or not? |
To put our 3 cents here, we also need it. We are (well, it's mostly me right now since it's still a prototype) currently building a virtual machine that is based on RV32E (and in the future also RV64E), and, long story short, thanks to the magic of RV32E it achieves execution performance roughly the same as wasmtime (which is a state-of-art WASM VM) while compiling into native code over ~200 times faster and being orders of magnitude simpler (because you can mostly naively translate RV32E machine code 1-to-1 into native AMD64 code as the reduced number of registers makes this barely possible, and the code is still very fast). As far as testing this patch, as a fairly decent test case I've used one of the previous versions to compile DOOM into RV32E and ran it under our VM (in fact, you can play it here it you want), so the patch definitely works. Anyway, people are underestimating RV32E thinking it's only for bottom tier microcontrollers, but this couldn't be further from the truth. It's not perfect, but it's also a really great VM bytecode; dare I say, I think it's a better WASM than WASM for non-Web uses. |
Side note: shouldn't we also update diff --git a/compiler-rt/lib/builtins/riscv/restore.S b/compiler-rt/lib/builtins/riscv/restore.S
index 73f64a920d66..2e185ecae3f7 100644
--- a/compiler-rt/lib/builtins/riscv/restore.S
+++ b/compiler-rt/lib/builtins/riscv/restore.S
@@ -20,6 +20,46 @@
.text
+#if __riscv_abi_rve == 1
+
+#if __riscv_xlen == 32
+ .globl __riscv_restore_2
+ .type __riscv_restore_2,@function
+ .globl __riscv_restore_1
+ .type __riscv_restore_1,@function
+ .globl __riscv_restore_0
+ .type __riscv_restore_0,@function
+__riscv_restore_1:
+__riscv_restore_0:
+ lw s1, 4(sp)
+ lw s0, 8(sp)
+ lw ra, 12(sp)
+ addi sp, sp, 16
+ ret
+#elif __riscv_xlen == 64
+ .globl __riscv_restore_2
+ .type __riscv_restore_2,@function
+__riscv_restore_2:
+ ld s1, 8(sp)
+ addi sp, sp, 16
+ // fallthrough into __riscv_restore_1/0
+
+ .globl __riscv_restore_1
+ .type __riscv_restore_1,@function
+ .globl __riscv_restore_0
+ .type __riscv_restore_0,@function
+__riscv_restore_1:
+__riscv_restore_0:
+ ld s0, 0(sp)
+ ld ra, 8(sp)
+ addi sp, sp, 16
+ ret
+#else
+# error "xlen must be 32 or 64 for save-restore implementation
+#endif
+
+#else
+
#if __riscv_xlen == 32
.globl __riscv_restore_12
@@ -164,3 +204,5 @@ __riscv_restore_0:
#else
# error "xlen must be 32 or 64 for save-restore implementation
#endif
+
+#endif
diff --git a/compiler-rt/lib/builtins/riscv/save.S b/compiler-rt/lib/builtins/riscv/save.S
index 85501aeb4c2e..86b62a369cd0 100644
--- a/compiler-rt/lib/builtins/riscv/save.S
+++ b/compiler-rt/lib/builtins/riscv/save.S
@@ -16,6 +16,54 @@
.text
+#if __riscv_abi_rve == 1
+
+#if __riscv_xlen == 32
+ .globl __riscv_save_2
+ .type __riscv_save_2,@function
+ .globl __riscv_save_1
+ .type __riscv_save_1,@function
+ .globl __riscv_save_0
+ .type __riscv_save_0,@function
+__riscv_save_3:
+__riscv_save_2:
+__riscv_save_1:
+__riscv_save_0:
+ addi sp, sp, -16
+ sw s1, 4(sp)
+ sw s0, 8(sp)
+ sw ra, 12(sp)
+ jr t0
+#elif __riscv_xlen == 64
+ .globl __riscv_save_2
+ .type __riscv_save_2,@function
+__riscv_save_3:
+__riscv_save_2:
+ addi sp, sp, -112
+ li t1, 80
+.Lriscv_save_3_2:
+ sd s1, 88(sp)
+ sd s0, 96(sp)
+ sd ra, 104(sp)
+ add sp, sp, t1
+ jr t0
+
+ .globl __riscv_save_1
+ .type __riscv_save_1,@function
+ .globl __riscv_save_0
+ .type __riscv_save_0,@function
+__riscv_save_1:
+__riscv_save_0:
+ addi sp, sp, -16
+ sd s0, 0(sp)
+ sd ra, 8(sp)
+ jr t0
+#else
+# error "xlen must be 32 or 64 for save-restore implementation
+#endif
+
+#else
+
#if __riscv_xlen == 32
.globl __riscv_save_12
@@ -184,3 +232,5 @@ __riscv_save_0:
#else
# error "xlen must be 32 or 64 for save-restore implementation
#endif
+
+#endif (I don't remember why exactly since I did it a long time ago, but for some reason I do have this patch in my LLVM fork, so it probably was necessary for something.) |
Oh, it may be about stack alignment issue for libcalls. I may have fixed it. |
Yes, I mostly copy-pasted the existing code and removed all of the code dealing with registers not available on RV32E, so having only If I remember correctly I think I did this because otherwise compiling |
Oh, I see. My previous comment was wrong. |
Once this PR is merged, sure, I can make a PR. (With a better patch than my current hacky copy-paste job.) |
I most certainly do. Thank you for alerting me to this PR. |
The RISCVUsage change looks good to me, and it's fantastic that @nemanjai might be able to help push this forward further. I think all the review comments were addressed, so I'd be happy to merge if @topperc is happy his comments were resolved. Though also fine to wait longer if @nemanjai you wanted to review this as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not my intent to hold up approval of this patch. In addition to the minor comments I added, I plan to do some local testing but even if the testing reveals issues, they can be fixed on subsequent commits.
@@ -985,9 +1003,10 @@ void RISCVFrameLowering::determineCalleeSaves(MachineFunction &MF, | |||
}; | |||
|
|||
for (auto Reg : CSRegs) | |||
SavedRegs.set(Reg); | |||
if (Reg < RISCV::X16 || !Subtarget.isRVE()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if we are using ilp32e/lp64e
ABI on a subtarget that isn't RVE? Should these registers be saved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The psABI says:
If used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries.
So I think we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Maybe just a little comment here to demonstrate to the user that this isn't an omission but a conscious decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though it's nearly impossible to have such configuration in real application, I saved x15-x16 in 20ffba3.
Diags.Report(diag::err_invalid_feature_combination) | ||
<< "ILP32E cannot be used with the D ISA extension"; | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it also need to check ilp32e isn't compatible with zcmp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we should!
(Sigh...even RISC-V International has forgotten RVE, as many new ISA extensions don't even think about impacts on RVE 😞).
There are still many problems to fix, I will leave this zcmp issue there and fix it after revising the psABI part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for your persistence on this!
It's not easy to push this forward and thanks a lot! |
This commit includes the necessary changes to clang and LLVM to support codegen of `RVE` and the `ilp32e`/`lp64e` ABIs. The differences between `RVE` and `RVI` are: * `RVE` reduces the integer register count to 16(x0-x16). * The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits. `RVE` can be combined with all current standard extensions. The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are: * Only 6 integer argument registers (rather than 8). * Only 2 callee-saved registers (rather than 12). * A Stack Alignment of 32bits (rather than 128bits). * ilp32e isn't compatible with D ISA extension. If `ilp32e` or `lp64` is used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries. To be compatible with the implementation of ilp32e in GCC, we don't use aligned registers to pass variadic arguments and set stack alignment\ to 4-bytes for types with length of 2*XLEN. FastCC is also supported on RVE, while GHC isn't since there is only one avaiable register. Differential Revision: https://reviews.llvm.org/D70401
As a third-party bystander I just want to say, thank you to everyone involved! And especially @wangpc-pp, a huge thank you for pushing this through! |
This commit includes the necessary changes to clang and LLVM to support codegen of `RVE` and the `ilp32e`/`lp64e` ABIs. The differences between `RVE` and `RVI` are: * `RVE` reduces the integer register count to 16(x0-x16). * The ABI should be `ilp32e` for 32 bits and `lp64e` for 64 bits. `RVE` can be combined with all current standard extensions. The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are: * Only 6 integer argument registers (rather than 8). * Only 2 callee-saved registers (rather than 12). * A Stack Alignment of 32bits (rather than 128bits). * ilp32e isn't compatible with D ISA extension. If `ilp32e` or `lp64` is used with an ISA that has any of the registers x16-x31 and f0-f31, then these registers are considered temporaries. To be compatible with the implementation of ilp32e in GCC, we don't use aligned registers to pass variadic arguments and set stack alignment\ to 4-bytes for types with length of 2*XLEN. FastCC is also supported on RVE, while GHC isn't since there is only one avaiable register. Differential Revision: https://reviews.llvm.org/D70401
Register spills (save/restore) in RISC-V embedded work differently because there are less registers and different stack alignment. [GCC equivalent ](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336) Follow up from #76777. --------- Signed-off-by: xermicus <cyrill@parity.io>
Register spills (save/restore) in RISC-V embedded work differently because there are less registers and different stack alignment. [GCC equivalent ](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336) Follow up from llvm#76777. --------- Signed-off-by: xermicus <cyrill@parity.io> (cherry picked from commit bd32aaa)
Register spills (save/restore) in RISC-V embedded work differently because there are less registers and different stack alignment. [GCC equivalent ](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336) Follow up from llvm#76777. --------- Signed-off-by: xermicus <cyrill@parity.io> (cherry picked from commit bd32aaa)
This commit includes the necessary changes to clang and LLVM to support
codegen of
RVE
and theilp32e
/lp64e
ABIs.The differences between
RVE
andRVI
are:RVE
reduces the integer register count to 16(x0-x16).ilp32e
for 32 bits andlp64e
for 64 bits.RVE
can be combined with all current standard extensions.The central changes in ilp32e/lp64e ABI, compared to ilp32/lp64 are:
If
ilp32e
orlp64
is used with an ISA that has any of the registersx16-x31 and f0-f31, then these registers are considered temporaries.
To be compatible with the implementation of ilp32e in GCC, we don't use
aligned registers to pass variadic arguments and set stack alignment
to 4-bytes for types with length of 2*XLEN.
FastCC is also supported on RVE, while GHC isn't since there is only one
avaiable register.
Differential Revision: https://reviews.llvm.org/D70401