[X86] Extend kCFI with a 3-bit arity indicator #121070

scottconstable · 2024-12-24T21:36:58Z

Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits. This hash is written into each indirect-callable function header by encoding it as the 32-bit immediate operand to a MOVri instruction, e.g.:

__cfi_foo:
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	movl	$199571451, %eax                # hash of foo's type = 0xBE537FB
foo:
        ...

This PR extends x86-based kCFI with a 3-bit arity indicator encoded in the MOVri instruction's register (reg) field as follows:

Arity Indicator	Description	Encoding in reg field
0	0 parameters	EAX
1	1 parameter in RDI	ECX
2	2 parameters in RDI and RSI	EDX
3	3 parameters in RDI, RSI, and RDX	EBX
4	4 parameters in RDI, RSI, RDX, and RCX	ESP
5	5 parameters in RDI, RSI, RDX, RCX, and R8	EBP
6	6 parameters in RDI, RSI, RDX, RCX, R8, and R9	ESI
7	At least one parameter may be passed on the stack	EDI

For example, if foo takes 3 register arguments and no stack arguments then the MOVri instruction in its kCFI header would instead be written as:

	movl	$199571451, %ebx                # hash of foo's type = 0xBE537FB

This PR will benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI function header, then this information is trivial to infer.

Note that there is another existing PR proposal that includes the 3-bit arity within the existing 32-bit immediate field, which introduces different security properties: #117121.

llvmbot · 2024-12-24T21:37:31Z

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-x86

Author: Scott Constable (scottconstable)

Changes

Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits. This hash is written into each indirect-callable function header by encoding it as the 32-bit immediate operand to a MOVri instruction, e.g.:

__cfi_foo:
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	nop
	movl	$199571451, %eax                # hash of foo's type = 0xBE537FB
foo:
        ...

This PR extends x86-based kCFI with a 3-bit arity indicator encoded in the MOVri instruction's register (reg) field as follows:

Arity Indicator	Description	Encoding in reg field
0	0 parameters	EAX
1	1 parameter in RDI	ECX
2	2 parameters in RDI and RSI	EDX
3	3 parameters in RDI, RSI, and RDX	EBX
4	4 parameters in RDI, RSI, RDX, and RCX	ESP
5	5 parameters in RDI, RSI, RDX, RCX, and R8	EBP
6	6 parameters in RDI, RSI, RDX, RCX, R8, and R9	ESI
7	At least one parameter may be passed on the stack	EDI

For example, if foo takes 3 register arguments and no stack arguments then the MOVri instruction in its kCFI header would instead be written as:

	movl	$199571451, %ebx                # hash of foo's type = 0xBE537FB

This PR will benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI function header, then this information is trivial to infer.

Note that there is another existing PR proposal that includes the 3-bit arity within the existing 32-bit immediate field, which introduces different security properties: #117121.

Full diff: https://github.com/llvm/llvm-project/pull/121070.diff

4 Files Affected:

(modified) clang/include/clang/Basic/Features.def (+1)
(modified) llvm/lib/Target/X86/X86AsmPrinter.cpp (+19-1)
(modified) llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll (+2-2)
(modified) llvm/test/CodeGen/X86/kcfi.ll (+60-3)

diff --git a/clang/include/clang/Basic/Features.def b/clang/include/clang/Basic/Features.def
index c82b6d9b5f6c10..dca8f4dc0fbf76 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -254,6 +254,7 @@ FEATURE(is_trivially_constructible, LangOpts.CPlusPlus)
 FEATURE(is_trivially_copyable, LangOpts.CPlusPlus)
 FEATURE(is_union, LangOpts.CPlusPlus)
 FEATURE(kcfi, LangOpts.Sanitize.has(SanitizerKind::KCFI))
+FEATURE(kcfi_x86_arity, LangOpts.Sanitize.has(SanitizerKind::KCFI))
 FEATURE(modules, LangOpts.Modules)
 FEATURE(safe_stack, LangOpts.Sanitize.has(SanitizerKind::SafeStack))
 FEATURE(shadow_call_stack,
diff --git a/llvm/lib/Target/X86/X86AsmPrinter.cpp b/llvm/lib/Target/X86/X86AsmPrinter.cpp
index f01e47b41cf5e4..cb21d9bb5f2879 100644
--- a/llvm/lib/Target/X86/X86AsmPrinter.cpp
+++ b/llvm/lib/Target/X86/X86AsmPrinter.cpp
@@ -181,8 +181,26 @@ void X86AsmPrinter::emitKCFITypeId(const MachineFunction &MF) {
   // Embed the type hash in the X86::MOV32ri instruction to avoid special
   // casing object file parsers.
   EmitKCFITypePadding(MF);
+
+  Register MovReg = X86::EAX;
+  const auto &Triple = MF.getTarget().getTargetTriple();
+  if (Triple.isArch64Bit() && Triple.isOSLinux()) {
+    // Determine the function's arity (i.e., the number of arguments) at the ABI
+    // level by counting the number of parameters that are passed
+    // as registers, such as pointers and 64-bit (or smaller) integers. The
+    // Linux x86-64 ABI allows up to 6 parameters to be passed in GPRs.
+    // Additional parameters or parameters larger than 64 bits may be passed on
+    // the stack, in which case the arity is denoted as 7.
+    const unsigned ArityToRegMap[8] = {X86::EAX, X86::ECX, X86::EDX, X86::EBX,
+                                      X86::ESP, X86::EBP, X86::ESI, X86::EDI};
+    int Arity = MF.getInfo<X86MachineFunctionInfo>()->getArgumentStackSize() > 0
+                    ? 7
+                    : MF.getRegInfo().liveins().size();
+    MovReg = ArityToRegMap[Arity];
+  }
+
   EmitAndCountInstruction(MCInstBuilder(X86::MOV32ri)
-                              .addReg(X86::EAX)
+                              .addReg(MovReg)
                               .addImm(MaskKCFIType(Type->getZExtValue())));
 
   if (MAI->hasDotTypeDotSizeDirective()) {
diff --git a/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll b/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll
index 1b7bd7835e890c..deababc7fd5379 100644
--- a/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll
+++ b/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll
@@ -3,7 +3,7 @@
 ; CHECK:          .p2align 4
 ; CHECK-LABEL:    __cfi_f1:
 ; CHECK-COUNT-11:   nop
-; CHECK-NEXT:       movl $12345678, %eax
+; CHECK-NEXT:       movl $12345678, %ecx
 ; CHECK-LABEL:    .Lcfi_func_end0:
 ; CHECK-NEXT:     .size   __cfi_f1, .Lcfi_func_end0-__cfi_f1
 ; CHECK-LABEL:    f1:
@@ -26,7 +26,7 @@ define void @f2(ptr noundef %x) {
 ; CHECK:          .p2align 4
 ; CHECK-LABEL:    __cfi_f3:
 ; CHECK-NOT:        nop
-; CHECK-NEXT:       movl $12345678, %eax
+; CHECK-NEXT:       movl $12345678, %ecx
 ; CHECK-COUNT-11:   nop
 ; CHECK-LABEL:    f3:
 define void @f3(ptr noundef %x) #0 !kcfi_type !1 {
diff --git a/llvm/test/CodeGen/X86/kcfi.ll b/llvm/test/CodeGen/X86/kcfi.ll
index 059efcc71b0eb8..91d706796b8e99 100644
--- a/llvm/test/CodeGen/X86/kcfi.ll
+++ b/llvm/test/CodeGen/X86/kcfi.ll
@@ -16,7 +16,7 @@
 ; ASM-NEXT:    nop
 ; ASM-NEXT:    nop
 ; ASM-NEXT:    nop
-; ASM-NEXT:    movl $12345678, %eax
+; ASM-NEXT:    movl $12345678, %ecx
 ; ASM-LABEL: .Lcfi_func_end0:
 ; ASM-NEXT:  .size   __cfi_f1, .Lcfi_func_end0-__cfi_f1
 define void @f1(ptr noundef %x) !kcfi_type !1 {
@@ -90,7 +90,7 @@ define void @f4(ptr noundef %x) #0 {
 
 ;; Ensure we emit Value + 1 for unwanted values (e.g. endbr64 == 4196274163).
 ; ASM-LABEL: __cfi_f5:
-; ASM: movl $4196274164, %eax # imm = 0xFA1E0FF4
+; ASM: movl $4196274164, %ecx # imm = 0xFA1E0FF4
 define void @f5(ptr noundef %x) !kcfi_type !2 {
 ; ASM-LABEL: f5:
 ; ASM: movl $98693132, %r10d # imm = 0x5E1F00C
@@ -100,7 +100,7 @@ define void @f5(ptr noundef %x) !kcfi_type !2 {
 
 ;; Ensure we emit Value + 1 for unwanted values (e.g. -endbr64 == 98693133).
 ; ASM-LABEL: __cfi_f6:
-; ASM: movl $98693134, %eax # imm = 0x5E1F00E
+; ASM: movl $98693134, %ecx # imm = 0x5E1F00E
 define void @f6(ptr noundef %x) !kcfi_type !3 {
 ; ASM-LABEL: f6:
 ; ASM: movl $4196274162, %r10d # imm = 0xFA1E0FF2
@@ -138,6 +138,60 @@ define void @f8() {
   ret void
 }
 
+%struct.S9 = type { [10 x i64] }
+
+;; Ensure that functions with large (e.g., greater than 8 bytes) arguments passed on the stack are assigned arity=7
+; ASM-LABEL: __cfi_f9:
+; ASM: movl	$199571451, %edi                # imm = 0xBE537FB
+define dso_local void @f9(ptr noundef byval(%struct.S9) align 8 %s) !kcfi_type !4  {
+entry:
+  ret void
+}
+
+;; Ensure that functions with fewer than 7 register arguments and no stack arguments are assigned arity<7
+; ASM-LABEL: __cfi_f10:
+; ASM: movl	$1046421190, %esi               # imm = 0x3E5F1EC6
+define dso_local void @f10(i32 noundef %v1, i32 noundef %v2, i32 noundef %v3, i32 noundef %v4, i32 noundef %v5, i32 noundef %v6) #0 !kcfi_type !5 {
+entry:
+  %v1.addr = alloca i32, align 4
+  %v2.addr = alloca i32, align 4
+  %v3.addr = alloca i32, align 4
+  %v4.addr = alloca i32, align 4
+  %v5.addr = alloca i32, align 4
+  %v6.addr = alloca i32, align 4
+  store i32 %v1, ptr %v1.addr, align 4
+  store i32 %v2, ptr %v2.addr, align 4
+  store i32 %v3, ptr %v3.addr, align 4
+  store i32 %v4, ptr %v4.addr, align 4
+  store i32 %v5, ptr %v5.addr, align 4
+  store i32 %v6, ptr %v6.addr, align 4
+  ret void
+}
+
+;; Ensure that functions with greater than 7 register arguments and no stack arguments are assigned arity=7
+; ASM-LABEL: __cfi_f11:
+; ASM: movl	$1342488295, %edi               # imm = 0x5004BEE7
+define dso_local void @f11(i32 noundef %v1, i32 noundef %v2, i32 noundef %v3, i32 noundef %v4, i32 noundef %v5, i32 noundef %v6, i32 noundef %v7, i32 noundef %v8) #0 !kcfi_type !6 {
+entry:
+  %v1.addr = alloca i32, align 4
+  %v2.addr = alloca i32, align 4
+  %v3.addr = alloca i32, align 4
+  %v4.addr = alloca i32, align 4
+  %v5.addr = alloca i32, align 4
+  %v6.addr = alloca i32, align 4
+  %v7.addr = alloca i32, align 4
+  %v8.addr = alloca i32, align 4
+  store i32 %v1, ptr %v1.addr, align 4
+  store i32 %v2, ptr %v2.addr, align 4
+  store i32 %v3, ptr %v3.addr, align 4
+  store i32 %v4, ptr %v4.addr, align 4
+  store i32 %v5, ptr %v5.addr, align 4
+  store i32 %v6, ptr %v6.addr, align 4
+  store i32 %v7, ptr %v7.addr, align 4
+  store i32 %v8, ptr %v8.addr, align 4
+  ret void
+}
+
 attributes #0 = { "target-features"="+retpoline-indirect-branches,+retpoline-indirect-calls" }
 
 !llvm.module.flags = !{!0}
@@ -145,3 +199,6 @@ attributes #0 = { "target-features"="+retpoline-indirect-branches,+retpoline-ind
 !1 = !{i32 12345678}
 !2 = !{i32 4196274163}
 !3 = !{i32 98693133}
+!4 = !{i32 199571451}
+!5 = !{i32 1046421190}
+!6 = !{i32 1342488295}

scottconstable · 2024-12-24T21:46:21Z

llvm/lib/Target/X86/X86AsmPrinter.cpp

+
+  Register MovReg = X86::EAX;
+  const auto &Triple = MF.getTarget().getTargetTriple();
+  if (Triple.isArch64Bit() && Triple.isOSLinux()) {


Is this check necessary? I notice that all of the x86-based kCFI tests use the x86_64-unknown-linux-gnu triple. If 64-bit Linux is the only supported x86 subtarget then I think this check is redundant. But I'm not 100% certain.

You can use assert to guarantee it won't be broken in the future.

@phoebewang Good suggestion! I updated the PR to use an assert.

scottconstable · 2024-12-24T22:04:24Z

clang/include/clang/Basic/Features.def

@@ -254,6 +254,7 @@ FEATURE(is_trivially_constructible, LangOpts.CPlusPlus)
 FEATURE(is_trivially_copyable, LangOpts.CPlusPlus)
 FEATURE(is_union, LangOpts.CPlusPlus)
 FEATURE(kcfi, LangOpts.Sanitize.has(SanitizerKind::KCFI))
+FEATURE(kcfi_x86_arity, LangOpts.Sanitize.has(SanitizerKind::KCFI))


Peter Zijlstra, one of the Linux FineIBT maintainers, suggested to me that __has_feature() might be the best interface for Kconfig to discover that clang supports this new kCFI arity extension. If this feature were to be added to rustc, then I think rustc would need a similar interface to expose the feature.

For Kconfig, most features are discovered via testing a flag (e.g. -fsanitize=kcfi), but if the kernel needs __has_feature() anyway here, then that can also be used, yeah. For rustc, we can do similar tests too (and since we only support one kind of Rust compiler so far, versions checks can be a fallback option too).

I still think it would be best to support both hash types via flags to avoid limiting which compiler versions work together.

@Darksonn This PR differs from #117121 in that this PR does not change the hash type. Instead, this PR uses the reg field in the MOVri32 instruction to encode some additional meta data about the function's arity.

Encoding arity information in the register argument solves the hash compatibility issue, so I definitely prefer this approach over #117121. As you noted, the FineIBT patching code still needs to know whether both compilers emit arity information, and while using __has_feature in C should be sufficient, how would the rustc feature test look like? A quick search didn't find any Kconfig examples for this. I assume we would need a Kconfig variable to pass this information to C code, but perhaps there are better options?

phoebewang · 2024-12-25T00:41:55Z

llvm/lib/Target/X86/X86AsmPrinter.cpp

+    // Additional parameters or parameters larger than 64 bits may be passed on
+    // the stack, in which case the arity is denoted as 7.
+    const unsigned ArityToRegMap[8] = {X86::EAX, X86::ECX, X86::EDX, X86::EBX,
+                                       X86::ESP, X86::EBP, X86::ESI, X86::EDI};


Astonished ESP can be used to pass the hash.

Hi @phoebewang, based on my understanding of kCFI, the MOVri instruction emitted by the compiler in each function header will never be executed. The purpose of this instruction is simply to insert the function's kCFI type ID into the header so that it can be verified at the call site before making the call. For example, this is how a kCFI-enforced indirect call looks when the kernel is configured with kCFI:

0000000000000010 <foo>: # foo loads the target address into rax ... 25: mov $0x70e5d6ba,%r10d # 0x70e5d6ba is the kCFI type ID that must be matched at the target 2b: add -0x4(%rax),%r10d # -0x4(%rax) refers to the immediate operand in the `MOVri` instruction, which holds the kCFI type ID 2f: je 33 <foo+0x23> # If the two type IDs match, then jump to the call; otherwise #UD 31: ud2 33: call *%rax # Note that this will jump directly to the target function, and will not execute the `MOVri`

Does this address your concern?

phoebewang · 2024-12-25T00:52:14Z

And you cannot use ESI when it's used to pass arguments. Suggested registers:

Arity Indicator	Description	Encoding in reg field
0	0 parameters	EDI
1	1 parameter in RDI	ESI
2	2 parameters in RDI and RSI	EDX
3	3 parameters in RDI, RSI, and RDX	ECX
4	4 parameters in RDI, RSI, RDX, and RCX	R8D
5	5 parameters in RDI, RSI, RDX, RCX, and R8	R9D
6	6 parameters in RDI, RSI, RDX, RCX, R8, and R9	R10D
7	At least one parameter may be passed on the stack	EAX

scottconstable · 2024-12-26T19:26:11Z

And you cannot use ESI when it's used to pass arguments.

@phoebewang I think that my #121070 (comment) also applies to this concern.

scottconstable · 2025-01-09T22:11:55Z

@sirmc @samitolvanen @Darksonn @lvwr @maurer @rcvalle @MaskRay A gentle reminder to please review this PR.

MaskRay · 2025-01-10T17:36:13Z

If hashing is changed, consider replacing xxhash64 with xxh3+_64bits

scottconstable · 2025-01-10T18:09:12Z

If hashing is changed, consider replacing xxhash64 with xxh3+_64bits

@MaskRay This PR does not change the hashing scheme at all.

MaskRay · 2025-01-11T07:26:11Z

maurer: I think their point is that even if you are not changing the hash scheme, you are proposing breaking compatibility of the identifier with existing code. Since we don't want to do this many times, if we are breaking compatibility with existing code, they would like to batch it with another breaking update so that it doesn't need to be done again.

(This isn't me reviewing this PR, just trying to clear up some confusion.)

This:) Thanks for the explanation.

After my pending lld/MachO change, kcfi will be the only user of the legacy xxHash64. I want to remove xxHash64 from llvm-project.

scottconstable · 2025-01-13T21:56:49Z

maurer: I think their point is that even if you are not changing the hash scheme, you are proposing breaking compatibility of the identifier with existing code. Since we don't want to do this many times, if we are breaking compatibility with existing code, they would like to batch it with another breaking update so that it doesn't need to be done again.
(This isn't me reviewing this PR, just trying to clear up some confusion.)

This:) Thanks for the explanation.

After my pending lld/MachO change, kcfi will be the only user of the legacy xxHash64. I want to remove xxHash64 from llvm-project.

Going back to @maurer 's comment about compatibility, I do not believe that this PR would break any compatibility with other compilers, or with other compiler versions. If a kernel is built with some of its functions having the 3-bit arity extension and the rest of its functions not having the 3-bit arity extension (e.g., because the user's clang compiler has the extension, and their rust compiler does not), then the kernel will behave the same as it would if all functions or no functions have the 3-bit arity extension. If the kernel is hot-patched with the register hardening scheme in https://lkml.org/lkml/2024/9/27/982, then only the functions that have the 3-bit arity extension will benefit from the register hardening scheme.

scottconstable · 2025-01-22T06:50:02Z

@sirmc @samitolvanen @Darksonn @lvwr @maurer @rcvalle A gentle reminder to please review this PR.

sirmc · 2025-01-22T08:19:17Z

I had a quick look (sorry for not getting to this earlier, was traveling around Vietnam for a few weeks). Looks good to me. Especially the trick of encoding the arity into the MOV32ri register seems very neat from a compatibility standpoint.

So if I understand this correctly, the only compatibility conflict this introduces is with the kernel's handling of rewriting the kCFI function prologue, which can be checked with __has_feature. Enabling this arity hardening in Rust, if desired, seems straight-forward to me.

My only concern for the current PR was that the change might be incompatible with the handling on the kernel side (i.e., that the kernel's rewrite code assumed EAX rather than any other value in the reg field), which could fail older kernel builds with a newer toolchain. But after a quick look it seems like the decoding doesn't create conflict (see https://github.com/torvalds/linux/blob/c4b9570cfb63501638db720f3bee9f6dfd044b82/arch/x86/kernel/alternative.c#L1112). And AFAIK there's no other users of kCFI besides the Linux kernel (but please correct me if I'm wrong). I do see some initial work on supporting kCFI in FreeBSD https://reviews.freebsd.org/D46193, but this also seems compatible to me too.

So in short, all seems ok to me.

samitolvanen · 2025-01-22T16:27:06Z

A gentle reminder to please review this PR.

Overall this looks fine to me, but I was hoping to get an answer to my question about how the Kconfig detection is actually going to be implemented: #121070 (comment)

samitolvanen · 2025-01-22T16:33:35Z

And AFAIK there's no other users of kCFI besides the Linux kernel (but please correct me if I'm wrong).

There are firmware projects that use KCFI, but AFAIK none of them run on x86. I'm also fairly certain that only Linux performs runtime patching based on the code sequence we emit, so I'm not particularly concerned about breaking other KCFI users with this change.

scottconstable · 2025-01-28T06:34:40Z

My only concern for the current PR was that the change might be incompatible with the handling on the kernel side (i.e., that the kernel's rewrite code assumed EAX rather than any other value in the reg field), which could fail older kernel builds with a newer toolchain.

@sirmc I shared this concern with Peter Zijlstra (who authored the Linux patches that will use this PR) and here is his response: "This is correct; decode_preamble_hash() currently assumes the 0xb8 opcode. This is of course trivial to fix. Also, I think we'll need a compiler flag to enable this option (see next point), and since old kernel on new compiler will not set this flag, no problem."

Overall this looks fine to me, but I was hoping to get an answer to my question about how the Kconfig detection is actually going to be implemented: #121070 (comment)

@samitolvanen I also shared this concern with Peter, and he said: "Yeah, so Kconfig will have to detect the compiler option is available for both C and Rust, and only select to use it when all (selected) languages support it. So if Rust is not enabled, we don't care if it supports the feature, and we can use it if clang supports it. But if Rust is enabled, both compilers need the feature present."

I just updated the PR with a new clang flag -fsanitize-kcfi-arity that enables the new behavior. @Darksonn @lvwr @maurer @rcvalle your review and feedback would be appreciated.

Darksonn · 2025-01-28T07:38:22Z

If we can control it with a flag, then that sounds good to me.

…et's arity (i.e., the number of live-in registers) in the function's __cfi header.

samitolvanen

Thank you for adding the command line flag. I agree that this should be sufficient to address compatibility concerns.

phoebewang

LGTM.

scottconstable · 2025-02-04T17:15:40Z

@lvwr @maurer @rcvalle A gentle reminder to please review this PR.

phoebewang · 2025-02-05T01:22:30Z

@lvwr @maurer @rcvalle A gentle reminder to please review this PR.

@scottconstable You don't need explicit approvals from all reviewers. Let's wait for 24 hours and land it if no objections.

Kernel Control Flow Integrity (kCFI) is a feature that hardens indirect calls by comparing a 32-bit hash of the function pointer's type against a hash of the target function's type. If the hashes do not match, the kernel may panic (or log the hash check failure, depending on the kernel's configuration). These hashes are computed at compile time by applying the xxHash64 algorithm to each mangled canonical function (or function pointer) type, then truncating the result to 32 bits. This hash is written into each indirect-callable function header by encoding it as the 32-bit immediate operand to a `MOVri` instruction, e.g.: ``` __cfi_foo: nop nop nop nop nop nop nop nop nop nop nop movl $199571451, %eax # hash of foo's type = 0xBE537FB foo: ... ``` This PR extends x86-based kCFI with a 3-bit arity indicator encoded in the `MOVri` instruction's register (reg) field as follows: | Arity Indicator | Description | Encoding in reg field | | --------------- | --------------- | --------------- | | 0 | 0 parameters | EAX | | 1 | 1 parameter in RDI | ECX | | 2 | 2 parameters in RDI and RSI | EDX | | 3 | 3 parameters in RDI, RSI, and RDX | EBX | | 4 | 4 parameters in RDI, RSI, RDX, and RCX | ESP | | 5 | 5 parameters in RDI, RSI, RDX, RCX, and R8 | EBP | | 6 | 6 parameters in RDI, RSI, RDX, RCX, R8, and R9 | ESI | | 7 | At least one parameter may be passed on the stack | EDI | For example, if `foo` takes 3 register arguments and no stack arguments then the `MOVri` instruction in its kCFI header would instead be written as: ``` movl $199571451, %ebx # hash of foo's type = 0xBE537FB ``` This PR will benefit other CFI approaches that build on kCFI, such as FineIBT. For example, this proposed enhancement to FineIBT must be able to infer (at kernel init time) which registers are live at an indirect call target: https://lkml.org/lkml/2024/9/27/982. If the arity bits are available in the kCFI function header, then this information is trivial to infer. Note that there is another existing PR proposal that includes the 3-bit arity within the existing 32-bit immediate field, which introduces different security properties: llvm#117121.

Adds KCFI arity indicator support to the Rust compiler (see rust-lang#138311 llvm/llvm-project#121070 and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).

Adds KCFI arity indicator support to the Rust compiler (see rust-lang#138311, llvm/llvm-project#121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).

KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see rust-lang#138311, llvm/llvm-project#121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).

Rollup merge of rust-lang#138368 - rcvalle:rust-kcfi-arity, r=davidtwco KCFI: Add KCFI arity indicator support Adds KCFI arity indicator support to the Rust compiler (see rust-lang#138311, llvm/llvm-project#121070, and https://lore.kernel.org/lkml/CANiq72=3ghFxy8E=AU9p+0imFxKr5iU3sd0hVUXed5BA+KjdNQ@mail.gmail.com/).

llvmbot added clang Clang issues not falling into any other category backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Dec 24, 2024

scottconstable force-pushed the x86-kcfi-arity branch from 81ff927 to 70f1743 Compare December 24, 2024 21:38

scottconstable mentioned this pull request Dec 24, 2024

[X86] Enhance kCFI type IDs with a 3-bit arity indicator. #117121

Open

scottconstable commented Dec 24, 2024

View reviewed changes

phoebewang reviewed Dec 25, 2024

View reviewed changes

ojeda mentioned this pull request Dec 26, 2024

LLVM and Clang wanted features & bugfixes Rust-for-Linux/linux#1132

Open

3 tasks

scottconstable force-pushed the x86-kcfi-arity branch from 70f1743 to 73b6f63 Compare December 26, 2024 19:33

MaskRay requested a review from samitolvanen January 10, 2025 17:35

scottconstable force-pushed the x86-kcfi-arity branch from 73b6f63 to 4fb9709 Compare January 28, 2025 06:25

llvmbot added clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jan 28, 2025

scottconstable force-pushed the x86-kcfi-arity branch from 4fb9709 to 54296a3 Compare January 28, 2025 17:44

Implement a new kcfi_arity feature that encodes an indirect call targ…

54296a3

…et's arity (i.e., the number of live-in registers) in the function's __cfi header.

samitolvanen approved these changes Feb 3, 2025

View reviewed changes

scottconstable requested review from phoebewang, ojeda and Darksonn February 3, 2025 20:23

phoebewang approved these changes Feb 4, 2025

View reviewed changes

phoebewang merged commit e223485 into llvm:main Feb 6, 2025
9 checks passed

ojeda mentioned this pull request Mar 11, 2025

KCFI: Add KCFI arity indicator support rust-lang/rust#138368

Merged

[X86] Extend kCFI with a 3-bit arity indicator #121070

[X86] Extend kCFI with a 3-bit arity indicator #121070

Uh oh!

Conversation

scottconstable commented Dec 24, 2024

Uh oh!

llvmbot commented Dec 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

phoebewang commented Dec 25, 2024

Uh oh!

scottconstable commented Dec 26, 2024

Uh oh!

scottconstable commented Jan 9, 2025

Uh oh!

MaskRay commented Jan 10, 2025

Uh oh!

scottconstable commented Jan 10, 2025

Uh oh!

MaskRay commented Jan 11, 2025

Uh oh!

scottconstable commented Jan 13, 2025

Uh oh!

scottconstable commented Jan 22, 2025

Uh oh!

sirmc commented Jan 22, 2025

Uh oh!

samitolvanen commented Jan 22, 2025

Uh oh!

samitolvanen commented Jan 22, 2025

Uh oh!

scottconstable commented Jan 28, 2025

Uh oh!

Darksonn commented Jan 28, 2025

Uh oh!

samitolvanen left a comment

Choose a reason for hiding this comment

Uh oh!

phoebewang left a comment

Choose a reason for hiding this comment

Uh oh!

scottconstable commented Feb 4, 2025

Uh oh!

phoebewang commented Feb 5, 2025

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Dec 24, 2024 •

edited

Loading