[ARM64EC] Crash related to '/guard:cf' when protecting a function implemented in assembly. #165504

@coneco-cy

I used clang-cl and lld-link to compile a simple program, but it crashed due to the /guard:cf option. Could anyone provide advice on resolving this issue? The program consists of only two files: arm64ec_test.cpp and asm_func.cc.

arm64ec_test.cpp:

#include <iostream>

extern "C" void PushAllRegistersAndIterateStack(int*, int*,
                                                int*);
int main() {
    int a = 1;
    PushAllRegistersAndIterateStack(&a, &a, &a);
    std::cout << "hello world\n";
    return 0;
}

asm_func.cc:

// Copyright 2020 the V8 project authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

// Push all callee-saved registers to get them on the stack for conservative
// stack scanning.
//
// See asm/x64/push_registers_clang.cc for why the function is not generated
// using clang.
//
// Do not depend on V8_TARGET_OS_* defines as some embedders may override the
// GN toolchain (e.g. ChromeOS) and not provide them.

// We maintain 16-byte alignment.
//
// Calling convention source:
// https://en.wikipedia.org/wiki/Calling_convention#ARM_(A64)

asm(
#if defined(__APPLE__)
    ".globl _PushAllRegistersAndIterateStack            \n"
    ".private_extern _PushAllRegistersAndIterateStack   \n"
    ".p2align 2                                         \n"
    "_PushAllRegistersAndIterateStack:                  \n"
#else  // !defined(__APPLE__)
    ".globl PushAllRegistersAndIterateStack             \n"
#if !defined(_WIN64)
    ".type PushAllRegistersAndIterateStack, %function   \n"
    ".hidden PushAllRegistersAndIterateStack            \n"
#endif  // !defined(_WIN64)
    ".p2align 2                                         \n"
    "PushAllRegistersAndIterateStack:                   \n"
#endif  // !defined(__APPLE__)
    // x19-x29 are callee-saved.
    "  stp x19, x20, [sp, #-16]!                        \n"
    "  stp x21, x22, [sp, #-16]!                        \n"
    "  stp x23, x24, [sp, #-16]!                        \n"
    "  stp x25, x26, [sp, #-16]!                        \n"
    "  stp x27, x28, [sp, #-16]!                        \n"
#ifdef V8_ENABLE_CONTROL_FLOW_INTEGRITY
    // Sign return address.
    "  paciasp                                          \n"
#endif
    "  stp fp, lr,   [sp, #-16]!                        \n"
    // Maintain frame pointer.
    "  mov fp, sp                                       \n"
    // Pass 1st parameter (x0) unchanged (Stack*).
    // Pass 2nd parameter (x1) unchanged (StackVisitor*).
    // Save 3rd parameter (x2; IterateStackCallback)
    "  mov x7, x2                                       \n"
    // Pass 3rd parameter as sp (stack pointer).
    "  mov x2, sp                                       \n"
    "  blr x7                                           \n"
    // Load return address and frame pointer.
    "  ldp fp, lr, [sp], #16                            \n"
#ifdef V8_ENABLE_CONTROL_FLOW_INTEGRITY
    // Authenticate return address.
    "  autiasp                                          \n"
#endif
    // Drop all callee-saved registers.
    "  add sp, sp, #80                                  \n"
    "  ret                                              \n");

The arm64ec_test.cpp file contains only a main function, which calls the PushAllRegistersAndIterateStack function defined in asm_func.cc. When Windows' Control Flow Guard (CFG) feature is enabled, it constructs a bitmap in which each bit corresponds to an 8-byte code region. If a bit is set to 1, the corresponding 8-byte code region is a valid target for indirect calls; otherwise, it is not.
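
To make the bitmap description concrete, here is a minimal C++ sketch of the lookup as described in this report. It is only a simplified model: the class name `CfgBitmapModel` and its methods are hypothetical, and the real Windows bitmap layout differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified model of the CFG bitmap as described in this issue: one bit per
// 8-byte code region. Hypothetical names; not the actual Windows layout.
class CfgBitmapModel {
 public:
  // `base` is the start of the modeled code range, `size` its length in bytes.
  CfgBitmapModel(std::uint64_t base, std::size_t size)
      : base_(base), bits_((size + kRegionSize - 1) / kRegionSize, false) {}

  // Mark the 8-byte region containing `address` as a valid indirect-call target.
  void MarkValid(std::uint64_t address) { bits_[IndexOf(address)] = true; }

  // Return true if the bit for the region containing `address` is set.
  bool IsValidTarget(std::uint64_t address) const {
    return bits_[IndexOf(address)];
  }

 private:
  static constexpr std::size_t kRegionSize = 8;

  // Map an address to the index of its 8-byte region within the model.
  std::size_t IndexOf(std::uint64_t address) const {
    assert(address >= base_);
    std::size_t index =
        static_cast<std::size_t>((address - base_) / kRegionSize);
    assert(index < bits_.size());
    return index;
  }

  std::uint64_t base_;
  std::vector<bool> bits_;
};
```

In this model, a function whose entry address was never passed to `MarkValid` fails `IsValidTarget`, which corresponds to the situation reported here for PushAllRegistersAndIterateStack.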

When I build the simple program with the /guard:cf option enabled, the bit corresponding to the 8-byte code region containing the first instruction of the PushAllRegistersAndIterateStack function is set to 0. This marks PushAllRegistersAndIterateStack as an invalid target for indirect calls, and the program crashes. The build commands and the crash stack follow.

clang-cl command:

"<root_path>\clang+llvm-21.1.3-x86_64-pc-windows-msvc\clang+llvm-21.1.3-x86_64-pc-windows-msvc\bin\clang-cl.exe" /c /nologo /Zi /D "NDEBUG" /arm64EC /guard:cf arm64ec_test.cpp asm_func.cc

lld-link command:

"<root_path>\clang+llvm-21.1.3-x86_64-pc-windows-msvc\clang+llvm-21.1.3-x86_64-pc-windows-msvc\bin\lld-link.exe" /DEBUG /MACHINE:ARM64EC /guard:cf arm64ec_test.obj asm_func.obj

crash stack:

(2378.6f9c): Security check failure or stack buffer overrun - code c0000409 (!!! second chance !!!)
Subcode: 0xa FAST_FAIL_GUARD_ICALL_CHECK_FAILURE 
ntdll!RtlFailFast2:
00007ffe`647e4eb0 d43e0060 brk         #0xF003
0:000:ARM64EC> k
Arch   Child-SP          RetAddr               Call Site
00  ARM64EC 00000094`8f59fbc0 00007ffe`648be358     ntdll!RtlFailFast2
01  ARM64EC 00000094`8f59fbc0 00007ffe`64946330     ntdll!#RtlpHandleInvalidUserCallTarget+0x78
02  ARM64EC 00000094`8f59fbe0 00007ff7`cf440854     ntdll!#LdrpHandleInvalidUserCallTargetEC+0x40
*** WARNING: Unable to verify checksum for arm64ec_test.exe
03  ARM64EC 00000094`8f59fcc0 00007ff7`cf37103c     arm64ec_test!#PushAllRegistersAndIterateStack+0x20
04  ARM64EC 00000094`8f59fcd0 00007ff7`cf3afc58     arm64ec_test!main+0x38
05  ARM64EC (Inline Function) --------`--------     arm64ec_test!invoke_main+0x24 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 78] 
06  ARM64EC 00000094`8f59fd00 00007ff7`cf3af9b4     arm64ec_test!__scrt_common_main_seh+0x130 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288] 
07  ARM64EC (Inline Function) --------`--------     arm64ec_test!__scrt_common_main+0x8 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 330] 
08  ARM64EC 00000094`8f59fd40 00007ffe`633015e8     arm64ec_test!mainCRTStartup+0x14 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp @ 16] 
09  ARM64EC 00000094`8f59fd50 00007ffe`6488c120     KERNEL32!#BaseThreadInitThunk+0x48
0a  ARM64EC 00000094`8f59fd60 00000000`00000000     ntdll!#RtlUserThreadStart+0x70

In addition, the clang-cl and lld-link build uses the third call checker thunk (I think it's __os_arm64x_check_icall_cfg) to validate the call target against CFG:

0:000:ARM64EC> ub 00007ff7`cf440854
arm64ec_test!#PushAllRegistersAndIterateStack:
00007ff7`cf440834 f81f0ffe str         lr,[sp,#-0x10]!
00007ff7`cf440838 d00000e8 adrp        x8,arm64ec_test!std::time_put<char,std::ostreambuf_iterator<char,std::char_traits<char> > >::`RTTI Complete Object Locator'+0x20 (00007ff7`cf45e000)
00007ff7`cf44083c f9477108 ldr         x8,[x8,#0xEE0]
00007ff7`cf440840 90fff9ab adrp        x11,arm64ec_test!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::_Tidy_deallocate+0xa8 (00007ff7`cf374000)
00007ff7`cf440844 9107a16b add         x11,x11,#0x1E8
00007ff7`cf440848 9000000a adrp        x10,arm64ec_test!#GetTimeZoneInformation+0x4 (00007ff7`cf440000)
00007ff7`cf44084c 9120414a add         x10,x10,#0x810
00007ff7`cf440850 d63f0100 blr         x8
0:000:ARM64EC> dq 00007ff7`cf45e000+EE0-10
00007ff7`cf45eed0  00007ffe`64aa23c0 00007ffe`64aa23c0
00007ff7`cf45eee0  00007ffe`64aa22c0 00007ffe`647f5240
00007ff7`cf45eef0  00007ffe`647f6780 00007ffe`634dba90
00007ff7`cf45ef00  00000000`00000000 00000000`00000000
00007ff7`cf45ef10  00000000`00000000 00000000`00000000
00007ff7`cf45ef20  00000000`00000000 00000000`00000000
00007ff7`cf45ef30  00007ffe`64aa3140 00007ffe`64aa3140
00007ff7`cf45ef40  00007ffe`64aa3040 00007ffe`64aa3040

I also tested the MSVC toolset, and the program runs without crashing because it uses only the first call checker thunk.

build command:

<msvc_toolset_path>\cl.exe /c /nologo /Zi /D "NDEBUG" /arm64EC /guard:cf arm64ec_test.cpp
<msvc_toolset_path>\armasm64.exe -machine ARM64EC /nologo asm_func.S
<msvc_toolset_path>\link.exe /DEBUG /MACHINE:ARM64EC /guard:cf arm64ec_test.obj asm_func.obj

asm_func.S:

; Copyright 2020 the V8 project authors. All rights reserved.
; Use of this source code is governed by a BSD-style license that can be
; found in the LICENSE file.

; This file is exactly the same as push_registers_asm.cc, just formatted for
; the Microsoft Arm Assembler.

    AREA |.text|, CODE, ALIGN=4, READONLY
    EXPORT PushAllRegistersAndIterateStack
PushAllRegistersAndIterateStack
    ; x19-x29 are callee-saved
    STP x19, x20, [sp, #-16]!
    STP x21, x22, [sp, #-16]!
    STP x23, x24, [sp, #-16]!
    STP x25, x26, [sp, #-16]!
    STP x27, x28, [sp, #-16]!
    STP fp, lr, [sp, #-16]!
    ; Maintain frame pointer
    MOV fp, sp
    ; Pass 1st parameter (x0) unchanged (Stack*).
    ; Pass 2nd parameter (x1) unchanged (StackVisitor*).
    ; Save 3rd parameter (x2; IterateStackCallback)
    MOV x7, x2
    ; Pass 3rd parameter as sp (stack pointer)
    MOV x2, sp
    BLR x7
    ; Load return address
    LDR lr, [sp, #8]
    ; Restore frame pointer and pop all callee-saved registers.
    LDR fp, [sp], #96
    RET
    END

debugger output (MSVC build):

0:000:ARM64EC> x arm64ec_test!*pushall*
*** WARNING: Unable to verify checksum for arm64ec_test.exe
00007ff7`3417ac48 arm64ec_test!PushAllRegistersAndIterateStack$exit_thunk (void)
00007ff7`34091000 arm64ec_test!PushAllRegistersAndIterateStack (PushAllRegistersAndIterateStack)
0:000:ARM64EC> uf 00007ff7`3417ac48
Flow analysis was incomplete, some code may be missing
arm64ec_test!PushAllRegistersAndIterateStack$exit_thunk:
00007ff7`3417ac48 a9bf7bfd stp         fp,lr,[sp,#-0x10]!
00007ff7`3417ac4c b0000069 adrp        x9,arm64ec_test!__os_arm64x_dispatch_call_no_redirect (00007ff7`34187000)
00007ff7`3417ac50 f9400929 ldr         x9,[x9,#0x10]
00007ff7`3417ac54 b000000a adrp        x10,arm64ec_test!$ientry_thunk$cdecl$i8$i8m1i8+0x48 (00007ff7`3417b000)
00007ff7`3417ac58 910fc14a add         x10,x10,#0x3F0
00007ff7`3417ac5c f0fff8ab adrp        x11,arm64ec_test!PushAllRegistersAndIterateStack (00007ff7`34091000)
00007ff7`3417ac60 9100016b add         x11,x11,#0
00007ff7`3417ac64 d63f0120 blr         x9
00007ff7`3417ac68 d503201f nop
00007ff7`3417ac6c a8c17bfd ldp         fp,lr,[sp],#0x10
00007ff7`3417ac70 d61f0160 br          x11
0:000:ARM64EC> dq 00007ff7`34187000
00007ff7`34187000  00007ffe`634db6f4 00007ffe`634db478
00007ff7`34187010  00007ffe`64aa23c0 00007ffe`64aa23c0

According to the Microsoft Learn article "Understanding Arm64EC ABI and assembly code", the third call checker thunk appears to be __os_arm64x_check_icall_cfg, while the first call checker thunk is __os_arm64x_check_icall. When a function is implemented in assembly, the bit corresponding to the 8-byte code region containing its first instruction (here, PushAllRegistersAndIterateStack) is always set to 0, regardless of whether the toolset is clang-cl or MSVC. As a result, the CFG check fails whenever __os_arm64x_check_icall_cfg is used, as in the clang-cl and lld-link build.
