[CIR] Autogenerate most of function format #1833

xlauko · 2025-08-18T17:40:25Z

No description provided.

There are some subtleties here. This is the code in OG: ```cpp // note: this is different from default ABI if (!RetTy->isScalarType()) return ABIArgInfo::getDirect(); ``` which says we should return structs directly. It's correct, has have the same behaviour as `nvcc`, and it obeys the PTX ABI as well. The comment dates back to 2013 (see [this commit](llvm/llvm-project@f9329ff) -- it didn't provide any explanation either), so I believe it's outdated. I didn't include this comment in the PR.

…1486) The pattern `call {{.*}} i32` mismatches `call i32` due to double spaces surrounding `{{.*}}`. This patch removes the first space to fix the failure.

This PR resolves an assertion failure in `CIRGenTypes::isFuncParamTypeConvertible`, which is involved when trying to emit a vtable entry to a virtual function whose type includes a pointer-to-member-function.

Lower neon vabsd_s64

…1431) Implements `::verify` for operations cir.atomic.xchg and cir.atomic.cmp_xchg I believe the existing regression tests don't get to the CIR level type check failure and I was not able to implement a case that does. Most attempts of reproducing cir.atomic.xchg type check failure were along the lines of: ``` int a; long long b,c; __atomic_exchange(&a, &b, &c, memory_order_seq_cst); ``` And they seem to never trigger the failure on `::verify` because they fail earlier in function parameter checking: ``` exmp.cpp:7:27: error: cannot initialize a parameter of type 'int *' with an rvalue of type 'long long *' 7 | __atomic_exchange(&a, &b, &c, memory_order_seq_cst); | ^~ ``` Closes #1378 .

Lower neon vcaled_f64

This PR adds a new boolean flag to the `cir.load` and the `cir.store` operation that distinguishes nontemporal loads and stores. Besides, this PR also adds support for the `__builtin_nontemporal_load` and the `__builtin_nontemporal_store` intrinsic function.

Resolves #1477 .

Lower vcales_f32

This PR adds an insertion guard for the try body scope for try-catch. Currently, the following code snippet fails during CodeGen: ``` void foo() { int r = 1; try { ++r; return; } catch (...) { } } ``` The insertion point doesn't get reset properly and the cleanup is being ran for a wrong/deleted block causing a segmentation fault. I also added a test.

The comments suggested that we should use TableGen to generate the recognizing functions. However, I think templates might be more suitable for generating them -- and I can't find any existing TableGen backends that let us generate arbitrary functions. My choice of design is to offer a template to match standard library functions: ```cpp // matches std::find with 3 arguments, and raise it into StdFindOp StdRecognizer<3, StdFindOp, StdFuncsID::Find> ``` I have to use a TableGen'd enum to map names to IDs, as we can't pass string literals to template arguments easily in C++17. This also constraints design of future `StdXXXOp`s: they must take operands the same way of StdFindOp, where the first one is the original function, and the rest are function arguments. I'm not sure if this approach is the best way. Please tell me if you have concerns or any alternative ways.

…was set explicitly (#1482) This is backported from a change made in llvm/llvm-project#131181 --------- Co-authored-by: Morris Hafner <mhafner@nvidia.com>

…R attribute. (#1467) Started decorating CUDA shadow variables with the shadow_name CIR attribute which will be used for registering the globals.

Closes #1354.

Lower neon vcaltd_f64

@mmha

… target was set explicitly" (#1509) Reverts #1482 @mmha this is crashing on macos on asserts build: ``` FAIL: Clang :: CIR/Tools/cir-translate/warn-default-triple.cir (472 of 552) ******************** TEST 'Clang :: CIR/Tools/cir-translate/warn-default-triple.cir' FAILED ******************** Exit Code: 134 Command Output (stdout): -- Assertion failed: (!DataLayoutString.empty() && "Uninitialized DataLayout!"), function getDataLayoutString, file TargetInfo.h, line 1282. ``` Perhaps besides picking a default you maybe need to do some missing datalayout init?

llvm/llvm-project#132269

Ref: cd269fe Related: #1497

…lementwise_acos (#1507) Closes: #1374 Replaces LLVMIntrinsicCallOp with ACosOp in __builtin_elementwise_acos.

https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised

Sub-issue of #1192. Adds CIR_ASinOp and support for __builtin_elementwise_asin.

This un-xfails the 6 files in #1497 related to variadic calls.

Lower vcalts_f32

Sub-issue of #1192. Adds CIR_ATanOp and support for __builtin_elementwise_atan.

Part of #258 . 1. Added `AddressPointAttr` 2. Change all occurrences of `VTableAddrPointOp` into using the attribute 3. Update tests --------- Co-authored-by: Sirui Mu <msrlancern@gmail.com> Co-authored-by: Morris Hafner <mmha@users.noreply.github.com> Co-authored-by: Morris Hafner <mhafner@nvidia.com> Co-authored-by: Sharp-Edged <48861530+Sharp-Edged@users.noreply.github.com> Co-authored-by: Amr Hesham <amr96@programmer.net> Co-authored-by: Bruno Cardoso Lopes <bruno.cardoso@gmail.com> Co-authored-by: Letu Ren <fantasquex@gmail.com>

Clang relies on `llvm::Intrinsic::getOrInsertDeclaration` to handle functions marked as `ClangBuiltin` in TableGen. That function receives a `CodeGenModule*` so CIR can't use that. We need to re-implement parts of it.

This patch bumps the windows CI container to windows server 2022 from windows server 2019. This is necessary as Github has sunsetted support for sever 2019, so we cannot build the container through GHA without updating. Using more recent versions is just good practice anyways. This will not roll out immediately and we'll have to make some TF changes to get deployed, but some additional validation first will be good anyways. Reviewers: lnihlen, tstellar, cmtice Reviewed By: cmtice Pull Request: llvm/llvm-project#148318 (cherry picked from commit 3e43915)

…style (#1741) - This adds common `CIR_` prefix to all operation disambiguating them when used with other dialects. - Unifies traits style in operation definitions

- This fixes default value to be expected 65535 - Introduces DefaultGlobalCtorDtorPriority constant - Makes function to use I32Attr for priority instead of unnecessary attribute with reference to function

Seems like this is the wrong approach. This reverts commit bc91ef4.

This updates the lowering of CIR function aliases in such a way that they now actually become aliases in the final LLVM IR.

…1740) This PR has two parts: 1. Mimicking the OG [special case](https://github.com/llvm/clangir/blob/d030c9bff74f4f9504a61abe9b2c04a8777028a5/clang/lib/CodeGen/CGException.cpp#L690) for a single catch-all when getting dispatch blocks. The huge testcase I added, gotten by using [creduce](https://github.com/csmith-project/creduce) on a c++ file, crashed at this point [in our version](https://github.com/llvm/clangir/blob/d030c9bff74f4f9504a61abe9b2c04a8777028a5/clang/lib/CIR/CodeGen/CIRGenException.cpp#L789). 2. Fixing multiple destructor calls for the same object. For example, there were tests like [#1](https://github.com/llvm/clangir/blob/d030c9bff74f4f9504a61abe9b2c04a8777028a5/clang/test/CIR/CodeGen/try-catch-dtors.cpp#L370C1-L372C80) and [#2](https://github.com/llvm/clangir/blob/d030c9bff74f4f9504a61abe9b2c04a8777028a5/clang/test/CIR/CodeGen/conditional-cleanup.cpp#L217C1-L224C25), having a second destructor call to an already destroyed object. This PR fixes these and I have updated the tests. Also, I added `"CIR-NEXT"` at some points, to confirm the destructors are indeed called once. As usual, please let me know if you have any concerns.

This patch backports changes made to the bit operations in the upstream PR llvm/llvm-project#148378. Namely, this patch includes the following changes: - This patch removes the `bit.` prefix in the op mnemonic. The operation names now directly correspond to the builtin function names except for bswap which is represented by `cir.byte_swap` for more clarity. - Since all bit operations are `SameOperandsAndResultType`, this patch updates their assembly format and avoids spelling out the operand type twice.

The LoweringPrepare pass was generating the wrong condition for loops when lowering the ArrayCtor op, causing only one element in an array of objects to be constructed. This fixes that problem.

Backporting passing enum kind directly to complex cast helpers

…ent (#1748) ## Overview Currently, getting the pointer to an element of an array requires a pointer decay and a (possible) pointer stride. A similar pattern for records has been eliminated with the `cir.get_member` operation. This PR provides a similar level of abstraction for arrays with the `get_element` operation. `get_element` replaces the above pattern with a single operation, which takes a pointer to an array and an index, and produces a pointer to the element at that index. There are many places in CIR analysis and lowering where the `ptr_stride(array_to_ptrdecay(x), i)` pattern is handled as a special case. By subsuming the special case pattern with an explicit operation, we make these analyses and lowering more robust. ## Changes Adds the `cir.get_element` operation. Extends CIRGen to emit `cir.get_element` for array subscript expressions. Updated LifetimeCheck to handle `get_element` operation, subsuming special case analysis of `cir.ptr_stride` operation (did not remove the special case). Extends CIR-to-LLVM lowering to lower `cir.get_element` to `llvm.getelementptr` Extends CIR-to-MLIR lowering to lower `cir.get_element` to `memref` operations, matching existing special case `cir.ptr_stride` lowering. ## Additional Notes Currently, 47.6% of `cir.ptr_stride` operations in the llvm-test-suite (SingleSource and MultiSource) can be replaced by `cir.get_element` operations. ### Operator Breakdown (current) name | count | % -- | -- | -- cir.load | 825221 | 22.27% cir.br | 429822 | 11.60% cir.const | 380381 | 10.26% cir.cast | 325646 | 8.79% cir.store | 309586 | 8.35% cir.get_member | 226895 | 6.12% cir.get_global | 186851 | 5.04% cir.ptr_stride | 158094 | 4.27% cir.call | 144522 | 3.90% cir.binop | 141142 | 3.81% cir.alloca | 134346 | 3.63% cir.brcond | 112864 | 3.05% cir.cmp | 83532 | 2.25% ### Operator Breakdown (with `get_element`) name | count | % -- | -- | -- cir.load | 825221 | 22.74% cir.br | 429822 | 11.84% cir.const | 380381 | 10.48% cir.store | 309586 | 8.53% cir.cast | 248645 | 6.85% cir.get_member | 226895 | 6.25% cir.get_global | 186851 | 5.15% cir.call | 144522 | 3.98% cir.binop | 141142 | 3.89% cir.alloca | 134346 | 3.70% cir.brcond | 112864 | 3.11% cir.cmp | 83532 | 2.30% cir.ptr_stride | 81093 | 2.23% cir.get_elem | 77001 | 2.12% --------- Co-authored-by: Andy Kaylor <akaylor@nvidia.com> Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz>

Implemented `noexcept` expression handling in CIR generation. Added a `noexcept.cpp` test based on cppreference. There was no OG test to base it off of, so I used the example code from [cppreference](https://en.cppreference.com/w/cpp/language/noexcept.html).

I think this one is self-explanatory, so I will not write much 🙂‍ Adding this attribute helps in optimizations like [#1653](#1653), and using the attribute it's easy to create operations like `cir.std.vector.ctor`/`cir.std.vector.dtor` by just modifying `IdiomRecognizer` a bit. I believe it will also be useful for future optimizations. Finally, I updated quite a number of tests so they now reflect this attribute. Please, let me know if you see any issues.

Implemented opportunistic vtable emission, which marks vtables as `available_externally` to enable inlining if optimizations are enabled. Added `GlobalOp` verifier support `available_externally` linkage type, all cases are covered now, so I removed the `default` case. Added the `vtable-available-externally` CIRGen test.

Fix lowering Complex to Complex cast, backported from llvm/llvm-project#149717

…r` (#1753) Implemented CIR code generation for `CXXPseudoDestructorExpr`. Added a pseudo destructor test to `CIR/CodeGen/dtors.cpp`.

#1760)

…trdecay` to `get_element` when possible (#1761) Extended the `CIRCanonicalizePass` with new rewrite rules: - Rewrite `ptr_stride (cast array_to_ptrdecay %base), %index` to `get_element %base[%index]` - Rewrite `ptr_stride (get_element %base[%index]), %stride` to `get_element %base[%index + %stride]` - Rewrite `cast array_to_ptrdecay %base, ptr<T>` to `get_element %base[0], ptr<T>` if it is only used by `load %ptr : T`, `store %val : T, %ptr`, or `get_member %ptr[field] : ptr<T> -> U` Updated CodeGen tests, and extended CIR-to-CIR test. --------- Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz>

`cir::PointerType` was not included in the applicability guard for `cir::VAArg` lowering during `LoweringPrepare`. Since we don't have generic LLVM `cir::VAArgOp` (see [more info](#1088 (comment))) this causes an NYI error during lowering that doesn't need to happen. To fix this I added the missing `cir::PointerType` to the `isa`. There is probably a more comprehensive fix to this if someone is interested, this check should be removed and let the (possible) error occur at the actual NYI site.

- Replaces dyn_cast<cir::ConstantOp>(v.getDefiningOp()) and similar with v.getDefiningOp<cir::ConstantOp>() - Adds `getValueAttr`, `getIntValue` and `getBoolValue` methods to ConstantOp

…Op()) (NFC) (#1765) This applies similar changes to llvm/llvm-project#150428

(Copied from my question on Discord) I’ve been working on the vector to bit-mask related intrinsics for X86. I’ve been stuck specifically on `X86::BI__builtin_ia32_cvtb2mask128(_mm256_movepi16_mask`) and its variations with different vector/mask sizes. In this case, we perform a vector comparison of `vector<16xi16>` and bitcast the resulting `vector<16xi1>` directly into a scalar integer mask (i16). I’m successfully able to lower to cir: ``` ... %5 = cir.vec.cmp(lt, %3, %4) : !cir.vector<!s16i x 16>, !cir.vector<!cir.int<u, 1> x 16> %6 = cir.cast(bitcast, %5 : !cir.vector<!cir.int<u, 1> x 16>), !u16i ... ``` There's an issue arises when lowering this to LLVM, the error message I'm getting is: ``` error: integer width of the output type is smaller or equal to the integer width of the input type ``` By looking at the test cases on the llvm dialect, this is related to the sext / zext instruction. This is the cir → llvm dialect lowered for the latter: ``` ... %14 = "llvm.icmp"(%12, %13) <{predicate = 2 : i64}> : (vector<16xi16>, vector<16xi16>) -> vector<16xi1> %15 = "llvm.sext"(%14) : (vector<16xi1>) -> vector<16xi1> %16 = "llvm.bitcast"(%15) : (vector<16xi1>) -> i16 ... ``` This is seems to be the cause: ``` %15 = "llvm.sext"(%14) : (vector<16xi1>) -> vector<16xi1> ``` **The fix**: Added a type check: if the result type does not differ from the expected type, we won't insert a sextOp

Implemented `CXXDeleteExpr` for concrete and virtual destructors. NYI, global delete, i.e., `::delete`. Added tests for both destructor types.

For these intrinsics there only seems to be one function where the IR emmited seems to diverge: for `_mm_load_sbh` loads a single 16-bit bfloat (__bf16) value from memory into the lowest element of a 128-bit bfloat vector (__m128bh), leaving the remaining lanes unchanged or filled with a passthrough value. It is implemented using a masked load with only the first lane enabled. [source for intrinsics with similar behaviour](https://gist.github.com/leopck/86799fee6ceb9649d0ebe32c1c6e5b85) In the CIR lowering of `_mm_load_sbh`, we are currently emitting the mask of intrinsic (`llvm.masked.load`) operand as an explicit constant vector: ``` llvm <8 x i1> <true, false, false, false, false, false, false, false> ``` whereas OG lowers: ```llvm <8 x i1> bitcast (<1 x i8> splat (i8 1) to <8 x i1>) ``` I believe both things are semantically equal so: Is it acceptable for CIR and OG to diverge in this way for masked loads, or should we aim for parity in how the mask is represented, even if that reduces readability in CIR?

Implement supporting for CK_LValueToRValueBitCast for ComplexType

xlauko · 2025-08-18T17:40:54Z

[CIR] Autogenerate most of function format #1833 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

AdUhTkJm and others added 30 commits April 9, 2025 15:18

[CIR][NFC] Fix test failures caused by double spaces in check line (#…

9e8806a

…1486) The pattern `call {{.*}} i32` mismatches `call i32` due to double spaces surrounding `{{.*}}`. This patch removes the first space to fix the failure.

[CIR] Fix assertion in CIRGenTypes::isFuncParamTypeConvertible (#1487)

17935d6

This PR resolves an assertion failure in `CIRGenTypes::isFuncParamTypeConvertible`, which is involved when trying to emit a vtable entry to a virtual function whose type includes a pointer-to-member-function.

[CIR][CIRGen][Builtin][Neon] Lower neon vabsd_s64 (#1489)

bce7507

Lower neon vabsd_s64

[CIR][CIRGen][Builtin][Neon] Lower neon vcaled_f64 (#1495)

e74e226

Lower neon vcaled_f64

[CIR][CIRGen][builtin] handle _mm_pause (#1493)

dd37e38

[CIR] Emit nsw flag for unary integer operations (#1485)

6a3e881

Resolves #1477 .

[CIR][CIRGen][Builtin][Neon] Lower vcales_f32 (#1500)

8f04109

Lower vcales_f32

[CIR][NFC] Fix a wrong test case in fc293bb (#1503)

126956d

cir-translate: Use default target triple instead of x86 if no target …

778bedb

…was set explicitly (#1482) This is backported from a change made in llvm/llvm-project#131181 --------- Co-authored-by: Morris Hafner <mhafner@nvidia.com>

[CIR][CUDA] Decorate global CUDA shadow variables with appropriate CI…

86a85ef

…R attribute. (#1467) Started decorating CUDA shadow variables with the shadow_name CIR attribute which will be used for registering the globals.

[CIR][CIRGen][Builtin] add __builtin_tan (#1502)

c66f98b

Closes #1354.

[CIR][CIRGen][Builtin][Neon] Lower neon vcaltd_f64 (#1505)

0004b81

Lower neon vcaltd_f64

[CIR] Match comment in upstream

571e56f

llvm/llvm-project#132269

[CIR] Backport clang commit to unxfail some builtin call (#1501)

6b19f09

Ref: cd269fe Related: #1497

[CIR][CodeGen] Replace LLVMIntrinsicCallOp with ACosOp in __builtin_e…

a442924

…lementwise_acos (#1507) Closes: #1374 Replaces LLVMIntrinsicCallOp with ACosOp in __builtin_elementwise_acos.

Drop use of comprimised action

2a6a8d8

https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised

Drop use of comprimised action

49be891

https://www.stepsecurity.io/blog/harden-runner-detection-tj-actions-changed-files-action-is-compromised

[CIR][CodeGen] Support for __builtin_elementwise_asin (#1511)

beccc39

Sub-issue of #1192. Adds CIR_ASinOp and support for __builtin_elementwise_asin.

[CIR] Un-xfail variadic calls (#1517)

f60e828

This un-xfails the 6 files in #1497 related to variadic calls.

[CIR][CIRGen][Builtin][Neon] Lower vcalts_f32 (#1520)

a61a0d8

Lower vcalts_f32

[CIR][CodeGen] Support for __builtin_elementwise_atan (#1512)

ec817c5

Sub-issue of #1192. Adds CIR_ATanOp and support for __builtin_elementwise_atan.

[CIR][CUDA] Handle clang builtin functions (#1496)

3af8b3f

Clang relies on `llvm::Intrinsic::getOrInsertDeclaration` to handle functions marked as `ClangBuiltin` in TableGen. That function receives a `CodeGenModule*` so CIR can't use that. We need to re-implement parts of it.

boomanaiden154 and others added 28 commits July 14, 2025 16:33

[CIR] Reformat Ops to use common CIR_ prefix and definition traits …

17aa78d

…style (#1741) - This adds common `CIR_` prefix to all operation disambiguating them when used with other dialects. - Unifies traits style in operation definitions

[CIR] Fix Global Ctor/Dtor priority attributes (#1743)

ee785a0

- This fixes default value to be expected 65535 - Introduces DefaultGlobalCtorDtorPriority constant - Makes function to use I32Attr for priority instead of unnecessary attribute with reference to function

Revert "[CI][Github] Bump Windows Container to Server 2022"

b0901d3

Seems like this is the wrong approach. This reverts commit bc91ef4.

[CIR] Implement function alias lowering (#1739)

504714a

This updates the lowering of CIR function aliases in such a way that they now actually become aliases in the final LLVM IR.

[CIR] Reformat Attr to use common CIR_ prefix and traits style (#1746)

ae25175

[CIR][NFC] Add placeholders for remaining x86 intrinsics (#1754)

8f89224

[CIR] Fix array init loop condition (#1758)

2a126d2

The LoweringPrepare pass was generating the wrong condition for loops when lowering the ArrayCtor op, causing only one element in an array of objects to be constructed. This fixes that problem.

[CIR][NFC] Pass enum kind directly to complex cast helpers (#1757)

754a11a

Backporting passing enum kind directly to complex cast helpers

[CIR] Fix lowering Complex to Complex cast (#1750)

1ea6ce8

Fix lowering Complex to Complex cast, backported from llvm/llvm-project#149717

[CIR][CodeGen] Implemented CIR generation for `CXXPseudoDestructorExp…

d423af4

…r` (#1753) Implemented CIR code generation for `CXXPseudoDestructorExpr`. Added a pseudo destructor test to `CIR/CodeGen/dtors.cpp`.

[CIR][CIRGen][Builtin][X86] Lower AVX generic masked store instrinsics (

83da585

#1760)

[CIR] Simplify ConstantOp accesses and its getDefiningOp (#1764)

e571d1c

- Replaces dyn_cast<cir::ConstantOp>(v.getDefiningOp()) and similar with v.getDefiningOp<cir::ConstantOp>() - Adds `getValueAttr`, `getIntValue` and `getBoolValue` methods to ConstantOp

[CIR] Use getDefiningOp<OpTy>() instead of dyn_cast<OpTy>(getDefining…

d45823a

…Op()) (NFC) (#1765) This applies similar changes to llvm/llvm-project#150428

[CIR][CodeGen] Implemented CXXDeleteExpr (#1756)

2ccc77e

Implemented `CXXDeleteExpr` for concrete and virtual destructors. NYI, global delete, i.e., `::delete`. Added tests for both destructor types.

[CIR] Implement CK_LValueToRValueBitCast for ComplexType (#1751)

b647f4b

Implement supporting for CK_LValueToRValueBitCast for ComplexType

XXX

f3e2e35

region format do not merge

9a98168

xlauko changed the title ~~XXX~~ [CIR] Autogenerate most of function format Aug 18, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[CIR] Autogenerate most of function format #1833

[CIR] Autogenerate most of function format #1833

xlauko commented Aug 18, 2025 •

edited

Loading

Uh oh!

xlauko commented Aug 18, 2025

Uh oh!

Uh oh!

[CIR] Autogenerate most of function format #1833

Are you sure you want to change the base?

[CIR] Autogenerate most of function format #1833

Conversation

xlauko commented Aug 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

xlauko commented Aug 18, 2025

Uh oh!

Uh oh!

xlauko commented Aug 18, 2025 •

edited

Loading