[ModuleSplitter] llvm module splitter for parallel llvm compilation. #2

weiweichen · 2025-03-25T20:29:48Z

No description provided.

Add int overloads which cast the various ints to a float and call the float builtin. These overloads are conditional on HLSL version 202x or earlier. Add tests and put them in their own files, including some of the tests added for double overloads. Closes llvm#128229

Since Clang 20 has been released, we no longer support Clang 18 per our policy. Note that the Clang 18 workarounds will be removed in a follow-up patch.

This adds some initial documentation about freestanding requirements for Clang. The most critical part of the documentation is spelling out that a conforming freestanding C Standard Library is required; Clang will not be providing the header for <string.h>, which in C23 exposes a number of symbols in freestanding mode. The docs also make it clear that, in addition to a conforming freestanding C standard library, the library must provide some additional symbols which LLVM requires. These docs are not comprehensive; this just gets the bare bones in place so that they can be expanded later. This also updates the C status page to make it clear that we don't have anything to do for WG14 N2524, which adds string interfaces to freestanding mode.

C2y adds the `_Countof` operator which returns the number of elements in an array. As with `sizeof`, `_Countof` either accepts a parenthesized type name or an expression. Its operand must be (of) an array type. When passed a constant-size array operand, the operator is a constant expression which is valid for use as an integer constant expression. This is being exposed as an extension in earlier C language modes, but not in C++. C++ already has `std::extent` and `std::size` to cover these needs, so the operator doesn't seem to get the user enough benefit to warrant carrying this as an extension. Fixes llvm#102836

The main goal of this patch is to improve the performance of concept subsumption by:
- Making sure literal (atomic) clauses are de-duplicated (whether two atomic constraints are identical is established during the initial normal form production).
- Eagerly removing duplicated clauses.

This should minimize the risk of exponentially large formulas that can be produced by a naive {C,D}NF transformation. While at it, I restructured that part of the code to be a bit clearer. Subsumption of fold expanded constraints is also cached.

---

Note that removing duplicated clauses seems to be necessary and sufficient to have acceptable performance on anything that could be construed as reasonable code. Ultimately, the number of clauses is always going to be fairly small (but $2^{fairly\ small}$ is quickly *fairly large*...). I went too far down the rabbit hole of Tseitin transformations etc., which was much faster but would then require checking satisfiability to establish subsumption between some constraints (although it was good enough to pass all but one of our tests...). It doesn't help that the C++ standard has a very specific definition of subsumption that is really more of an implication... While that sort of musing is fascinating, it was ultimately a fool's errand, at least until such time that there is more motivation for a SAT solver in clang (clang-tidy can after all use z3!). Here be dragons.

Fixes llvm#122581
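To make the effect of eager de-duplication concrete, here is a minimal sketch of a CNF-to-DNF distribution that stores clauses in sets, so identical atoms and identical clauses collapse as soon as they are produced. It is not Clang's implementation: the `Clause` and `Formula` aliases, the integer atom ids, and the function names are all hypothetical, and it only removes duplicates (no absorption of subsumed clauses).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical representation: an atom is an integer id that stands in for
// an already de-duplicated atomic constraint.
using Clause = std::set<int>;     // a repeated atom is stored once
using Formula = std::set<Clause>; // a repeated clause is stored once

// Convert a CNF formula (AND of ORs) into DNF (OR of ANDs) by distribution.
// Because Formula is a set, duplicated clauses are dropped after every step.
Formula cnfToDnf(const std::vector<Clause> &Cnf) {
  Formula Dnf = {Clause{}}; // start with a single empty conjunction
  for (const Clause &Disjunction : Cnf) {
    Formula Next;
    for (const Clause &Conjunction : Dnf) {
      for (int Atom : Disjunction) {
        Clause Extended = Conjunction;
        Extended.insert(Atom);
        Next.insert(std::move(Extended));
      }
    }
    Dnf = std::move(Next);
  }
  return Dnf;
}

// Number of clauses a naive distribution would emit without de-duplication.
std::size_t naiveClauseCount(const std::vector<Clause> &Cnf) {
  std::size_t Count = 1;
  for (const Clause &C : Cnf)
    Count *= C.size();
  return Count;
}

// Usage: ten copies of (a || b) would naively expand to 2^10 clauses.
void demoDeduplication() {
  std::vector<Clause> Cnf(10, Clause{1, 2});
  assert(naiveClauseCount(Cnf) == 1024);
  assert(cnfToDnf(Cnf).size() == 3); // {1}, {2}, {1, 2}
}
```

In this toy case the intermediate formula never grows beyond three clauses, which is the kind of blow-up avoidance the description is after.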

We have a lot of missing Codesize costs for vector operations. This patch starts things off by adding codesize costs for getVectorInstrCost, returning a single cost instead of the VectorInsertExtractBaseCost (which is typically 2). Inserts of a load are given a cost of 0 as they use ld1; otherwise the cost is 1.

) Current usage of alignstack is restricted to LLVM pointer types, whereas when it's used in parameters it's possible to use it for other types; see examples like `{i8, i8}, [2 x float], etc` in `llvm/test/CodeGen`. This PR lifts the restriction and adds test cases.

…duling model (llvm#133165) P500-series cores should have a floating point load latency closer to 5 cycles, just like P400- and P600-series cores.

…rface (llvm#133274) The primary reason is that if you pass a TypeSize without explicitly converting it to LocationSize, you otherwise implicitly convert to uint64_t to call the respective LocationSize constructor. This means that any scalable value becomes a runtime assertion failure. By replacing uint64_t with TypeSize in this API, we avoid the implicit conversion for TypeSize. uint64_t callers implicitly convert to LocationSize (via the raw constructor), which should have unchanged behavior.
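The overload-resolution point can be illustrated with a small self-contained sketch. The `MockTypeSize` and `MockLocationSize` types and the `describeAccess*` functions below are hypothetical stand-ins written for this example; they are not the real `llvm::TypeSize`, `llvm::LocationSize`, or the API touched by the patch, and only mimic the conversion behavior described above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a size that may be scalable.
struct MockTypeSize {
  uint64_t MinValue;
  bool Scalable;
  // Implicit conversion that is only valid for fixed (non-scalable) sizes.
  operator uint64_t() const {
    assert(!Scalable && "scalable size converted to a fixed integer");
    return MinValue;
  }
};

// Hypothetical stand-in for a location size with a "raw" integer constructor.
struct MockLocationSize {
  uint64_t Value;
  bool Scalable;
  MockLocationSize(uint64_t Raw) : Value(Raw), Scalable(false) {}
  static MockLocationSize precise(MockTypeSize TS) {
    MockLocationSize LS(TS.MinValue);
    LS.Scalable = TS.Scalable;
    return LS;
  }
};

// Old shape: a MockTypeSize argument silently goes through operator uint64_t,
// so a scalable value would trip the assertion at run time.
MockLocationSize describeAccessOld(uint64_t Size) {
  return MockLocationSize(Size);
}

// New shape: one overload takes the size type directly, the other takes the
// location size. Plain integers still reach the second overload through the
// raw constructor.
MockLocationSize describeAccess(MockTypeSize Size) {
  return MockLocationSize::precise(Size);
}
MockLocationSize describeAccess(MockLocationSize Size) { return Size; }
```

With these overloads, `describeAccess(MockTypeSize{16, true})` keeps the scalable flag, while `describeAccess(uint64_t{8})` behaves as a plain fixed-size integer caller did before.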

Test just needs an explicit triple that was missed.

Commit 20b7f59 includes a case that checks diagnostics for `for` loops using thread locals. This fails on platforms which do not support TLS. This change adds guards to run this part of the test iff the feature is supported.

llvm#132690) Keep the start value as operand of ComputeFindLastIVResult. A follow-up patch will use this to make sure the start value is frozen if needed. Depends on llvm#132689 PR: llvm#132690

…llvm#132742) This patch adds the `IsolatedFromAbove` trait as a dependent trait to the `DataLayoutOpInterface` op interface. The motivation behind this change comes from the implementation of the `ptr` dialect, specifically the `ptr.type_offset` op. This op produces an int-like value that equates to the size of a memory element. This is useful for ptr arithmetic and indexing arrays. For example:

```mlir
%f32_off = ptr.type_offset f32 : index
%addr = ptr.ptradd %ptr, %f32_off : !ptr, index
%x = ptr.load %addr : !ptr -> f32 // Read ptr[1]
```

Without the `IsolatedFromAbove` trait in the DL interface, the `ptr.type_offset` cannot be `ConstantLike`. Why? Take the example:

```mlir
op {DL1} {
  %f32_off0 = ptr.type_offset f32 : index
  op {DL2} {
    %f32_off1 = ptr.type_offset f32 : index
  }
}
```

If `ptr.type_offset` were to be `ConstantLike` then `canonicalize` would hoist and unique the value. However, that could be wrong as DL2 could have an entry to specify the size that's different from the size in DL1. The best solution to the above problem is to make `DataLayoutOpInterface` require the `IsolatedFromAbove` trait, as it preserves the constness of values in the DL with respect to the canonicalizer.

…and [nfc] These were added in d584cea. This change runs through existing uses and simplifies where obvious.

Summary: I am probably the person most familiar with the offloading pipeline in clang at this point.

Adds extra test coverage showing change by llvm#128045.

This patch bumps the CI container to the latest LLVM Release and gets rid of the patch that we were carrying that is in 20.1.1. Reviewers: tstellar Reviewed By: tstellar Pull Request: llvm#132567

This helps keep things up to date, and should not cause any issues given we do not need to care about binary compatibility for things built in the CI container. This patch also changes the name of the container which allows incrementally moving jobs over after this lands. Reviewers: tstellar Reviewed By: tstellar Pull Request: llvm#132568

…m#133299) This reverts commit d7cea2b. It causes crashes in API tests.

…llvm#133283) Summary: The llvm#128509 patch introduced `--flto-partitions`. This was marked as a HIP only argument, and was also spelled and handled incorrectly for an `-f` option. This patch makes the handling generic for `ld.lld` consumers. This also fixes an issue where the flags were emitted after the default arguments, preventing users from overriding them. Also, forwards things properly for the new driver so we can test this.

The new function will return `std::nullopt` when any error occurs.
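As an illustration of the "return `std::nullopt` on any error" pattern, here is a minimal sketch that parses a `"first,second"` string into a pair of unsigned integers. The `parseIntegerPair` name, its signature, and the input format are assumptions made for this example; they are not the signature of the AMDGPU function added by the commit.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical helper: returns the parsed pair, or std::nullopt if the text
// is not exactly two unsigned decimal integers separated by one comma.
std::optional<std::pair<unsigned, unsigned>>
parseIntegerPair(std::string_view Text) {
  const std::size_t Comma = Text.find(',');
  if (Comma == std::string_view::npos)
    return std::nullopt;

  // Parse one component; reject empty input, signs, and trailing characters.
  auto ParseOne = [](std::string_view Part) -> std::optional<unsigned> {
    unsigned Value = 0;
    const char *Begin = Part.data();
    const char *End = Begin + Part.size();
    auto [Ptr, Ec] = std::from_chars(Begin, End, Value);
    if (Ec != std::errc() || Ptr != End)
      return std::nullopt;
    return Value;
  };

  std::optional<unsigned> First = ParseOne(Text.substr(0, Comma));
  std::optional<unsigned> Second = ParseOne(Text.substr(Comma + 1));
  if (!First || !Second)
    return std::nullopt;
  return std::make_pair(*First, *Second);
}
```

A caller can then distinguish "attribute absent or malformed" from a valid value with a single `if (auto Pair = parseIntegerPair(Text))` check instead of relying on a sentinel default.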

This patch refactors the generate_test_report script, namely turning it into a proper library, and pulling the script/unittests out into separate files, as is standard with most python scripts. The main purpose of this is to enable reusing the library for the new Github premerge. Reviewers: tstellar, DavidSpickett, Keenuts, lnihlen Reviewed By: DavidSpickett Pull Request: llvm#133196

Adds wide integer emulation support for the `arith.subi` op. `(i2N, i2N) -> (i2N)` ops are emulated as `(vector<2xiN>, vector<2xiN>) -> (vector<2xiN>)`, just as the other emulation patterns. The emulation uses the following scheme:

```
resLow = lhsLow - rhsLow; // carry = 1 if rhsLow > lhsLow
resHigh = lhsHigh - carry - rhsHigh;
```

Signed-off-by: Ege Beysel <beysel@roofline.ai>
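The borrow scheme can be checked in plain C++ by emulating a 64-bit subtraction with two 32-bit halves. This is a sketch for intuition only, assuming the high result is computed from the high halves of both operands; the `Wide` struct and the helper names are hypothetical and unrelated to the MLIR pattern's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-half representation of a 64-bit value.
struct Wide {
  uint32_t Low;
  uint32_t High;
};

Wide split(uint64_t V) {
  return {static_cast<uint32_t>(V), static_cast<uint32_t>(V >> 32)};
}

uint64_t join(Wide W) {
  return (static_cast<uint64_t>(W.High) << 32) | W.Low;
}

// Subtract using only 32-bit arithmetic. Unsigned arithmetic wraps modulo
// 2^32, which is what the emulation relies on.
Wide subWide(Wide Lhs, Wide Rhs) {
  uint32_t ResLow = Lhs.Low - Rhs.Low;
  uint32_t Borrow = Rhs.Low > Lhs.Low ? 1u : 0u; // the "carry" in the scheme
  uint32_t ResHigh = Lhs.High - Borrow - Rhs.High;
  return {ResLow, ResHigh};
}

// Usage: compare against native 64-bit subtraction.
void demoSubWide() {
  uint64_t A = 0x100000000ULL, B = 1;
  assert(join(subWide(split(A), split(B))) == A - B); // 0x00000000FFFFFFFF
}
```

The low half wraps around when `rhsLow > lhsLow`, and the borrow then removes one from the high half, exactly as in schoolbook subtraction.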

`env -u` is not supported by the system `env` utility on AIX. `/opt/freeware/bin/env` is the standard path for the GNU coreutils `env` utility as distributed by the AIX Toolbox for Open Source Software. Adding `/opt/freeware/bin` to `PATH` causes issues by picking up other utilities that are less capable, in an AIX context, than the system ones. This patch modifies the relevant usage of `env` to use (on AIX) the full path to `/opt/freeware/bin/env`.

Emit progress events from SymbolFileDWARFDebugMap. Because we know the number of OSOs, we can show determinate progress. This is based on a patch from Adrian, and part of what prompted me to look into improving how LLDB shows progress events. Before the statusline, all these progress events would get shadowed and never displayed on the command line.

…lvm#129262) Added a new setting called `lldb-dap.arguments` and a debug configuration attribute called `debugAdapterArgs` that can be used to set the arguments used to launch the debug adapter. Right now this is mostly useful for debugging purposes to add the `--wait-for-debugger` option to lldb-dap.

Additionally, the extension will now check for a changed lldb-dap executable or arguments when launching a debug session in server mode. I had to add a new `DebugConfigurationProvider` to do this because VS Code will show an unhelpful error modal when the `DebugAdapterDescriptorFactory` returns `undefined`.

In order to facilitate this, I had to add two new properties to the launch configuration that are used by the `DebugAdapterDescriptorFactory` to tell VS Code how to launch the debug adapter:
- `debugAdapterHostname` - the hostname for an existing lldb-dap server
- `debugAdapterPort` - the port for an existing lldb-dap server

I've also removed the check for the `executable` argument in `LLDBDapDescriptorFactory.createDebugAdapterDescriptor()`. This argument is only set by VS Code when the debug adapter executable properties are set in the `package.json`. The LLDB DAP extension does not currently do this (and I don't think it ever will). So, this makes the debug adapter descriptor factory a little easier to read.

The check for whether or not `lldb-dap` exists has been moved into the new `DebugConfigurationProvider` as well. This way the extension won't get in the user's way unless they actually try to start a debugging session. The error will show up as a modal which will also make it more obvious when something goes wrong, rather than popping up as a warning at the bottom right of the screen.

…lvm#133176) Currently only ctor/dtor list and their priorities are supported. This PR adds support for the missing data field.

A few implementation notes:
- The assembly printer has a fixed form because the previous `attr_dict` sorts the dict by key name, making global_dtor and global_ctor differ in the order of printed arguments.
- LLVM's `ptr null` is converted to `#llvm.zero`; otherwise we'd have to create a region to use the default operation conversion from `ptr null`, which is silly given that the field only supports null or a symbol.

Do cleanup in DXILFinalizeLinkage.cpp where intrinsic declares are getting orphaned. This change reduces "Unsupported intrinsic for DXIL lowering" errors when compiling DML shaders from 12218 to 415 and improves our compilation success rate from less than 1% to 44%.

Closes llvm#99156.

Tasks completed:
- Implement `smoothstep` using HLSL source in `hlsl_intrinsics.h`
- Implement the `smoothstep` SPIR-V target built-in in `clang/include/clang/Basic/BuiltinsSPIRV.td`
- Add sema checks for `smoothstep` to `CheckSPIRVBuiltinFunctionCall` in `clang/lib/Sema/SemaSPIRV.cpp`
- Add codegen for spv `smoothstep` to `EmitSPIRVBuiltinExpr` in `clang/lib/CodeGen/TargetBuiltins/SPIR.cpp`
- Add codegen tests to `clang/test/CodeGenHLSL/builtins/smoothstep.hlsl`
- Add spv codegen test to `clang/test/CodeGenSPIRV/Builtins/smoothstep.c`
- Add sema tests to `clang/test/SemaHLSL/BuiltIns/smoothstep-errors.hlsl`
- Add spv sema tests to `clang/test/SemaSPIRV/BuiltIns/smoothstep-errors.c`
- Create the `int_spv_smoothstep` intrinsic in `IntrinsicsSPIRV.td`
- In SPIRVInstructionSelector.cpp create the `smoothstep` lowering and map it to `int_spv_smoothstep` in `SPIRVInstructionSelector::selectIntrinsic`
- Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/hlsl-intrinsics/smoothstep.ll`
- Create SPIR-V backend test case in `llvm/test/CodeGen/SPIRV/opencl/smoothstep.ll`

This patch migrates the CI over to the new compute_projects.py script for calculating what projects need to be tested based on a change to LLVM. Reviewers: lnihlen, ldionne, tstellar, Endilll, joker-eph, Keenuts Reviewed By: Keenuts, tstellar Pull Request: llvm#132642

Currently when someone touches a docs directory in a subproject, it is treated as if the source code of that project got touched, so the project is built, it is tested, and the same for all of its enumerated dependents. This is wasteful, particularly for patches just touching docs in places like LLVM where we might spend an hour of node time to do nothing useful given changes in the docs shouldn't cause test failures and there is already another workflow that tests the documentation build completes successfully. Reviewers: Keenuts, tstellar, lnihlen Reviewed By: tstellar Pull Request: llvm#133185

This patch adds rich test failure information to the Github output, using the same library that is used for the buildkite pipeline. Eventually I think we want to add more information like reproduction information using the containers, but that is very divergent between Github and Buildkite, so we probably want to wait until we've switched over before doing that. Reviewers: Keenuts, tstellar, lnihlen, DavidSpickett Reviewed By: DavidSpickett, Keenuts Pull Request: llvm#133197

The coding standards state that error messages should start with a lowercase letter. Also use WithColor, and add missing test coverage for the failed to write to output file case.

Avoid capitalized Error. This loses the "Error constructing pass pipeline" part, and just forwards the error to the default report_fatal_error case. Not sure if it's worth trying to keep.

- Moves the verification logic to the `verifyRegions` method of the parent operation. - Fixes a crash during verification when the last block lacks a terminator. Fixes llvm#132850.

Add an entry for pcsections Metadata that references the PC Sections Metadata document. Fixes: llvm#130552

Both traits do the same thing. This patch renames __can_reference to __referenceable and moves it to the is_referenceable header.

…133420)

)

…33466) - rename `GISelKnownBits` to `GISelValueTracking` to analyze more than just `KnownBits` in the future

Implement asinh for Float16 along with tests. Closes llvm#131001

Setting `EnableLoopTermFold` enables `loop-term-fold` pass.

Reverts llvm#133584

…e DSA (llvm#133232) Issue: The Cray pointer is not associated with the Cray pointee, leading to a segmentation fault. Fix: GetUltimate retrieves the base symbol in the current scope; it gets past all the references and returns the original symbol. --------- Co-authored-by: Michael Klemm <michael.klemm@amd.com>

…rgs.cpp`

Unlike for `flat_map` and `flat_multimap`, the wording of the two function template overloads of `flat_set::insert` strongly suggests that we should use the transparent comparator: https://eel.is/c++draft/flat.set#modifiers-1 Neither the code nor the tests were using the transparent comparator, which needs to be fixed.

This patch implements a simple printing pass for regions. This is meant to be used in tests and for debugging.

…demanded elts across all EXTRACT_SUBVECTOR uses (REAPPLIED) (llvm#133401) Similar to what is done for visitEXTRACT_VECTOR_ELT - if all uses of a vector are EXTRACT_SUBVECTOR, then determine the accumulated demanded elts across all users and call SimplifyDemandedVectorElts in "AssumeSingleUse" use. Second try after llvm#133130 was reverted by llvm#133331 due to it affecting reverted test files

…pcrel clang -fexperimental-relative-c++-abi-vtables might generate `@plt` and `@gotpcrel` specifiers in data directives. The syntax is not used in human-written assembly code, and is not supported by GNU assembler. Note: the `@plt` in `.word foo@plt` is different from the legacy `call func@plt` (where `@plt` is simply ignored).

The `@plt` syntax was selected simply due to a quirk of AsmParser: the syntax was supported by all targets until I updated it to be an opt-in feature in a067175.

RISC-V favors the `%specifier(expr)` syntax following MIPS and Sparc, and we should follow this convention. This PR adds support for `.word %pltpcrel(foo+offset)` and `.word %gotpcrel(foo)`, and drops `@plt` and `@gotpcrel`.

* MCValue::SymA can no longer have a SymbolVariant. Add an assert similar to that of AArch64ELFObjectWriter.cpp before https://reviews.llvm.org/D81446 (see my analysis at https://maskray.me/blog/2025-03-16-relocation-generation-in-assemblers if intrigued)
* `jump foo@plt, x31` now has a different diagnostic.

Pull Request: llvm#132569

…se-time constants An immediate operand is encoded as an `MCExpr`, with `RISCVMCExpr` specifying an operand that includes a relocation specifier. When https://reviews.llvm.org/D23568 added initial fixup and relocation support in 2017, it adapted code from `PPCMCExpr` (for `@l` `@ha`) to evaluate the `RISCVMCExpr` operand. (PPCAsmParser had considerable technical debt, though I’ve recently streamlined it somewhat, e.g. 8560da2)

Evaluating RISCVMCExpr during parsing is unnecessary. For example, there's no need to treat `lui a0, %hi(2)` differently from `lui a0, %hi(foo)` when foo has not been defined yet. This evaluation introduces unnecessary complexity. For instance, parser functions require an extra check like `VK == RISCVMCExpr::VK_None`, as seen in these examples:

```
if (!evaluateConstantImm(getImm(), Imm, VK) || VK != RISCVMCExpr::VK_None)
return IsConstantImm && isUInt<N>(Imm) && VK == RISCVMCExpr::VK_None;
```

This PR eliminates the parse-time evaluation of `RISCVMCExpr`, aligning it more closely with other targets.

---

`abs = 0x12345; lui t3, %hi(abs)` now generates R_RISCV_HI20/R_RISCV_RELAX with linker relaxation. (Tested by test/MC/RISCV/linker-relaxation.s) (Notably, since commit ba2de8f in lld/ELF, the linker can handle R_RISCV_HI relocations with a symbol index of 0 in -pie mode.)

Pull Request: llvm#133377

…eic/llvm-module-splitter

spall and others added 30 commits March 27, 2025 09:34

[libc++] Remove official Clang 18 support. (llvm#130142)

82c078c

Since Clang 20 has been released, we no longer support Clang 18 per our policy. Note that the Clang 18 workarounds will be removed in a follow-up patch.

[RISCV] Add test case for PR llvm#133256

08bb0b8

[RISCV] Update the latency of floating point load in SiFive P500 sche…

aa207c3

…duling model (llvm#133165) P500-series cores should have a floating point load latency closer to 5 cycles, just like P400- and P600-series cores.

Fix failing test case for _Countof

4480f26

Test just needs an explicit triple that was missed.

[VPlan] Manage FindLastIV start value in ComputeFindLastIVResult (NFC) (

8ddbc01

llvm#132690) Keep the start value as operand of ComputeFindLastIVResult. A follow-up patch will use this to make sure the start value is frozen if needed. Depends on llvm#132689 PR: llvm#132690

[CodeGen] Simplify code using TypeSize overloads of getMachineMemOper…

c90a536

…and [nfc] These were added in d584cea. This change runs through existing uses and simplifies where obvious.

[Clang] Add 'Joseph Huber' as offloading driver maintainer (llvm#133296)

a243279

Summary: I am probably the person most familiar with the offloading pipeline in clang at this point.

[LAA] Add missing test coverage for retrying with runtime checks.

8bdcd0a

Adds extra test coverage showing change by llvm#128045.

[Github] Bump CI container to LLVM 20.1.1

5cb3052

This patch bumps the CI container to the latest LLVM Release and gets rid of the patch that we were carrying that is in 20.1.1. Reviewers: tstellar Reviewed By: tstellar Pull Request: llvm#132567

Revert "[lldb] Remove UnwindPlan::Row shared_ptrs (llvm#132370)" (llv…

48864a5

…m#133299) This reverts commit d7cea2b. It causes crashes in API tests.

[VPlan] Add assertion ensuring Plan's UF matches BestUF (NFC).

5eccd71

[AMDGPU] Add a new function getIntegerPairAttribute (llvm#133271)

02b45f4

The new function will return `std::nullopt` when any error occurs.

[RISCV] Sort list of files. NFC.

ee0009c

farzonl and others added 30 commits March 29, 2025 00:45

[RISCV] Test %hi(absolute_symbol)

1e62a35

llvm-reduce: Make some error messages more consistent (llvm#133563)

e87bec6

The coding standards state that error messages should start with a lowercase letter. Also use WithColor, and add missing test coverage for the failed to write to output file case.

llvm-reduce: Make run-ir-passes error more consistent (llvm#133564)

5b3e152

Avoid capitalized Error. This loses the "Error constructing pass pipeline" part, and just forwards the error to the default report_fatal_error case. Not sure if it's worth trying to keep.

[mlir][spirv] Update verifier for spirv.mlir.merge (llvm#133427)

1878259

- Moves the verification logic to the `verifyRegions` method of the parent operation. - Fixes a crash during verification when the last block lacks a terminator. Fixes llvm#132850.

[LangRef] Add entry for pcsections Metadata (llvm#133423)

fa5025b

Add an entry for pcsections Metadata that references the PC Sections Metadata document. Fixes: llvm#130552

[libc++] Unify __can_reference and __is_referenceable_v (llvm#133278)

646ad89

Both traits do the same thing. This patch renames __can_reference to __referenceable and moves it to the is_referenceable header.

llvm-reduce: Fix losing callsite attributes in operand-to-args (llvm#…

5b5f402

…133420)

llvm-reduce: Fix losing fast math flags in operands-to-args (llvm#133421

d3d4a24

)

llvm-reduce: Fix losing call metadata in operands-to-args (llvm#133422)

d852cc5

[GlobalISel][NFC] Rename GISelKnownBits to GISelValueTracking (llvm#1…

1d0005a

…33466) - rename `GISelKnownBits` to `GISelValueTracking` to analyze more than just `KnownBits` in the future

[libc][math][c23] Add asinhf16 function (llvm#131351)

d22e35b

Implement asinh for Float16 along with tests. Closes llvm#131001

[NFC] LLVM reduce: fix unused variable (llvm#133584)

9451617

MIPS: Set EnableLoopTermFold (llvm#133454)

1c7ab39

Setting `EnableLoopTermFold` enables `loop-term-fold` pass.

Revert "[NFC] LLVM reduce: remove unused variable" (llvm#133586)

0bed721

Reverts llvm#133584

[NFC][AMDGPU] clang-format AMDGPUBaseInfo.[h,cpp] (llvm#133559)

3e742b5

[NFC][llvm-reduce] Fix an used variable warning in `ReduceOperandsToA…

847cdd4

…rgs.cpp`

[lld] Use *Set::insert_range (NFC) (llvm#133565)

1ff7491

[SandboxVec] Add print-region pass (llvm#131019)

3db5be7

This patch implements a simple printing pass for regions. This is meant to be used in tests and for debugging.

[X86] Use MCRegister. NFC

2ec8837

Merge branch 'main' of https://github.com/llvm/llvm-project into weiw…

fce31b5

…eic/llvm-module-splitter
