[pull] main from llvm:main #185

pull · 2025-12-01T15:41:20Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


Runs the `std::shared/unique_ptr` tests with PDB with two changes:
- PDB uses the "full" name, so `std::string` is `std::basic_string<char, std::char_traits<char>, std::allocator<char>>`
- The type of the pointer inside the shared/unique_ptr isn't the `element_type` typedef

This change introduces a new IR pass in the llc pipeline for NVPTX that transforms sequences of FMUL followed by FADD or FSUB into a single FMA instruction. Currently, all FMA folding for NVPTX occurs at the DAGCombine stage, which is too late for any IR-level passes that might want to optimize or analyze FMAs. By moving this transformation earlier into the IR phase, we enable more opportunities for FMA folding, including across basic blocks. Additionally, this new pass relies on the contract instruction level fast-math flag to perform these transformations, rather than depending on the -fp-contract=fast or -enable-unsafe-fp-math options passed to llc.
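Why such a fold needs explicit permission (the `contract` flag) can be seen in plain C++: a fused multiply-add rounds once, while a separate multiply and add round twice, so the two can give different results. The sketch below is only an illustration of that arithmetic, not the pass itself; all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Unfused: the product is rounded to double before the add.
// The volatile store keeps the compiler from contracting this into an FMA.
double mul_then_add(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused: a * b + c is computed with a single rounding.
double fused_mul_add(double a, double b, double c) {
  return std::fma(a, b, c);
}

// Hypothetical check: the exact product (1 + 2^-30)(1 - 2^-30) = 1 - 2^-60
// rounds to 1.0 as a double, so only the fused form keeps the 2^-60 term.
void check_rounding_difference() {
  const double eps = std::ldexp(1.0, -30);
  const double a = 1.0 + eps;
  const double b = 1.0 - eps;
  assert(mul_then_add(a, b, -1.0) == 0.0);
  assert(fused_mul_add(a, b, -1.0) == -std::ldexp(1.0, -60));
}
```

Because the results differ, the fold is only valid where the instructions themselves allow contraction.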

Fixed the argument types of the following intrinsics to match the ISA:
- vpdpwssd_128, vpdpwssd_256, vpdpwssd_512
- vpdpwssds_128, vpdpwssds_256, vpdpwssds_512
- vpdpwsud_128, vpdpwsud_256, vpdpwsud_512
- vpdpwsuds_128, vpdpwsuds_256, vpdpwsuds_512
- vpdpwusd_128, vpdpwusd_256, vpdpwusd_512
- vpdpwusds_128, vpdpwusds_256, vpdpwusds_512
- vpdpwuud_128, vpdpwuud_256, vpdpwuud_512
- vpdpwuuds_128, vpdpwuuds_256, vpdpwuuds_512

Fixes #97271. Note that this is the last PR for the issue.

LLVM has pretty thorough support for `int128`, and it has started seeing some use. Even though we already have support for the `SPV_ALTERA_arbitrary_precision_integers` extension, the BE was oddly capping integer width to 64 bits. This patch adds partial support for lowering 128-bit integers to `OpTypeInt 128`. Some work remains to be done around legalisation support and validating constant uses (e.g. cases that get lowered to `OpSpecConstantOp`).

…and for OpenCL (#167652) For the extended image insts `amdgcn_image_sample_*_/gather4_*` builtins, use 'x' in the builtin def so that it will take `_Float16` for both HIP/C++ and OpenCL.

Added masked compress builtin in CIR. Note: This is my first PR to llvm. Looking forward to corrections.

Co-authored-by: bhuvan1527 <balabhuvanvarma@gmail.com>

Fixes warning: build.rst:107: WARNING: 'any' reference target not found: https://visualstudio.microsoft.com

RST tries to resolve text in single backticks as a reference. Double backticks mean plain text. Fixes these warnings:
- map.rst:803: WARNING: 'any' reference target not found: target.prefer-dynamic-value
- map.rst:814: WARNING: 'any' reference target not found: expr

…antiations on non-MSVC targets (#168170) Previously, even when MSVC compatibility was not requested, inline move constructors in dllexport-ed templates were not exported, which was seemingly unintended. On non-MSVC targets (MinGW, Cygwin, and PS), such move constructors should be exported consistently with copy constructors and with the behavior of modern MSVC.

…0875) Define `LHS.subsetOf(RHS)` as a more descriptive name for `!LHS.test(RHS)` and update the existing callers to use that name. Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>
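The relationship between the two spellings can be sketched with `std::bitset`. This assumes `test(RHS)` reports whether any bit is set in `LHS` but not in `RHS`, which is what the new name implies; the free functions below are hypothetical stand-ins, not the LLVM API.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Stand-in for the assumed semantics of LHS.test(RHS):
// true if some bit is set in lhs but clear in rhs.
template <std::size_t N>
bool test_any_outside(const std::bitset<N>& lhs, const std::bitset<N>& rhs) {
  return (lhs & ~rhs).any();
}

// The more descriptive spelling: lhs is a subset of rhs exactly when
// no bit of lhs falls outside rhs.
template <std::size_t N>
bool subset_of(const std::bitset<N>& lhs, const std::bitset<N>& rhs) {
  return !test_any_outside(lhs, rhs);
}

// Hypothetical check using 8-bit sets.
void check_subset_of() {
  const std::bitset<8> small(0x06);  // bits 1 and 2
  const std::bitset<8> large(0x0E);  // bits 1, 2 and 3
  assert(subset_of(small, large));
  assert(!subset_of(large, small));
  assert(subset_of(std::bitset<8>(), small));  // the empty set is a subset
}
```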

Since PDB doesn't have template information, we need to get the element type from somewhere else. I'm using the type of `_Myval` in a list node, which holds the element type.

…liner.cpp (NFC)

…source values are bitcast from vectors (#171481)

…70993) Add the ability to defer parsing and re-enqueue oneself. This enables changing CallSiteLoc parsing to not recurse as deeply: previously this could fail (especially on large inputs in debug mode, where the recursion could overflow). Add a default depth cutoff; this could be a parameter later if needed.
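The general idea of deferring and re-enqueueing can be sketched with an explicit worklist. This is a generic illustration, not the reader's actual code: `Entry`, `parent`, and `resolve_chain_lengths` are hypothetical, and unlike the change described above the sketch always uses the worklist rather than recursing up to a depth cutoff first.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entry: `parent` is the index of another entry that must be
// resolved first, or -1 if there is none.
struct Entry {
  int parent;
};

// Computes the chain length of every entry without recursion. An entry whose
// parent is unresolved stays on the worklist (it is deferred) and the parent
// is enqueued on top of it. Assumes the parent links are acyclic.
std::vector<int> resolve_chain_lengths(const std::vector<Entry>& entries) {
  std::vector<int> length(entries.size(), 0);  // 0 means "not resolved yet"
  std::vector<std::size_t> worklist;
  for (std::size_t i = 0; i < entries.size(); ++i) {
    worklist.push_back(i);
    while (!worklist.empty()) {
      const std::size_t cur = worklist.back();
      if (length[cur] != 0) {  // already resolved earlier
        worklist.pop_back();
        continue;
      }
      const int parent = entries[cur].parent;
      if (parent < 0) {  // no dependency
        length[cur] = 1;
        worklist.pop_back();
        continue;
      }
      const std::size_t p = static_cast<std::size_t>(parent);
      if (length[p] != 0) {  // dependency ready: finish this entry
        length[cur] = length[p] + 1;
        worklist.pop_back();
      } else {
        worklist.push_back(p);  // defer cur, handle its parent first
      }
    }
  }
  return length;
}

// Hypothetical check: entry 0 depends on 1, which depends on 2.
void check_resolve_chain_lengths() {
  const std::vector<Entry> chain = {{1}, {2}, {-1}};
  const std::vector<int> lengths = resolve_chain_lengths(chain);
  assert(lengths[0] == 3 && lengths[1] == 2 && lengths[2] == 1);
}
```

The depth of the dependency chain then only affects the size of a heap-allocated vector, not the call stack.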

Similar to the other PRs, this runs the `std::optional` test with PDB. Since we don't know that variables use typedefs, we check for the full name when testing PDB.


This LLVM IR https://godbolt.org/z/5bM1vrMY1

```llvm
define <4 x i32> @masked(<2 x double> %a, <4 x i32> %src, i8 noundef zeroext %mask) unnamed_addr #0 {
  %r = tail call <4 x i32> @llvm.x86.avx10.mask.vcvttpd2udqs.128(<2 x double> %a, <4 x i32> %src, i8 noundef %mask)
  ret <4 x i32> %r
}

define <4 x i32> @unmasked(<2 x double> %a) unnamed_addr #0 {
  %r = tail call <4 x i32> @llvm.x86.avx10.mask.vcvttpd2udqs.128(<2 x double> %a, <4 x i32> zeroinitializer, i8 noundef -1)
  ret <4 x i32> %r
}

declare <4 x i32> @llvm.x86.avx10.mask.vcvttpd2udqs.128(<2 x double>, <4 x i32>, i8) unnamed_addr

attributes #0 = { mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(none) uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" "target-features"="+avx10.2-512" }
```

produces

```asm
masked:                                 # @masked
        kmovd   k1, edi
        vcvttpd2dqs     xmm1 {k1}, xmm0
        vmovaps xmm0, xmm1
        ret
unmasked:                               # @unmasked
        vcvttpd2udqs    xmm0, xmm0
        ret
```

So, when a mask is used, somehow the signed version of this instruction is selected. I suspect this is a typo.

If the subtarget supports flat scratch SVS mode and there is no SGPR available to replace a frame index, convert a scratch instruction in SS form into SV form and replace the frame index with a scavenged VGPR. Resolves #155902 Co-authored-by: Matt Arsenault <matthew.arsenault@amd.com>

Removes the legacy HTML backend and replaces it with the Mustache backend.

… for spawning symbolizer (#170809) Due to a legacy incompatibility with `atos`, we were allocating a pty whenever we spawned the symbolizer. This is no longer necessary and we can use a regular ol' pipe.

This PR is split into two commits:
- The first removes the pty allocation and replaces it with a pipe. This relocates the `CreateTwoHighNumberedPipes` call to be common to the `posix_spawn` and `StartSubprocess` path.
- The second commit adds the `child_stdin_fd_` field to `SymbolizerProcess`, storing the read end of the stdin pipe. By holding on to this fd for the lifetime of the symbolizer, we are able to avoid getting SIGPIPE (which would occur when we write to a pipe whose read-end had been closed due to the death of the symbolizer).

This will be very close to solving #120915, but this PR is intentionally not touching the non-posix_spawn path.

rdar://165894284
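The effect of holding on to the read end can be reproduced in a small POSIX sketch. It only simulates the situation within one process (closing the original read end stands in for the symbolizer dying, and a `dup`'d descriptor stands in for `child_stdin_fd_`); the function names are hypothetical and this is not the sanitizer runtime's code.

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Writes one byte to a fresh pipe after closing the original read end.
// If keep_extra_reader is true, a duplicate of the read end stays open.
// Returns 0 on success or the errno value of the failed call.
int write_after_reader_closed(bool keep_extra_reader) {
  int fds[2];
  if (pipe(fds) != 0) return errno;
  const int extra = keep_extra_reader ? dup(fds[0]) : -1;
  close(fds[0]);  // simulates the reading process going away

  // Ignore SIGPIPE during the write so a failure shows up as EPIPE instead
  // of terminating the process.
  void (*old_handler)(int) = std::signal(SIGPIPE, SIG_IGN);
  const char byte = 'x';
  const int result = (write(fds[1], &byte, 1) == 1) ? 0 : errno;
  std::signal(SIGPIPE, old_handler);

  if (extra >= 0) close(extra);
  close(fds[1]);
  return result;
}

// Hypothetical check: the write only fails once no read end is left open.
void check_pipe_behaviour() {
  assert(write_after_reader_closed(true) == 0);
  assert(write_after_reader_closed(false) == EPIPE);
}
```

As long as any descriptor for the read end remains open, the kernel does not treat the pipe as broken, which is why keeping the fd for the lifetime of the symbolizer avoids the signal.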

Test case for the mis-compile mentioned in #166247 (comment) The issue is that we don't generate a runtime check even though it is required to vectorize.

Expose the HVXV81 abs, conversion, comparison, log2, negate and mixed subtract intrinsics so Clang can emit the new instructions.

… +1 out argument as a leak (#161633) Make RetainPtrCtorAdoptChecker recognize an assignment to an +1 out argument so that it won't emit a memory leak warning.

Treat a weak Objective-C property, ivar, member variable, and local variable as safe.

…pattern (#161019) Generalize the check for recognizing [[Obj alloc] init] to also recognize [allocObj() init]. We do this by utilizing isAllocInit function in RetainPtrCtorAdoptChecker.

When GeneratedRTChecks::create bails out due to exceeding the cost threshold, no runtime checks are generated and we must not proceed assuming checks have been generated. Mark the checks as never succeeding, to make sure we don't try to vectorize assuming the runtime checks hold. This fixes a case where we previously incorrectly vectorized assuming runtime checks had been generated when forcing vectorization via metadata. Fixes the mis-compile mentioned in #166247 (comment)

…ames. NFC. (#171645) Both `decomposeBitTestICmp` and `decomposeBitTest` have a parameter called `lookThroughTrunc`. This was spelled in full (i.e. `lookThroughTrunc`) in the header. However, in the implementation, it's written as `lookThruTrunc`. I opted to convert all instances of `lookThruTrunc` into `lookThroughTrunc` to reduce surprise while reading the code and for conformity.

The other change in this PR is the renaming of the wrapper around `decomposeBitTest()`. Even though it was a wrapper around `CmpInstAnalysis.h`'s `decomposeBitTest`, the function was called `decomposeBitTestICmp`. This is quite confusing because such a function _also_ exists in `CmpInstAnalysis.h`, but it is _not_ the one actually being used in `InstCombineAndOrXor.cpp`.

Add `f64:32:64` to the data layout for AIX, to indicate that doubles have a 32-bit ABI alignment and 64-bit preferred alignment. Clang was already taking this into account, but it was not reflected in LLVM's data layout. A notable effect of this change is that `double` loads/stores with 4 byte alignment are no longer considered "unaligned" and avoid the corresponding unaligned access legalization. I assume that this is correct/desired for AIX. (The codegen previously already relied on this in some places related to the call ABI simply by dint of assuming certain stack locations were 8 byte aligned, even though they were only actually 4 byte aligned.) Fixes #133599.

This patch tries to move all vl patterns and sd node patterns to RISCVInstrInfoVVLPatterns.td and RISCVInstrInfoVSDPatterns.td respectively. It removes the redefinition of pattern classes for zvfbfa and makes them easier to maintain and change. Note: this does not include intrinsic patterns. If we want to unify intrinsic patterns as well, we also need to move the pseudo instruction definitions of zvfbfa to RISCVInstrInfoVPseudos.td.

#171460)

…teExtInst` instead of `SPIRVRegularizer` (#170155) This patch consists of two parts:

* A first part that removes the scalar to vector promotion for built-ins in the `SPIRVRegularizer`;
* and a second part that implements the promotion for built-ins from scalar to vector in `generateExtInst`.

The implementation in `SPIRVRegularizer` had several issues:

* It rolled its own built-in pattern matching that was extremely permissive
* The compiler would crash if the built-in had a definition
* The compiler would crash if the built-in had no arguments
* The compiler would crash if there were more than 2 function definitions in the module.
* It'd be better if this was implemented as a module pass, where we iterate over the users of the function, instead of scanning the whole module for callers.

This patch does the scalar to vector promotion just before the `OpExtInst` is generated, without relying on the IR transformation. One change in the generated code from the previous implementation is that this version uses a single `OpCompositeConstruct` operation to convert the scalar into a vector. The old implementation inserted an element at the 0 position in an `undef` vector (using `OpCompositeInsert`), then copied that element for every vector element using `OpVectorShuffle`.

This patch also adds a test (`OpExtInst_vector_promotion_bug.ll`) that highlights an issue in the builtin pattern matching that we're using: our pattern matching doesn't consider the number of arguments, only the demangled name, first and last arguments (`min(int,int,int)` matches the same builtin as `min(int, int)`).

Before this patch, `insertelement/extractelement` with dynamic indices would fail to select with `-O0` for vector 32-bit element types with sizes 3, 5, 6 and 7, which did not map to a `SI_INDIRECT_SRC/DST` pattern. Other "weird" sizes bigger than 8 (like 13) are properly handled already. To solve this issue we add the missing patterns for the problematic sizes. Solves SWDEV-568862

…#171651) Allocators should be extremely cheap, if not free, to copy. Furthermore, we have requirements on allocator types that copies must compare equal, and that move and copy must be the same. Hence, taking an allocator by reference should not provide benefits beyond making a copy of it. However, taking the allocator by reference leads to complexity in __split_buffer, which can be removed if we stop using that pattern.
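A minimal sketch of the by-value pattern is shown below. `raw_buffer` is a hypothetical helper written for this illustration and is not libc++'s `__split_buffer`; it only shows that a helper can keep its own allocator copy and rely on the requirement that copies compare equal.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical buffer that stores its own copy of the allocator instead of a
// reference to the owner's allocator.
template <class T, class Alloc = std::allocator<T>>
class raw_buffer {
  using traits = std::allocator_traits<Alloc>;
  Alloc alloc_;  // by value: no lifetime tie to the caller's allocator
  std::size_t capacity_;
  T* data_;

 public:
  raw_buffer(std::size_t capacity, const Alloc& alloc)
      : alloc_(alloc),
        capacity_(capacity),
        data_(traits::allocate(alloc_, capacity)) {}
  raw_buffer(const raw_buffer&) = delete;
  raw_buffer& operator=(const raw_buffer&) = delete;
  ~raw_buffer() { traits::deallocate(alloc_, data_, capacity_); }

  Alloc get_allocator() const { return alloc_; }
  std::size_t capacity() const { return capacity_; }
};

// Hypothetical check: the stored copy compares equal to the original, so
// memory obtained through one can be released through the other.
void check_allocator_copy() {
  std::allocator<int> owner_alloc;
  raw_buffer<int> buffer(8, owner_alloc);
  assert(buffer.capacity() == 8);
  assert(buffer.get_allocator() == owner_alloc);
}
```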

Some [ideas for improvement](#169858 (review)) came up during review of recent changes to `isTRNMask`. This PR applies them also to `isZIPMask`, which is implemented almost identically.

This essentially reverts #100685 and fixes the bidirectional and random access specializations to be actually used.

```
Benchmark  old  new  Difference  % Difference
------------------------------------------------------------ -------------- -------------- ------------ --------------
rng::find_end(deque<int>)_(match_near_end)/1000  366.91  47.63  -319.28  -87.02%
rng::find_end(deque<int>)_(match_near_end)/1024  3273.31  35.42  -3237.89  -98.92%
rng::find_end(deque<int>)_(match_near_end)/8192  171608.41  285.04  -171323.38  -99.83%
rng::find_end(deque<int>)_(near_matches)/1000  31808.40  19214.35  -12594.05  -39.59%
rng::find_end(deque<int>)_(near_matches)/1024  37428.72  20773.87  -16654.85  -44.50%
rng::find_end(deque<int>)_(near_matches)/8192  1719468.34  1213967.45  -505500.89  -29.40%
rng::find_end(deque<int>)_(process_all)/1000  275.81  336.29  60.49  21.93%
rng::find_end(deque<int>)_(process_all)/1024  258.88  320.36  61.47  23.74%
rng::find_end(deque<int>)_(process_all)/1048576  277117.41  327640.37  50522.96  18.23%
rng::find_end(deque<int>)_(process_all)/8192  2166.36  2533.52  367.16  16.95%
rng::find_end(deque<int>)_(same_length)/1000  1280.06  362.53  -917.53  -71.68%
rng::find_end(deque<int>)_(same_length)/1024  1419.99  417.58  -1002.40  -70.59%
rng::find_end(deque<int>)_(same_length)/8192  11363.81  2870.63  -8493.18  -74.74%
rng::find_end(deque<int>)_(single_element)/1000  277.22  363.52  86.31  31.13%
rng::find_end(deque<int>)_(single_element)/1024  257.11  353.94  96.84  37.66%
rng::find_end(deque<int>)_(single_element)/8192  2059.02  2762.29  703.27  34.16%
rng::find_end(deque<int>,_pred)_(match_near_end)/1000  696.84  70.07  -626.77  -89.94%
rng::find_end(deque<int>,_pred)_(match_near_end)/1024  4774.82  70.75  -4704.07  -98.52%
rng::find_end(deque<int>,_pred)_(match_near_end)/8192  267492.37  549.57  -266942.81  -99.79%
rng::find_end(deque<int>,_pred)_(near_matches)/1000  39414.88  31070.43  -8344.46  -21.17%
rng::find_end(deque<int>,_pred)_(near_matches)/1024  38168.52  32362.18  -5806.34  -15.21%
rng::find_end(deque<int>,_pred)_(near_matches)/8192  2594717.16  1938056.79  -656660.38  -25.31%
rng::find_end(deque<int>,_pred)_(process_all)/1000  600.88  586.92  -13.96  -2.32%
rng::find_end(deque<int>,_pred)_(process_all)/1024  613.00  592.66  -20.33  -3.32%
rng::find_end(deque<int>,_pred)_(process_all)/1048576  600059.65  603440.98  3381.33  0.56%
rng::find_end(deque<int>,_pred)_(process_all)/8192  4850.32  4764.56  -85.76  -1.77%
rng::find_end(deque<int>,_pred)_(same_length)/1000  1514.90  700.34  -814.57  -53.77%
rng::find_end(deque<int>,_pred)_(same_length)/1024  1561.14  705.80  -855.34  -54.79%
rng::find_end(deque<int>,_pred)_(same_length)/8192  12544.84  5024.45  -7520.39  -59.95%
rng::find_end(deque<int>,_pred)_(single_element)/1000  603.79  650.63  46.84  7.76%
rng::find_end(deque<int>,_pred)_(single_element)/1024  614.93  656.43  41.50  6.75%
rng::find_end(deque<int>,_pred)_(single_element)/8192  4885.89  5225.71  339.82  6.96%
rng::find_end(forward_list<int>)_(match_near_end)/1000  770.05  769.32  -0.73  -0.09%
rng::find_end(forward_list<int>)_(match_near_end)/1024  4833.13  4733.24  -99.90  -2.07%
rng::find_end(forward_list<int>)_(match_near_end)/8192  259324.32  261066.84  1742.52  0.67%
rng::find_end(forward_list<int>)_(near_matches)/1000  38301.11  38608.61  307.50  0.80%
rng::find_end(forward_list<int>)_(near_matches)/1024  39370.54  39878.59  508.05  1.29%
rng::find_end(forward_list<int>)_(near_matches)/8192  2527338.50  2527722.47  383.97  0.02%
rng::find_end(forward_list<int>)_(process_all)/1000  713.63  720.74  7.11  1.00%
rng::find_end(forward_list<int>)_(process_all)/1024  727.81  731.60  3.79  0.52%
rng::find_end(forward_list<int>)_(process_all)/1048576  757728.47  766470.14  8741.67  1.15%
rng::find_end(forward_list<int>)_(process_all)/8192  5821.05  5817.80  -3.25  -0.06%
rng::find_end(forward_list<int>)_(same_length)/1000  1458.99  1454.50  -4.49  -0.31%
rng::find_end(forward_list<int>)_(same_length)/1024  1507.73  1515.78  8.05  0.53%
rng::find_end(forward_list<int>)_(same_length)/8192  20432.32  18658.93  -1773.39  -8.68%
rng::find_end(forward_list<int>)_(single_element)/1000  712.41  708.41  -4.00  -0.56%
rng::find_end(forward_list<int>)_(single_element)/1024  728.05  728.78  0.73  0.10%
rng::find_end(forward_list<int>)_(single_element)/8192  5795.48  6332.88  537.40  9.27%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/1000  843.67  846.77  3.10  0.37%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/1024  5267.90  5343.84  75.94  1.44%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/8192  280912.75  286141.10  5228.35  1.86%
rng::find_end(forward_list<int>,_pred)_(near_matches)/1000  43386.35  44489.38  1103.03  2.54%
rng::find_end(forward_list<int>,_pred)_(near_matches)/1024  44929.84  45608.55  678.71  1.51%
rng::find_end(forward_list<int>,_pred)_(near_matches)/8192  2723281.29  2765369.43  42088.14  1.55%
rng::find_end(forward_list<int>,_pred)_(process_all)/1000  763.13  763.85  0.72  0.09%
rng::find_end(forward_list<int>,_pred)_(process_all)/1024  796.98  773.40  -23.58  -2.96%
rng::find_end(forward_list<int>,_pred)_(process_all)/1048576  858071.76  846166.06  -11905.69  -1.39%
rng::find_end(forward_list<int>,_pred)_(process_all)/8192  6282.19  6244.95  -37.24  -0.59%
rng::find_end(forward_list<int>,_pred)_(same_length)/1000  1560.18  1583.03  22.86  1.47%
rng::find_end(forward_list<int>,_pred)_(same_length)/1024  1603.94  1612.22  8.28  0.52%
rng::find_end(forward_list<int>,_pred)_(same_length)/8192  16907.98  15638.35  -1269.63  -7.51%
rng::find_end(forward_list<int>,_pred)_(single_element)/1000  746.72  754.08  7.36  0.99%
rng::find_end(forward_list<int>,_pred)_(single_element)/1024  761.27  771.75  10.48  1.38%
rng::find_end(forward_list<int>,_pred)_(single_element)/8192  6166.83  6687.87  521.04  8.45%
rng::find_end(list<int>)_(match_near_end)/1000  793.99  67.06  -726.93  -91.55%
rng::find_end(list<int>)_(match_near_end)/1024  4682.12  79.82  -4602.31  -98.30%
rng::find_end(list<int>)_(match_near_end)/8192  263187.10  582.64  -262604.46  -99.78%
rng::find_end(list<int>)_(near_matches)/1000  38066.70  34687.59  -3379.11  -8.88%
rng::find_end(list<int>)_(near_matches)/1024  39721.77  36150.04  -3571.73  -8.99%
rng::find_end(list<int>)_(near_matches)/8192  2543369.85  2247297.03  -296072.82  -11.64%
rng::find_end(list<int>)_(process_all)/1000  716.89  726.65  9.76  1.36%
rng::find_end(list<int>)_(process_all)/1024  742.41  744.05  1.64  0.22%
rng::find_end(list<int>)_(process_all)/1048576  822449.08  873801.46  51352.38  6.24%
rng::find_end(list<int>)_(process_all)/8192  7704.49  9766.50  2062.02  26.76%
rng::find_end(list<int>)_(same_length)/1000  1508.19  710.90  -797.28  -52.86%
rng::find_end(list<int>)_(same_length)/1024  1540.23  735.35  -804.88  -52.26%
rng::find_end(list<int>)_(same_length)/8192  22786.44  10752.45  -12033.98  -52.81%
rng::find_end(list<int>)_(single_element)/1000  699.16  734.76  35.60  5.09%
rng::find_end(list<int>)_(single_element)/1024  717.09  750.91  33.82  4.72%
rng::find_end(list<int>)_(single_element)/8192  9502.45  10289.21  786.76  8.28%
rng::find_end(list<int>,_pred)_(match_near_end)/1000  841.98  83.86  -758.12  -90.04%
rng::find_end(list<int>,_pred)_(match_near_end)/1024  5463.71  76.95  -5386.76  -98.59%
rng::find_end(list<int>,_pred)_(match_near_end)/8192  287070.76  647.14  -286423.62  -99.77%
rng::find_end(list<int>,_pred)_(near_matches)/1000  43878.61  38899.00  -4979.61  -11.35%
rng::find_end(list<int>,_pred)_(near_matches)/1024  45672.50  40520.68  -5151.82  -11.28%
rng::find_end(list<int>,_pred)_(near_matches)/8192  2764800.76  2495879.89  -268920.87  -9.73%
rng::find_end(list<int>,_pred)_(process_all)/1000  764.46  774.78  10.32  1.35%
rng::find_end(list<int>,_pred)_(process_all)/1024  786.81  793.05  6.24  0.79%
rng::find_end(list<int>,_pred)_(process_all)/1048576  934166.34  954637.60  20471.26  2.19%
rng::find_end(list<int>,_pred)_(process_all)/8192  9509.24  10209.73  700.49  7.37%
rng::find_end(list<int>,_pred)_(same_length)/1000  1545.67  782.96  -762.71  -49.34%
rng::find_end(list<int>,_pred)_(same_length)/1024  1580.94  796.87  -784.08  -49.60%
rng::find_end(list<int>,_pred)_(same_length)/8192  21558.41  13370.92  -8187.49  -37.98%
rng::find_end(list<int>,_pred)_(single_element)/1000  766.49  762.81  -3.68  -0.48%
rng::find_end(list<int>,_pred)_(single_element)/1024  784.75  781.47  -3.28  -0.42%
rng::find_end(list<int>,_pred)_(single_element)/8192  9722.26  10399.11  676.85  6.96%
rng::find_end(vector<int>)_(match_near_end)/1000  267.82  25.34  -242.48  -90.54%
rng::find_end(vector<int>)_(match_near_end)/1024  2259.46  25.78  -2233.68  -98.86%
rng::find_end(vector<int>)_(match_near_end)/8192  119747.92  214.53  -119533.39  -99.82%
rng::find_end(vector<int>)_(near_matches)/1000  16913.73  14102.20  -2811.53  -16.62%
rng::find_end(vector<int>)_(near_matches)/1024  16097.97  14767.26  -1330.71  -8.27%
rng::find_end(vector<int>)_(near_matches)/8192  1102803.07  823463.30  -279339.78  -25.33%
rng::find_end(vector<int>)_(process_all)/1000  233.43  380.28  146.85  62.91%
rng::find_end(vector<int>)_(process_all)/1024  238.86  389.32  150.46  62.99%
rng::find_end(vector<int>)_(process_all)/1048576  269619.36  391698.75  122079.39  45.28%
rng::find_end(vector<int>)_(process_all)/8192  2011.46  3061.40  1049.94  52.20%
rng::find_end(vector<int>)_(same_length)/1000  632.19  253.50  -378.69  -59.90%
rng::find_end(vector<int>)_(same_length)/1024  556.53  254.87  -301.66  -54.20%
rng::find_end(vector<int>)_(same_length)/8192  4597.26  2095.57  -2501.68  -54.42%
rng::find_end(vector<int>)_(single_element)/1000  231.57  417.64  186.06  80.35%
rng::find_end(vector<int>)_(single_element)/1024  236.41  427.03  190.62  80.63%
rng::find_end(vector<int>)_(single_element)/8192  1918.95  3367.29  1448.33  75.48%
rng::find_end(vector<int>,_pred)_(match_near_end)/1000  581.49  52.67  -528.82  -90.94%
rng::find_end(vector<int>,_pred)_(match_near_end)/1024  3545.40  53.74  -3491.65  -98.48%
rng::find_end(vector<int>,_pred)_(match_near_end)/8192  190482.78  432.30  -190050.48  -99.77%
rng::find_end(vector<int>,_pred)_(near_matches)/1000  28878.24  24723.01  -4155.23  -14.39%
rng::find_end(vector<int>,_pred)_(near_matches)/1024  30035.85  25597.45  -4438.40  -14.78%
rng::find_end(vector<int>,_pred)_(near_matches)/8192  1858596.45  1584796.11  -273800.34  -14.73%
rng::find_end(vector<int>,_pred)_(process_all)/1000  518.92  813.46  294.53  56.76%
rng::find_end(vector<int>,_pred)_(process_all)/1024  531.17  710.20  179.03  33.70%
rng::find_end(vector<int>,_pred)_(process_all)/1048576  674064.13  905070.15  231006.01  34.27%
rng::find_end(vector<int>,_pred)_(process_all)/8192  4254.34  6372.76  2118.43  49.79%
rng::find_end(vector<int>,_pred)_(same_length)/1000  1106.96  526.23  -580.73  -52.46%
rng::find_end(vector<int>,_pred)_(same_length)/1024  1133.60  539.70  -593.90  -52.39%
rng::find_end(vector<int>,_pred)_(same_length)/8192  8988.10  4302.83  -4685.27  -52.13%
rng::find_end(vector<int>,_pred)_(single_element)/1000  528.11  523.69  -4.42  -0.84%
rng::find_end(vector<int>,_pred)_(single_element)/1024  539.58  838.49  298.91  55.40%
rng::find_end(vector<int>,_pred)_(single_element)/8192  4301.43  7313.22  3011.79  70.02%
std::find_end(deque<int>)_(match_near_end)/1000  347.82  38.56  -309.26  -88.91%
std::find_end(deque<int>)_(match_near_end)/1024  3340.80  34.54  -3306.27  -98.97%
std::find_end(deque<int>)_(match_near_end)/8192  171599.83  281.87  -171317.96  -99.84%
std::find_end(deque<int>)_(near_matches)/1000  29703.68  19712.27  -9991.41  -33.64%
std::find_end(deque<int>)_(near_matches)/1024  32312.41  20008.21  -12304.20  -38.08%
std::find_end(deque<int>)_(near_matches)/8192  1851286.99  1216112.34  -635174.65  -34.31%
std::find_end(deque<int>)_(process_all)/1000  256.69  315.96  59.27  23.09%
std::find_end(deque<int>)_(process_all)/1024  260.97  305.42  44.45  17.03%
std::find_end(deque<int>)_(process_all)/1048576  273310.08  309499.13  36189.05  13.24%
std::find_end(deque<int>)_(process_all)/8192  2071.33  2606.57  535.25  25.84%
std::find_end(deque<int>)_(same_length)/1000  1422.58  441.07  -981.51  -68.99%
std::find_end(deque<int>)_(same_length)/1024  1844.27  350.75  -1493.52  -80.98%
std::find_end(deque<int>)_(same_length)/8192  14681.69  2839.26  -11842.43  -80.66%
std::find_end(deque<int>)_(single_element)/1000  291.63  344.82  53.19  18.24%
std::find_end(deque<int>)_(single_element)/1024  257.97  330.19  72.21  27.99%
std::find_end(deque<int>)_(single_element)/8192  2220.10  2505.02  284.92  12.83%
std::find_end(deque<int>,_pred)_(match_near_end)/1000  694.70  69.60  -625.11  -89.98%
std::find_end(deque<int>,_pred)_(match_near_end)/1024  4735.45  71.12  -4664.33  -98.50%
std::find_end(deque<int>,_pred)_(match_near_end)/8192  267417.02  561.03  -266855.99  -99.79%
std::find_end(deque<int>,_pred)_(near_matches)/1000  42199.71  31597.49  -10602.22  -25.12%
std::find_end(deque<int>,_pred)_(near_matches)/1024  38007.49  32362.16  -5645.33  -14.85%
std::find_end(deque<int>,_pred)_(near_matches)/8192  2607708.49  1935799.88  -671908.60  -25.77%
std::find_end(deque<int>,_pred)_(process_all)/1000  599.65  552.71  -46.94  -7.83%
std::find_end(deque<int>,_pred)_(process_all)/1024  615.88  554.17  -61.71  -10.02%
std::find_end(deque<int>,_pred)_(process_all)/1048576  598471.63  599441.79  970.16  0.16%
std::find_end(deque<int>,_pred)_(process_all)/8192  4853.45  4394.20  -459.25  -9.46%
std::find_end(deque<int>,_pred)_(same_length)/1000  1511.68  797.64  -714.04  -47.23%
std::find_end(deque<int>,_pred)_(same_length)/1024  1568.63  810.85  -757.78  -48.31%
std::find_end(deque<int>,_pred)_(same_length)/8192  12609.34  5092.02  -7517.32  -59.62%
std::find_end(deque<int>,_pred)_(single_element)/1000  601.22  628.80  27.58  4.59%
std::find_end(deque<int>,_pred)_(single_element)/1024  613.25  627.15  13.89  2.27%
std::find_end(deque<int>,_pred)_(single_element)/8192  4823.85  4795.25  -28.60  -0.59%
std::find_end(forward_list<int>)_(match_near_end)/1000  762.64  769.74  7.10  0.93%
std::find_end(forward_list<int>)_(match_near_end)/1024  4767.93  4840.87  72.94  1.53%
std::find_end(forward_list<int>)_(match_near_end)/8192  260275.68  260835.21  559.53  0.21%
std::find_end(forward_list<int>)_(near_matches)/1000  38020.76  38197.53  176.77  0.46%
std::find_end(forward_list<int>)_(near_matches)/1024  39028.86  39333.38  304.51  0.78%
std::find_end(forward_list<int>)_(near_matches)/8192  2524921.48  2523470.32  -1451.16  -0.06%
std::find_end(forward_list<int>)_(process_all)/1000  699.95  699.93  -0.02  -0.00%
std::find_end(forward_list<int>)_(process_all)/1024  715.24  712.07  -3.17  -0.44%
std::find_end(forward_list<int>)_(process_all)/1048576  755926.33  756976.31  1049.98  0.14%
std::find_end(forward_list<int>)_(process_all)/8192  5696.72  5672.92  -23.81  -0.42%
std::find_end(forward_list<int>)_(same_length)/1000  1485.84  1480.19  -5.65  -0.38%
std::find_end(forward_list<int>)_(same_length)/1024  1493.62  1516.95  23.33  1.56%
std::find_end(forward_list<int>)_(same_length)/8192  16833.75  13551.42  -3282.33  -19.50%
std::find_end(forward_list<int>)_(single_element)/1000  688.87  675.02  -13.85  -2.01%
std::find_end(forward_list<int>)_(single_element)/1024  688.89  691.59  2.69  0.39%
std::find_end(forward_list<int>)_(single_element)/8192  5735.87  6748.85  1012.98  17.66%
std::find_end(forward_list<int>,_pred)_(match_near_end)/1000  836.01  853.28  17.27  2.07%
std::find_end(forward_list<int>,_pred)_(match_near_end)/1024  5259.92  5299.30  39.39  0.75%
std::find_end(forward_list<int>,_pred)_(match_near_end)/8192  279479.85  285593.49  6113.65  2.19%
std::find_end(forward_list<int>,_pred)_(near_matches)/1000  42577.60  44550.54  1972.94  4.63%
std::find_end(forward_list<int>,_pred)_(near_matches)/1024  44374.19  45697.95  1323.76  2.98%
std::find_end(forward_list<int>,_pred)_(near_matches)/8192  2711138.03  2742988.33  31850.30  1.17%
std::find_end(forward_list<int>,_pred)_(process_all)/1000  752.03  762.75  10.72  1.43%
std::find_end(forward_list<int>,_pred)_(process_all)/1024  767.04  781.48  14.44  1.88%
std::find_end(forward_list<int>,_pred)_(process_all)/1048576  843453.35  861838.82  18385.47  2.18%
std::find_end(forward_list<int>,_pred)_(process_all)/8192  6241.65  6308.05  66.40  1.06%
std::find_end(forward_list<int>,_pred)_(same_length)/1000  2384.18  1589.21  -794.97  -33.34%
std::find_end(forward_list<int>,_pred)_(same_length)/1024  2428.97  1617.17  -811.80  -33.42%
std::find_end(forward_list<int>,_pred)_(same_length)/8192  16961.22  14972.86  -1988.36  -11.72%
std::find_end(forward_list<int>,_pred)_(single_element)/1000  743.31  752.77  9.47  1.27%
std::find_end(forward_list<int>,_pred)_(single_element)/1024  763.62  768.70  5.08  0.67%
std::find_end(forward_list<int>,_pred)_(single_element)/8192  6189.73  6934.04  744.31  12.02%
std::find_end(list<int>)_(match_near_end)/1000  773.76  76.41  -697.35  -90.12%
std::find_end(list<int>)_(match_near_end)/1024  4715.36  69.09  -4646.27  -98.53%
std::find_end(list<int>)_(match_near_end)/8192  264864.51  584.19  -264280.32  -99.78%
std::find_end(list<int>)_(near_matches)/1000  37650.69  35233.45  -2417.24  -6.42%
std::find_end(list<int>)_(near_matches)/1024  39239.25  36699.13  -2540.13  -6.47%
std::find_end(list<int>)_(near_matches)/8192  2543446.71  2252625.27  -290821.44  -11.43%
std::find_end(list<int>)_(process_all)/1000  718.00  724.59  6.59  0.92%
std::find_end(list<int>)_(process_all)/1024  735.14  746.70  11.57  1.57%
std::find_end(list<int>)_(process_all)/1048576  812620.48  869606.78  56986.30  7.01%
std::find_end(list<int>)_(process_all)/8192  8217.98  8462.53  244.55  2.98%
std::find_end(list<int>)_(same_length)/1000  1500.85  716.45  -784.39  -52.26%
std::find_end(list<int>)_(same_length)/1024  1534.13  736.62  -797.51  -51.98%
std::find_end(list<int>)_(same_length)/8192  20274.06  10621.82  -9652.24  -47.61%
std::find_end(list<int>)_(single_element)/1000  717.05  725.64  8.60  1.20%
std::find_end(list<int>)_(single_element)/1024  732.87  742.44  9.57  1.31%
std::find_end(list<int>)_(single_element)/8192  9835.11  11896.39  2061.28  20.96%
std::find_end(list<int>,_pred)_(match_near_end)/1000  845.46  75.09  -770.37  -91.12%
std::find_end(list<int>,_pred)_(match_near_end)/1024  5301.60  77.14  -5224.46  -98.54%
std::find_end(list<int>,_pred)_(match_near_end)/8192  281976.13  648.87  -281327.25  -99.77%
std::find_end(list<int>,_pred)_(near_matches)/1000  44076.98  39576.32  -4500.67  -10.21%
std::find_end(list<int>,_pred)_(near_matches)/1024  45531.64  41020.11  -4511.54  -9.91%
std::find_end(list<int>,_pred)_(near_matches)/8192  2756383.66  2503085.29  -253298.37  -9.19%
std::find_end(list<int>,_pred)_(process_all)/1000  766.06  764.48  -1.58  -0.21%
std::find_end(list<int>,_pred)_(process_all)/1024  780.35  799.51  19.15  2.45%
std::find_end(list<int>,_pred)_(process_all)/1048576  894643.71  898947.94  4304.24  0.48%
std::find_end(list<int>,_pred)_(process_all)/8192  8436.41  9977.74  1541.33  18.27%
std::find_end(list<int>,_pred)_(same_length)/1000  1545.22  784.29  -760.92  -49.24%
std::find_end(list<int>,_pred)_(same_length)/1024  1583.27  808.52  -774.74  -48.93%
std::find_end(list<int>,_pred)_(same_length)/8192  21850.99  10896.50  -10954.48  -50.13%
std::find_end(list<int>,_pred)_(single_element)/1000  752.03  755.00  2.97  0.39%
std::find_end(list<int>,_pred)_(single_element)/1024  774.22  784.14  9.92  1.28%
std::find_end(list<int>,_pred)_(single_element)/8192  10219.43  10396.49  177.05  1.73%
std::find_end(vector<int>)_(match_near_end)/1000  277.37  28.45  -248.91  -89.74%
std::find_end(vector<int>)_(match_near_end)/1024  2247.56  25.80  -2221.76  -98.85%
std::find_end(vector<int>)_(match_near_end)/8192  119785.10  212.44  -119572.66  -99.82%
std::find_end(vector<int>)_(near_matches)/1000  16351.34  14073.13  -2278.21  -13.93%
std::find_end(vector<int>)_(near_matches)/1024  16656.33  14654.36  -2001.97  -12.02%
std::find_end(vector<int>)_(near_matches)/8192  1181392.88  828918.96  -352473.91  -29.84%
std::find_end(vector<int>)_(process_all)/1000  231.14  235.80  4.66  2.01%
std::find_end(vector<int>)_(process_all)/1024  235.87  232.06  -3.81  -1.61%
std::find_end(vector<int>)_(process_all)/1048576  239922.25  238229.38  -1692.87  -0.71%
std::find_end(vector<int>)_(process_all)/8192  1837.43  1802.25  -35.19  -1.91%
std::find_end(vector<int>)_(same_length)/1000  632.59  252.80  -379.79  -60.04%
std::find_end(vector<int>)_(same_length)/1024  524.51  257.58  -266.94  -50.89%
std::find_end(vector<int>)_(same_length)/8192  5159.01  2090.12  -3068.89  -59.49%
std::find_end(vector<int>)_(single_element)/1000  229.56  250.47  20.91  9.11%
std::find_end(vector<int>)_(single_element)/1024  234.86  252.18  17.32  7.37%
std::find_end(vector<int>)_(single_element)/8192  1825.74  1981.90  156.16  8.55%
std::find_end(vector<int>,_pred)_(match_near_end)/1000  574.17  52.98  -521.19  -90.77%
std::find_end(vector<int>,_pred)_(match_near_end)/1024  3525.35  54.03  -3471.32  -98.47%
std::find_end(vector<int>,_pred)_(match_near_end)/8192  190155.81  423.41  -189732.40  -99.78%
std::find_end(vector<int>,_pred)_(near_matches)/1000  28541.98  24598.37  -3943.61  -13.82%
std::find_end(vector<int>,_pred)_(near_matches)/1024  29696.55  25675.27  -4021.28  -13.54%
std::find_end(vector<int>,_pred)_(near_matches)/8192  1846970.41  1596191.84  -250778.57  -13.58%
std::find_end(vector<int>,_pred)_(process_all)/1000  519.71  592.14  72.43  13.94%
std::find_end(vector<int>,_pred)_(process_all)/1024  529.74  491.07  -38.67  -7.30%
std::find_end(vector<int>,_pred)_(process_all)/1048576  631923.41  643729.57  11806.16  1.87%
std::find_end(vector<int>,_pred)_(process_all)/8192  4215.05  3909.30  -305.75  -7.25%
std::find_end(vector<int>,_pred)_(same_length)/1000  1095.46  524.99  -570.47  -52.08%
std::find_end(vector<int>,_pred)_(same_length)/1024  1117.95  537.65  -580.31  -51.91%
std::find_end(vector<int>,_pred)_(same_length)/8192  8923.95  4307.13  -4616.83  -51.74%
std::find_end(vector<int>,_pred)_(single_element)/1000  516.52  656.32  139.80  27.07%
std::find_end(vector<int>,_pred)_(single_element)/1024  528.82  673.72  144.90  27.40%
std::find_end(vector<int>,_pred)_(single_element)/8192  4210.37  5529.52  1319.15  31.33%
Geomean  6995.43  3440.97  -3554.46  -50.81%
```
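The large wins in the `match_near_end` rows are what one would expect from searching backward when the iterators allow it. A minimal sketch of that idea for bidirectional iterators follows; it runs `std::search` over reverse iterators and is an illustration only, not libc++'s implementation. The name `find_end_backward` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Finds the last occurrence of [s_first, s_last) in [first, last) by running
// a forward search over the reversed ranges. Returns `last` if there is no
// match or the needle is empty, like std::find_end.
template <class BidirIt1, class BidirIt2>
BidirIt1 find_end_backward(BidirIt1 first, BidirIt1 last,
                           BidirIt2 s_first, BidirIt2 s_last) {
  if (s_first == s_last) return last;
  using Rev1 = std::reverse_iterator<BidirIt1>;
  using Rev2 = std::reverse_iterator<BidirIt2>;
  const Rev1 rend(first);
  const Rev1 hit = std::search(Rev1(last), rend, Rev2(s_last), Rev2(s_first));
  if (hit == rend) return last;
  // hit.base() is one past the final element of the forward match, so step
  // back by the needle length to reach the start of the match.
  return std::next(hit.base(), -std::distance(s_first, s_last));
}

// Hypothetical check on plain arrays (pointers are bidirectional iterators).
void check_find_end_backward() {
  const int haystack[] = {1, 2, 3, 1, 2, 3, 4};
  const int needle[] = {1, 2, 3};
  const int missing[] = {9};
  assert(find_end_backward(std::begin(haystack), std::end(haystack),
                           std::begin(needle), std::end(needle)) ==
         haystack + 3);
  assert(find_end_backward(std::begin(haystack), std::end(haystack),
                           std::begin(missing), std::end(missing)) ==
         std::end(haystack));
}
```

A match close to the end is found after inspecting only a few elements, whereas a forward scan has to walk the whole range and remember the last hit.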

…iginally legal f64 values that we can store directly. (#171602) Based off feedback from #171478

…1637) They were using the wrong scheduler resource. They're also missing from the optimisation guides, but WriteLD should be closer at least.

…169914) This is technically ABI breaking, since `is_trivial` and `is_trivially_default_constructible` now return different results. However, I don't think that's a significant issue, since `allocator` is almost always used in classes which own memory, making them non-trivial anyways.

…9413) We've seen in quite a few cases while optimizing `__tree`'s copy construction that `_DetachedTreeCache` is actually quite slow and not necessarily an optimization at all. This patch removes the code, since it's now only used by `operator=(initializer_list)`, which should be quite cold code. We might look into actually optimizing it again in the future, but I doubt an optimization will be small enough compared to the likely speedup in real-world code this would give.

…165160) This removes a bit of code duplication and might simplify future segmented iterator optimizations.

Adding Annotation Inference in Lifetime Analysis. This PR implicitly adds lifetime bound annotations to the AST, which are then used by functions parsed later to detect UARs etc.

Example:

```cpp
#include <string>
#include <string_view>

std::string_view f1(std::string_view a) {
  return a;
}

std::string_view f2(std::string_view a) {
  return f1(a);
}

std::string_view ff(std::string_view a) {
  std::string stack = "something on stack";
  return f2(stack); // warning: address of stack memory is returned
}
```

Note:

1. We only add lifetime bound annotations to the functions being analyzed currently.
2. Currently, both annotation suggestion and inference work simultaneously. This can be modified based on requirements.
3. The current approach works given that functions are already present in the correct order (callee-before-caller). For less ideal cases, we can create a CallGraph prior to calling the analysis. This can be done in the next PR.

Depends upon #170900

Re-land #169544

Previously we were less specific for POINTER/TARGET: encoding that they could alias with (almost) anything. In the new system, the "target data" tree is now a sibling of the other trees (e.g. "global data"). POINTER variables go at the root of the "target data" tree, whereas TARGET variables get their own nodes under that tree.

For example,

```
integer, pointer :: ip
real, pointer :: rp
integer, target :: it
integer, target :: it2(:)
real, target :: rt
integer :: i
real :: r
```

- `ip` and `rp` may alias with any variable except `i` and `r`.
- `it`, `it2`, and `rt` may alias only with `ip` or `rp`.
- `i` and `r` cannot alias with any other variable.

Fortran 2023 15.5.2.14 gives restrictions on entities associated with dummy arguments. These do not allow non-target globals to be modified through dummy arguments and therefore I don't think we need to make all globals alias with dummy arguments.

I haven't implemented it in this patch, but I wonder whether it is ever possible for `ip` to alias with `rt`.

While I was updating the tests I fixed up some tests that still assumed that local alloc tbaa wasn't the default.

Cray pointers/pointees are (optionally) modelled as aliasing with all non-descriptor data. This is not enabled by default.

I found no functional regressions in the gfortran test suite.
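As a rough illustration of the three aliasing rules listed above, here is a minimal C++ sketch. It is not the flang implementation: `VarKind`, `mayAlias`, and `checkAliasExamples` are hypothetical names, the model only covers pairs of distinct variables, and it ignores element types entirely (so it answers "may alias" for `ip` and `rt`, the case the commit message leaves open).

```cpp
#include <cassert>

// Hypothetical classification of a Fortran variable, for illustration only.
enum class VarKind { Pointer, Target, Plain };

// Returns true if two distinct variables of the given kinds may alias
// under the rules stated in the commit message.
inline bool mayAlias(VarKind a, VarKind b) {
  // Variables with neither POINTER nor TARGET alias with nothing else.
  if (a == VarKind::Plain || b == VarKind::Plain)
    return false;
  // Two distinct TARGET variables do not alias with each other.
  if (a == VarKind::Target && b == VarKind::Target)
    return false;
  // Otherwise at least one side is a POINTER and the other is POINTER or TARGET.
  return true;
}

// Small usage example mirroring ip, it, and i from the declarations above.
inline void checkAliasExamples() {
  assert(mayAlias(VarKind::Pointer, VarKind::Target));  // ip vs it
  assert(mayAlias(VarKind::Pointer, VarKind::Pointer)); // ip vs rp
  assert(!mayAlias(VarKind::Target, VarKind::Target));  // it vs rt
  assert(!mayAlias(VarKind::Pointer, VarKind::Plain));  // ip vs i
}
```

The relation is symmetric by construction, which matches how the bullets are phrased.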

…0323) (#171787)

```
Step 7 (test-check-all) failure: Test just built components: check-all completed (failure)
******************** TEST 'LLVM :: CodeGen/AMDGPU/insert_vector_dynelt.ll' FAILED ********************
Exit Code: 1
Command Output (stdout):
--
# RUN: at line 2
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# RUN: at line 3
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck --check-prefixes=GCN-O0 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji
# .---command stderr------------
# |
# | # After Instruction Selection
# | # Machine code for function insert_dyn_i32_6: IsSSA, TracksLiveness
# | Function Live Ins: $sgpr16 in %8, $sgpr17 in %9, $sgpr18 in %10, $sgpr19 in %11, $sgpr20 in %12, $sgpr21 in %13, $vgpr0 in %14, $vgpr1 in %15
# |
# | bb.0 (%ir-block.0):
# | successors: %bb.1(0x80000000); %bb.1(100.00%)
# | liveins: $sgpr16, $sgpr17, $sgpr18, $sgpr19, $sgpr20, $sgpr21, $vgpr0, $vgpr1
# | %15:vgpr_32 = COPY $vgpr1
# | %14:vgpr_32 = COPY $vgpr0
# | %13:sgpr_32 = COPY $sgpr21
# | %12:sgpr_32 = COPY $sgpr20
# | %11:sgpr_32 = COPY $sgpr19
# | %10:sgpr_32 = COPY $sgpr18
# | %9:sgpr_32 = COPY $sgpr17
# | %8:sgpr_32 = COPY $sgpr16
# | %17:sgpr_192 = REG_SEQUENCE %8:sgpr_32, %subreg.sub0, %9:sgpr_32, %subreg.sub1, %10:sgpr_32, %subreg.sub2, %11:sgpr_32, %subreg.sub3, %12:sgpr_32, %subreg.sub4, %13:sgpr_32, %subreg.sub5
# | %16:sgpr_192 = COPY %17:sgpr_192
# | %19:vreg_192 = COPY %17:sgpr_192
# | %28:sreg_64_xexec = IMPLICIT_DEF
# | %27:sreg_64_xexec = S_MOV_B64 $exec
# |
# | bb.1:
# | ; predecessors: %bb.1, %bb.0
# | successors: %bb.1(0x40000000), %bb.3(0x40000000); %bb.1(50.00%), %bb.3(50.00%)
# |
# | %26:vreg_192 = PHI %19:vreg_192, %bb.0, %18:vreg_192, %bb.1
# | %29:sreg_64 = PHI %28:sreg_64_xexec, %bb.0, %30:sreg_64, %bb.1
# | %31:sreg_32_xm0 = V_READFIRSTLANE_B32 %14:vgpr_32, implicit $exec
# | %32:sreg_64 = V_CMP_EQ_U32_e64 %31:sreg_32_xm0, %14:vgpr_32, implicit $exec
# | %30:sreg_64 = S_AND_SAVEEXEC_B64 killed %32:sreg_64, implicit-def $exec, implicit-def $scc, implicit $exec
# | $m0 = COPY killed %31:sreg_32_xm0
# | %18:vreg_192 = V_INDIRECT_REG_WRITE_MOVREL_B32_V8 %26:vreg_192(tied-def 0), %15:vgpr_32, 3, implicit $m0, implicit $exec
# | $exec = S_XOR_B64_term $exec, %30:sreg_64, implicit-def $scc
# | S_CBRANCH_EXECNZ %bb.1, implicit $exec
# |
# | bb.3:
```

This reverts commit 15df9e7.

…1791)

Currently fmul is not reassociated unless it has nsz, although this should be unnecessary.

…171158) Add additional bound for the induction variable of the scf.forall such that: %iv <= %lower_bound + (%trip_count - 1) * step Same as #126426 but for scf.forall loop

The patch updates the lowering of `id` based pmevent also to intrinsics. The mask is simply (1 << event-id). Signed-off-by: Durgadoss R <durgadossr@nvidia.com>

This function contains most of the logic for BTI:
- It takes the BasicBlock and the instruction used to jump to it.
- Then it checks if the first non-pseudo instruction is a sufficient landing pad for the used call.
- If not, it generates the correct BTI instruction.

Also introduce the isCallCoveredByBTI helper to simplify the logic.

nsz can only change the behavior of the sign bit. The sign bit for fmul can be implemented as xor, which is associative. DAGCombiner already reassociates the multiply by 2 constants without nsz. Fixes #64967

This patch adds TLS support for SystemZ on top of orc-runtime support. A separate orc-runtime support PR, #171062, has been created from the earlier TLS support PR #170706. See the conversations in #170706. --------- Co-authored-by: anoopkg6 <anoopkg6@github.com>

#171797) This patch fixes toolchain-msvc.test on Windows ARM64 hosts running in a native ARM64 environment via vcvarsarm64.bat. Our lab buildbot recently switched from the cross vcvarsamd64_arm64.bat environment to the native vcvarsarm64.bat. This patch updates the FileCheck patterns to also allow HostARM64 and arm64 PATH entries.

Changes:
-> Extend host regex to match HostARM64 (case-insensitive).
-> Allow arm64 in PATH tail.
-> Apply same fix in both 32-bit and 64-bit sections.

This patch adds `ReduceOp::verifyRegions` to ensure that the number of reduction regions equals the number of operands (`getReductions().size() == getOperands().size()`). Additionally, `ParallelOp::verify` is updated to gracefully handle cases where the number of reduce operands differs from the initial values, preventing verification logic crashes and relying on `ReduceOp` to report structural inconsistencies. Fixes: #118768

…171803)

…171616) If the scalar integer sources are freely transferable to the FPU, then perform the bitlogic op as a SSE/AVX operation. Uses the mayFoldIntoVector helper added at #171589

pull bot locked and limited conversation to collaborators Dec 1, 2025

pull bot added the ⤵️ pull label Dec 1, 2025

Nerixyz and others added 28 commits December 9, 2025 18:06

[AMDGPU] Modifies builtin def to take _Float16('x') for both HIP/C++ …

04a5ee6

…and for OpenCL (#167652) For the extended image instruction builtins amdgcn_image_sample_*_/gather4_*, use 'x' in the builtin def so that they take _Float16 for both HIP/C++ and OpenCL.

[CIR][CIRGen][Builtin][X86] Masked compress Intrinsics (#169582)

fa60765

Added masked compress builtin in CIR. Note: This is my first PR to llvm. Looking forward to corrections --------- Co-authored-by: bhuvan1527 <balabhuvanvarma@gmail.com>

[lldb][docs] Fix Visual Studio link in build doc

ab8208f

Fixes warning: build.rst:107: WARNING: 'any' reference target not found: https://visualstudio.microsoft.com

[gn build] Port 1bada0a

0a39d1f

[ADT] BitVector: give subsetOf(RHS) name to !test(RHS) (NFC) (#17…

c0eac77

…0875) Define `LHS.subsetOf(RHS)` as a more descriptive name for `!LHS.test(RHS)` and update the existing callers to use that name. Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>
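The equivalence behind the new name can be sketched with `std::bitset` instead of `llvm::BitVector`. This assumes that `test(RHS)` reports whether any bit is set in the left-hand side but not in `RHS`, which is what makes its negation a subset check. The helper names `anyBitOutside` and `checkSubsetExamples` are hypothetical and only mirror that behaviour.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Stand-in for the existing query: true if some bit is set in lhs but not in rhs.
template <std::size_t N>
bool anyBitOutside(const std::bitset<N> &lhs, const std::bitset<N> &rhs) {
  return (lhs & ~rhs).any();
}

// The more descriptive spelling: lhs is a subset of rhs when no such bit exists.
template <std::size_t N>
bool subsetOf(const std::bitset<N> &lhs, const std::bitset<N> &rhs) {
  return !anyBitOutside(lhs, rhs);
}

// Small usage example.
inline void checkSubsetExamples() {
  const std::bitset<8> inner("00000110");
  const std::bitset<8> outer("00001110");
  assert(subsetOf(inner, outer));
  assert(!subsetOf(outer, inner));
  assert(subsetOf(std::bitset<8>(), inner)); // the empty set is a subset of any set
}
```

Reading `inner.subsetOf(outer)` at a call site states the intent directly, whereas `!inner.test(outer)` requires remembering what the negated query means.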

[LLDB] Run MSVC STL (forward-)list test with PDB (#166953)

cd805a7

Since PDB doesn't have template information, we need to get the element type from somewhere else. I'm using the type of `_Myval` in a list node, which holds the element type.

[MLIR] Apply clang-tidy fixes for readability-identifier-naming in In…

d796d73

…liner.cpp (NFC)

[X86] bitcnt-big-integer.ll - add additional test coverage where the …

00bccfc

…source values are bitcast from vectors (#171481)

[flang][OpenMP] Use DirId() instead of DirName().v, NFC (#171484)

cc25ac4

[LLDB] Run MSVC STL optional test with PDB (#171486)

719826d

Similar to the other PRs, this runs the `std::optional` test with PDB. Since we don't know that variables use typedefs, we check for the full name when testing PDB.

[clang-doc] Replace HTML generation with Mustache backend (#170199)

24117f7

Removes the legacy HTML backend and replaces it with the Mustache backend.

[gn build] Port 24117f7

4bff9fd

[LV] Add test with threshold=0 and metadata forcing vectorization.

7a5e2c9

Test case for the mis-compile mentioned in #166247 (comment). The issue is that we don't generate a runtime check even though it is required to vectorize.

[Hexagon] Add HVX V81 builtins (#170680)

b3d05e6

Expose the HVXV81 abs, conversion, comparison, log2, negate and mixed subtract intrinsics so Clang can emit the new instructions.

[alpha.webkit.RetainPtrCtorAdoptChecker] Don't treat assignment to an…

0eb00ef

… +1 out argument as a leak (#161633) Make RetainPtrCtorAdoptChecker recognize an assignment to an +1 out argument so that it won't emit a memory leak warning.

[WebKit checkers] Treat a weak property / variable as safe (#163689)

f9326ff

Treat a weak Objective-C property, ivar, member variable, and local variable as safe.

[alpha.webkit.UnretainedCallArgsChecker] Recognize [allocObj() init] …

06f0758

…pattern (#161019) Generalize the check for recognizing [[Obj alloc] init] to also recognize [allocObj() init]. We do this by utilizing isAllocInit function in RetainPtrCtorAdoptChecker.

wermos and others added 30 commits December 11, 2025 07:40

[libclc] use clc functions in clspv/shared/vstore_half.cl (#171770)

aa31efc

[RISCV][llvm] Support PSRA, PSRAI, PSRL, PSRLI codegen for P extension (

794551d

#171460)

[AArch64][NFC] Add isTRNMask improvements to isZIPMask (#171532)

db06ebb

Some ideas for improvement came up during review (#169858) of recent changes to `isTRNMask`. This PR applies them also to `isZIPMask`, which is implemented almost identically.

[X86] LowerATOMIC_STORE - on 32-bit targets see if i64 values were or…

6573f62

…iginally legal f64 values that we can store directly. (#171602) Based off feedback from #171478

[AArch64] Fix scheduling info for Armv8.4-a LDAPUR* instructions (#17…

4d335cb

…1637) They were using the wrong scheduler resource. They're also missing from the optimisation guides, but WriteLD should be closer at least.

[libc++] Merge the segmented iterator code for {copy,move}_backward (#…

d15ff59

…165160) This removes a bit of code duplication and might simplify future segmented iterator optimizations.

Revert "[lldb] fix failing tests due to CI diagnostics rendering (#17…

0b522d9

…1791)

InstCombine: Add baseline test for #64697 fmul reassociation (#171725)

4f5071f

Currently fmul is not reassociated unless it has nsz, although this should be unnecessary.

[mlir][scf] Add value bound for computed upper bound of forall loop (#…

f8d1f53

…171158) Add additional bound for the induction variable of the scf.forall such that: %iv <= %lower_bound + (%trip_count - 1) * step Same as #126426 but for scf.forall loop
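The bound above can be checked by hand with a small C++ sketch. It is plain integer arithmetic, not the MLIR value-bounds interface. `computeTripCount` and `maxInductionValue` are hypothetical helpers, and the half-open range `[lowerBound, upperBound)` with a positive step is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of iterations over [lowerBound, upperBound)
// with a positive step (ceiling division).
inline int64_t computeTripCount(int64_t lowerBound, int64_t upperBound,
                                int64_t step) {
  assert(step >= 1);
  return upperBound > lowerBound ? (upperBound - lowerBound + step - 1) / step
                                 : 0;
}

// Largest value the induction variable takes:
// lowerBound + (tripCount - 1) * step.
inline int64_t maxInductionValue(int64_t lowerBound, int64_t tripCount,
                                 int64_t step) {
  assert(tripCount >= 1 && step >= 1);
  return lowerBound + (tripCount - 1) * step;
}
```

For a loop from 0 to 10 with step 4, the iterations are 0, 4, and 8, so the computed bound is 8. That is tighter than the bound of 9 obtained from the upper bound alone.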

[MLIR][NVVM] Update PMEvent lowering to intrinsics (#171649)

8af88a4

The patch updates the lowering of `id` based pmevent also to intrinsics. The mask is simply (1 << event-id). Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
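The id-to-mask mapping described above is a single shift, sketched here in C++. The function name `pmEventMask` is hypothetical, and the 16-bit mask width with event ids 0 through 15 is an assumption based on the PTX `pmevent` instruction rather than something stated in this commit message.

```cpp
#include <cassert>
#include <cstdint>

// Converts an event id into the one-hot mask (1 << event-id).
// Assumes ids in the range 0..15 so the result fits a 16-bit mask.
inline uint16_t pmEventMask(unsigned eventId) {
  assert(eventId < 16);
  return static_cast<uint16_t>(1u << eventId);
}
```

With this mapping, event id 3 corresponds to mask `0x0008`, so an id-based event is a mask-based event with exactly one bit set.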

IR: Stop requiring nsz to reassociate fmul (#171726)

481ce81

nsz can only change the behavior of the sign bit. The sign bit for fmul can be implemented as xor, which is associative. DAGCombiner already reassociates the multiply by 2 constants without nsz. Fixes #64967
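The sign-bit argument can be illustrated with ordinary IEEE-754 doubles in C++. This is a sketch for intuition, not LLVM code. The function names are hypothetical, NaN results are excluded because their sign bit is not meaningful here, and default (non fast-math) compilation is assumed.

```cpp
#include <cassert>
#include <cmath>

// Sign bit of a non-NaN product, predicted as the XOR of the operand sign bits.
inline bool predictedProductSign(double a, double b) {
  return std::signbit(a) != std::signbit(b);
}

// XOR is associative, so the predicted sign of a*b*c does not depend on grouping.
inline bool predictedTripleSign(double a, double b, double c) {
  return (std::signbit(a) != std::signbit(b)) != std::signbit(c);
}

// Compares the sign bits actually produced by the two association orders.
inline bool signsAgree(double a, double b, double c) {
  const double left = (a * b) * c;
  const double right = a * (b * c);
  assert(!std::isnan(left) && !std::isnan(right)); // NaN signs are out of scope
  return std::signbit(left) == std::signbit(right);
}
```

Reassociation can still change the magnitude, for example when an intermediate product underflows to zero in one grouping but not the other. The sign bit of the result is the same either way, which is the property `nsz` would otherwise be guarding.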

[AsmPrinter][NFC] Reuse Target Triple variable (#171612)

9b6b52b

[X86] vector-compare-results.ll - regenerate VPTERNLOG asm comments (#…

9bcba9d

…171803)

[X86] Allow handling of i128/256/512 AND/OR/XOR bitlogic on the FPU (#…

ac2291d

…171616) If the scalar integer sources are freely transferable to the FPU, then perform the bitlogic op as a SSE/AVX operation. Uses the mayFoldIntoVector helper added at #171589


pull bot commented Dec 1, 2025


20 participants
