@pull pull bot commented Dec 1, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


@pull pull bot locked and limited conversation to collaborators Dec 1, 2025
@pull pull bot added the ⤵️ pull label Dec 1, 2025
Nerixyz and others added 28 commits December 9, 2025 18:06
Runs the `std::shared/unique_ptr` tests with PDB with two changes:

- PDB uses the "full" name, so `std::string` is `std::basic_string<char,
std::char_traits<char>, std::allocator<char>>`
- The type of the pointer inside the shared/unique_ptr isn't the
`element_type` typedef
This change introduces a new IR pass in the llc pipeline for NVPTX that
transforms sequences of FMUL followed by FADD or FSUB into a single FMA
instruction.

Currently, all FMA folding for NVPTX occurs at the DAGCombine stage,
which is too late for any IR-level passes that might want to optimize or
analyze FMAs. By moving this transformation earlier into the IR phase,
we enable more opportunities for FMA folding, including across basic
blocks.

Additionally, this new pass relies on the instruction-level `contract`
fast-math flag to perform these transformations, rather than depending
on the -fp-contract=fast or -enable-unsafe-fp-math options passed to
llc.
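
To illustrate why contraction has to be opted into, here is a minimal C++ sketch (not code from the pass itself) comparing a separately rounded multiply-then-add with a fused multiply-add. The function names are hypothetical, and the sketch assumes IEEE-754 `double` with round-to-nearest.

```cpp
#include <cmath>

// Unfused: the product is rounded to double before the add. The volatile
// temporary keeps the compiler from contracting the two operations itself.
double mulThenAdd(double a, double b, double c) {
  volatile double product = a * b;
  return product + c;
}

// Fused: std::fma computes a * b + c with a single rounding at the end,
// which is the behavior an FMA instruction provides.
double fusedMulAdd(double a, double b, double c) {
  return std::fma(a, b, c);
}
```

With `a = b = 1 + 2^-27` and `c = -(1 + 2^-26)`, the unfused form returns 0 while the fused form returns `2^-54`, so the two forms are not interchangeable unless the instruction carries a flag that permits the difference.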
Fixed the argument types of the following intrinsics to match with the
ISA:
 - vpdpwssd_128, vpdpwssd_256, vpdpwssd_512,
 - vpdpwssds_128, vpdpwssds_256, vpdpwssds_512
 - vpdpwsud_128, vpdpwsud_256, vpdpwsud_512
 - vpdpwsuds_128, vpdpwsuds_256, vpdpwsuds_512
 - vpdpwusd_128, vpdpwusd_256, vpdpwusd_512
 - vpdpwusds_128, vpdpwusds_256, vpdpwusds_512
 - vpdpwuud_128, vpdpwuud_256, vpdpwuud_512
 - vpdpwuuds_128, vpdpwuuds_256, vpdpwuuds_512

Fixes #97271. Note that this is the last PR for the issue.
LLVM has pretty thorough support for `int128`, and it has started seeing
some use. Even though we already have support for the
`SPV_ALTERA_arbitrary_precision_integers` extension, the BE was oddly
capping integer width to 64 bits. This patch adds partial support for
lowering 128-bit integers to `OpTypeInt 128`. Some work remains to be
done around legalisation support and validating constant uses (e.g.
cases that get lowered to `OpSpecConstantOp`).
…and for OpenCL (#167652)

For the extended image instruction builtins
amdgcn_image_sample_*_/gather4_*, use 'x' in the builtin def so that
they take _Float16 for both HIP/C++ and OpenCL.
Added masked compress builtin in CIR.
Note: This is my first PR to LLVM. Looking forward to corrections.

---------

Co-authored-by: bhuvan1527 <balabhuvanvarma@gmail.com>
Fixes warning:
build.rst:107: WARNING: 'any' reference target not found: https://visualstudio.microsoft.com
RST tries to resolve text in single backticks as a reference.
Double backticks mark it as plain literal text.

Fixes these warnings:
map.rst:803: WARNING: 'any' reference target not found: target.prefer-dynamic-value
map.rst:814: WARNING: 'any' reference target not found: expr
…antiations on non-MSVC targets (#168170)

Previously, even when MSVC compatibility was not requested, inline move
constructors in dllexport-ed templates were not exported, which was
seemingly unintended.
On non-MSVC targets (MinGW, Cygwin, and PS), such move constructors
should be exported consistently with copy constructors and with the
behavior of modern MSVC.
…0875)

Define `LHS.subsetOf(RHS)` as a more descriptive name for `!LHS.test(RHS)`
and update the existing callers to use that name.
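
As a rough illustration of the equivalence, here is a sketch using `std::bitset` rather than LLVM's own bit-vector classes. It assumes, as the commit message implies, that `test` answers "is any bit set in LHS that is not set in RHS"; the helper names are hypothetical.

```cpp
#include <bitset>
#include <cstddef>

// Stand-in for the meaning attributed to `LHS.test(RHS)` above:
// true if some bit is set in lhs but not in rhs.
template <std::size_t N>
bool anyBitOutside(const std::bitset<N>& lhs, const std::bitset<N>& rhs) {
  return (lhs & ~rhs).any();
}

// lhs is a subset of rhs exactly when no bit of lhs falls outside rhs.
template <std::size_t N>
bool subsetOf(const std::bitset<N>& lhs, const std::bitset<N>& rhs) {
  return !anyBitOutside(lhs, rhs);
}
```

Reading `subsetOf(small, big)` at a call site states the intent directly, whereas the negated form requires remembering which operand is being subtracted from which.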

Co-authored-by: Jakub Kuderski <jakub@nod-labs.com>
Since PDB doesn't have template information, we need to get the element
type from somewhere else. I'm using the type of `_Myval` in a list node,
which holds the element type.
…70993)

Add the ability to defer parsing by having a parser re-enqueue itself.
This allows CallSiteLoc parsing to recurse less deeply: previously the
recursion could overflow, especially on large inputs in debug mode. Add
a default depth cutoff, which could become a parameter later if needed.
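
The general technique can be sketched as follows. This is not the MLIR implementation: `ChainResolver`, `kNoDependency`, and the "chain length" value are hypothetical stand-ins, and the sketch assumes the dependency chain has no cycles. Recursion proceeds normally until the depth cutoff is hit, at which point the unresolved dependency is pushed onto a worklist and the current entry is retried later from the top-level loop.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t kNoDependency = static_cast<std::size_t>(-1);

// Each entry depends on at most one other entry. The resolved "value" of an
// entry is the length of its dependency chain; 0 means "not resolved yet".
class ChainResolver {
public:
  ChainResolver(std::vector<std::size_t> dependsOn, unsigned maxDepth)
      : dependsOn_(std::move(dependsOn)), length_(dependsOn_.size(), 0),
        maxDepth_(maxDepth) {}

  std::size_t resolve(std::size_t root) {
    std::vector<std::size_t> worklist(1, root);
    while (!worklist.empty()) {
      std::size_t current = worklist.back();
      // On success nothing was pushed, so `current` is still on top.
      // On failure a deferred dependency is now on top and runs first.
      if (tryResolve(current, 0, worklist))
        worklist.pop_back();
    }
    return length_[root];
  }

private:
  bool tryResolve(std::size_t entry, unsigned depth,
                  std::vector<std::size_t>& worklist) {
    if (length_[entry] != 0)
      return true;
    std::size_t dep = dependsOn_[entry];
    if (dep == kNoDependency) {
      length_[entry] = 1;
      return true;
    }
    if (length_[dep] == 0) {
      if (depth >= maxDepth_) {
        worklist.push_back(dep); // Defer instead of recursing deeper.
        return false;
      }
      if (!tryResolve(dep, depth + 1, worklist))
        return false;
    }
    length_[entry] = length_[dep] + 1;
    return true;
  }

  std::vector<std::size_t> dependsOn_;
  std::vector<std::size_t> length_;
  unsigned maxDepth_;
};
```

Because resolved entries are cached, each retry only re-walks at most `maxDepth` frames, so the call stack stays bounded regardless of how long the chain is.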
Similar to the other PRs, this runs the `std::optional` test with PDB.
Since we don't know that variables use typedefs, we check for the full
name when testing PDB.
This LLVM IR

https://godbolt.org/z/5bM1vrMY1

```llvm
define <4 x i32> @Masked(<2 x double> %a, <4 x i32> %src, i8 noundef zeroext %mask) unnamed_addr #0 {
  %r = tail call <4 x i32> @llvm.x86.avx10.mask.vcvttpd2udqs.128(<2 x double> %a, <4 x i32> %src, i8 noundef %mask)
  ret <4 x i32> %r
}

define <4 x i32> @unmasked(<2 x double> %a) unnamed_addr #0 {
  %r = tail call <4 x i32> @llvm.x86.avx10.mask.vcvttpd2udqs.128(<2 x double> %a, <4 x i32> zeroinitializer, i8 noundef -1)
  ret <4 x i32> %r
}

declare <4 x i32> @llvm.x86.avx10.mask.vcvttpd2udqs.128(<2 x double>, <4 x i32>, i8) unnamed_addr

attributes #0 = { mustprogress nofree norecurse nosync nounwind nonlazybind willreturn memory(none) uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" "target-features"="+avx10.2-512" }
```

produces 

```asm
masked:                                 # @Masked
        kmovd   k1, edi
        vcvttpd2dqs     xmm1 {k1}, xmm0
        vmovaps xmm0, xmm1
        ret
unmasked:                               # @unmasked
        vcvttpd2udqs    xmm0, xmm0
        ret
```

So, when a mask is used, somehow the signed version of this instruction
is selected. I suspect this is a typo.
If the subtarget supports flat scratch SVS mode and there is no SGPR
available to replace a frame index, convert a scratch instruction in SS
form into SV form and replace the frame index with a scavenged VGPR.
Resolves #155902

Co-authored-by: Matt Arsenault <matthew.arsenault@amd.com>
Removes the legacy HTML backend and replaces it with the Mustache
backend.
… for spawning symbolizer (#170809)

Due to a legacy incompatibility with `atos`, we were allocating a pty
whenever we spawned the symbolizer. This is no longer necessary and we
can use a regular ol' pipe.

This PR is split into two commits:
- The first removes the pty allocation and replaces it with a pipe. This
relocates the `CreateTwoHighNumberedPipes` call to be common to the
`posix_spawn` and `StartSubprocess` path.
- The second commit adds the `child_stdin_fd_` field to
`SymbolizerProcess`, storing the read end of the stdin pipe. By holding
on to this fd for the lifetime of the symbolizer, we are able to avoid
getting SIGPIPE (which would occur when we write to a pipe whose
read-end had been closed due to the death of the symbolizer). This will
be very close to solving #120915, but this PR is intentionally not
touching the non-posix_spawn path.
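
The effect of holding on to the read end can be reproduced in isolation. The following is a standalone POSIX sketch, not sanitizer code; `writeByteToPipe` is a hypothetical helper, and it ignores SIGPIPE only so that the failing case reports `EPIPE` instead of terminating the process.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Writes one byte into a fresh pipe and returns 0 on success or the errno
// reported by write(). If keepReadEndOpen is false, the read end is closed
// first, which models the reader (the symbolizer) having died.
int writeByteToPipe(bool keepReadEndOpen) {
  int fds[2];
  if (pipe(fds) != 0)
    return errno;

  // Temporarily ignore SIGPIPE so a write to a reader-less pipe fails with
  // EPIPE rather than killing this process.
  struct sigaction ignore = {};
  struct sigaction previous = {};
  ignore.sa_handler = SIG_IGN;
  sigemptyset(&ignore.sa_mask);
  sigaction(SIGPIPE, &ignore, &previous);

  if (!keepReadEndOpen)
    close(fds[0]);

  char byte = 'x';
  int result = (write(fds[1], &byte, 1) == 1) ? 0 : errno;

  sigaction(SIGPIPE, &previous, nullptr);
  if (keepReadEndOpen)
    close(fds[0]);
  close(fds[1]);
  return result;
}
```

With the read end still open in the writing process, the write succeeds even though nobody is reading; with it closed, the same write fails with `EPIPE` (or raises SIGPIPE when the signal is not ignored).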

rdar://165894284
Test case for the mis-compile mentioned in
#166247 (comment)

The issue is that we don't generate a runtime check even though it is
required to vectorize.
Expose the HVXV81 abs, conversion, comparison, log2, negate and mixed
subtract intrinsics so Clang can emit the new instructions.
… +1 out argument as a leak (#161633)

Make RetainPtrCtorAdoptChecker recognize an assignment to a +1 out
argument so that it won't emit a memory leak warning.
Treat a weak Objective-C property, ivar, member variable, and local
variable as safe.
…pattern (#161019)

Generalize the check for recognizing [[Obj alloc] init] to also
recognize [allocObj() init]. We do this by utilizing the isAllocInit
function in RetainPtrCtorAdoptChecker.
When GeneratedRTChecks::create bails out due to exceeding the cost
threshold, no runtime checks are generated and we must not proceed
assuming checks have been generated.

Mark the checks as never succeeding, to make sure we don't try to
vectorize assuming the runtime checks hold. This fixes a case where we
previously incorrectly vectorized assuming runtime checks had been
generated when forcing vectorization via metadata.

Fixes the mis-compile mentioned in
#166247 (comment)
wermos and others added 30 commits December 11, 2025 07:40
…ames. NFC. (#171645)

Both `decomposeBitTestICmp` and `decomposeBitTest` have a parameter
called `lookThroughTrunc`. This was spelled in full (i.e. `lookThroughTrunc`)
in the header. However, in the implementation, it's written as `lookThruTrunc`.

I opted to convert all instances of `lookThruTrunc` into
`lookThroughTrunc` to reduce surprise while reading the code and for
conformity.

---

The other change in this PR is the renaming of the wrapper around
`decomposeBitTest()`. Even though it was a wrapper around
`CmpInstAnalysis.h`'s `decomposeBitTest`, the function was called
`decomposeBitTestICmp`. This is quite confusing because such a function
_also_ exists in `CmpInstAnalysis.h`, but it is _not_ the one actually
being used in `InstCombineAndOrXor.cpp`.
Add `f64:32:64` to the data layout for AIX, to indicate that doubles
have a 32-bit ABI alignment and 64-bit preferred alignment.
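
For readers unfamiliar with the layout string, the component has the shape `f<size>:<abi>[:<pref>]`, with all values in bits. The following is a small illustrative parser, not LLVM's `DataLayout` code; the struct and function names are hypothetical.

```cpp
#include <cstdio>

struct FloatAlignment {
  bool valid;
  unsigned sizeBits;      // Width of the floating-point type.
  unsigned abiAlignBits;  // Minimum alignment required by the ABI.
  unsigned prefAlignBits; // Alignment preferred when there is a choice.
};

// Parses a single component such as "f64:32:64". When the preferred
// alignment is omitted, it is taken to be the same as the ABI alignment.
FloatAlignment parseFloatAlignment(const char* spec) {
  FloatAlignment result = {false, 0, 0, 0};
  int fields = std::sscanf(spec, "f%u:%u:%u", &result.sizeBits,
                           &result.abiAlignBits, &result.prefAlignBits);
  if (fields < 2)
    return result;
  if (fields == 2)
    result.prefAlignBits = result.abiAlignBits;
  result.valid = true;
  return result;
}
```

For `"f64:32:64"` this yields a 64-bit type with a 4-byte ABI alignment and an 8-byte preferred alignment, matching the description above.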

Clang was already taking this into account, but it was not reflected in
LLVM's data layout.

A notable effect of this change is that `double` loads/stores with 4
byte alignment are no longer considered "unaligned" and avoid the
corresponding unaligned access legalization. I assume that this is
correct/desired for AIX. (The codegen previously already relied on this
in some places related to the call ABI simply by dint of assuming
certain stack locations were 8 byte aligned, even though they were only
actually 4 byte aligned.)

Fixes #133599.
This patch tries to move all vl patterns and sd node patterns to
RISCVInstrInfoVVLPatterns.td and RISCVInstrInfoVSDPatterns.td
respectively. It removes the redefinition of pattern classes for zvfbfa
and makes them easier to maintain and change.

Note: this does not include intrinsic patterns. If we want to also unify
intrinsic patterns, we need to also move the pseudo instruction
definitions of zvfbfa to RISCVInstrInfoVPseudos.td.
…teExtInst` instead of `SPIRVRegularizer` (#170155)

This patch consists of two parts:
* A first part that removes the scalar to vector promotion for built-ins
in the `SPIRVRegularizer`;
* and a second part that implements the promotion for built-ins from
scalar to vector in `generateExtInst`.

The implementation in `SPIRVRegularizer` had several issues:
* It rolled its own built-in pattern matching that was extremely
permissive
  * the compiler would crash if the built-in had a definition
  * the compiler would crash if the built-in had no arguments
* The compiler would crash if there were more than 2 function
definitions in the module.
* It'd be better if this was implemented as a module pass, where we
iterate over the users of the function instead of scanning the whole
module for callers.

This patch does the scalar to vector promotion just before the
`OpExtInst` is generated, without relying on the IR transformation.

One change in the generated code from the previous implementation is
that this version uses a single `OpCompositeConstruct` operation to
convert the scalar into a vector. The old implementation inserted an
element at the 0 position in an `undef` vector (using
`OpCompositeInsert`); then copied that element for every vector element
using `OpVectorShuffle`.
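
The difference between the two shapes can be pictured with plain C++ arrays. This is only an analogy for the SPIR-V operations named above, with hypothetical function names; the temporary is value-initialized here because C++ has no equivalent of `undef`.

```cpp
#include <array>
#include <cstddef>

// Old shape: insert the scalar at lane 0 of a placeholder vector, then
// broadcast lane 0 to every lane (a shuffle whose mask is all zeros).
template <typename T, std::size_t N>
std::array<T, N> splatViaInsertAndShuffle(T scalar) {
  std::array<T, N> placeholder{};
  placeholder[0] = scalar; // OpCompositeInsert analogue
  std::array<T, N> out{};
  for (std::size_t lane = 0; lane < N; ++lane)
    out[lane] = placeholder[0]; // OpVectorShuffle analogue
  return out;
}

// New shape: build the vector in one step from N copies of the scalar.
template <typename T, std::size_t N>
std::array<T, N> splatViaConstruct(T scalar) {
  std::array<T, N> out;
  out.fill(scalar); // OpCompositeConstruct analogue
  return out;
}
```

Both produce the same vector; the second form simply needs one operation instead of two.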

This patch also adds a test (`OpExtInst_vector_promotion_bug.ll`) that
highlights an issue in the builtin pattern matching that we're using:
our pattern matching doesn't consider the number of arguments, only the
demangled name, first and last arguments (`min(int,int,int)` matches the same builtin as `min(int, int)`).
Before this patch, `insertelement/extractelement` with dynamic indices
would fail to select with `-O0` for vector 32-bit element types with
sizes 3, 5, 6 and 7, which did not map to a `SI_INDIRECT_SRC/DST`
pattern.

Other "weird" sizes bigger than 8 (like 13) are properly handled
already.

To solve this issue we add the missing patterns for the problematic
sizes.

Solves SWDEV-568862
…#171651)

Allocators should be extremely cheap, if not free, to copy. Furthermore,
we have requirements on allocator types that copies must compare equal,
and that move and copy must be the same.

Hence, taking an allocator by reference should not provide benefits
beyond making a copy of it. However, taking the allocator by reference
leads to complexity in __split_buffer, which can be removed if we stop
using that pattern.
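
A minimal sketch of the by-value pattern follows. `ScratchBuffer` is a hypothetical class written for illustration and is not libc++'s `__split_buffer`.

```cpp
#include <cstddef>
#include <memory>

template <class T, class Alloc = std::allocator<T>>
class ScratchBuffer {
  using Traits = std::allocator_traits<Alloc>;

public:
  // The allocator is taken and stored by value; a copy must compare equal
  // to the original, so memory it allocates can be freed through either.
  explicit ScratchBuffer(std::size_t capacity, Alloc alloc = Alloc())
      : alloc_(alloc), capacity_(capacity),
        data_(Traits::allocate(alloc_, capacity)) {}

  ~ScratchBuffer() { Traits::deallocate(alloc_, data_, capacity_); }

  ScratchBuffer(const ScratchBuffer&) = delete;
  ScratchBuffer& operator=(const ScratchBuffer&) = delete;

  std::size_t capacity() const { return capacity_; }
  Alloc get_allocator() const { return alloc_; }

private:
  Alloc alloc_;
  std::size_t capacity_;
  typename Traits::pointer data_;
};
```

Storing the allocator as a plain member means the class does not need a reference-typed template argument or any special handling for the referenced allocator's lifetime.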
Some [ideas for improvement](#169858 (review)) came up during review of
recent changes to `isTRNMask`. This PR applies them also to `isZIPMask`,
which is implemented almost identically.
This essentially reverts #100685 and fixes the bidirectional and random
access specializations to be actually used.

```
Benchmark                                                                old             new    Difference    % Difference
------------------------------------------------------------  --------------  --------------  ------------  --------------
rng::find_end(deque<int>)_(match_near_end)/1000                       366.91           47.63       -319.28         -87.02%
rng::find_end(deque<int>)_(match_near_end)/1024                      3273.31           35.42      -3237.89         -98.92%
rng::find_end(deque<int>)_(match_near_end)/8192                    171608.41          285.04    -171323.38         -99.83%
rng::find_end(deque<int>)_(near_matches)/1000                       31808.40        19214.35     -12594.05         -39.59%
rng::find_end(deque<int>)_(near_matches)/1024                       37428.72        20773.87     -16654.85         -44.50%
rng::find_end(deque<int>)_(near_matches)/8192                     1719468.34      1213967.45    -505500.89         -29.40%
rng::find_end(deque<int>)_(process_all)/1000                          275.81          336.29         60.49          21.93%
rng::find_end(deque<int>)_(process_all)/1024                          258.88          320.36         61.47          23.74%
rng::find_end(deque<int>)_(process_all)/1048576                    277117.41       327640.37      50522.96          18.23%
rng::find_end(deque<int>)_(process_all)/8192                         2166.36         2533.52        367.16          16.95%
rng::find_end(deque<int>)_(same_length)/1000                         1280.06          362.53       -917.53         -71.68%
rng::find_end(deque<int>)_(same_length)/1024                         1419.99          417.58      -1002.40         -70.59%
rng::find_end(deque<int>)_(same_length)/8192                        11363.81         2870.63      -8493.18         -74.74%
rng::find_end(deque<int>)_(single_element)/1000                       277.22          363.52         86.31          31.13%
rng::find_end(deque<int>)_(single_element)/1024                       257.11          353.94         96.84          37.66%
rng::find_end(deque<int>)_(single_element)/8192                      2059.02         2762.29        703.27          34.16%
rng::find_end(deque<int>,_pred)_(match_near_end)/1000                 696.84           70.07       -626.77         -89.94%
rng::find_end(deque<int>,_pred)_(match_near_end)/1024                4774.82           70.75      -4704.07         -98.52%
rng::find_end(deque<int>,_pred)_(match_near_end)/8192              267492.37          549.57    -266942.81         -99.79%
rng::find_end(deque<int>,_pred)_(near_matches)/1000                 39414.88        31070.43      -8344.46         -21.17%
rng::find_end(deque<int>,_pred)_(near_matches)/1024                 38168.52        32362.18      -5806.34         -15.21%
rng::find_end(deque<int>,_pred)_(near_matches)/8192               2594717.16      1938056.79    -656660.38         -25.31%
rng::find_end(deque<int>,_pred)_(process_all)/1000                    600.88          586.92        -13.96          -2.32%
rng::find_end(deque<int>,_pred)_(process_all)/1024                    613.00          592.66        -20.33          -3.32%
rng::find_end(deque<int>,_pred)_(process_all)/1048576              600059.65       603440.98       3381.33           0.56%
rng::find_end(deque<int>,_pred)_(process_all)/8192                   4850.32         4764.56        -85.76          -1.77%
rng::find_end(deque<int>,_pred)_(same_length)/1000                   1514.90          700.34       -814.57         -53.77%
rng::find_end(deque<int>,_pred)_(same_length)/1024                   1561.14          705.80       -855.34         -54.79%
rng::find_end(deque<int>,_pred)_(same_length)/8192                  12544.84         5024.45      -7520.39         -59.95%
rng::find_end(deque<int>,_pred)_(single_element)/1000                 603.79          650.63         46.84           7.76%
rng::find_end(deque<int>,_pred)_(single_element)/1024                 614.93          656.43         41.50           6.75%
rng::find_end(deque<int>,_pred)_(single_element)/8192                4885.89         5225.71        339.82           6.96%
rng::find_end(forward_list<int>)_(match_near_end)/1000                770.05          769.32         -0.73          -0.09%
rng::find_end(forward_list<int>)_(match_near_end)/1024               4833.13         4733.24        -99.90          -2.07%
rng::find_end(forward_list<int>)_(match_near_end)/8192             259324.32       261066.84       1742.52           0.67%
rng::find_end(forward_list<int>)_(near_matches)/1000                38301.11        38608.61        307.50           0.80%
rng::find_end(forward_list<int>)_(near_matches)/1024                39370.54        39878.59        508.05           1.29%
rng::find_end(forward_list<int>)_(near_matches)/8192              2527338.50      2527722.47        383.97           0.02%
rng::find_end(forward_list<int>)_(process_all)/1000                   713.63          720.74          7.11           1.00%
rng::find_end(forward_list<int>)_(process_all)/1024                   727.81          731.60          3.79           0.52%
rng::find_end(forward_list<int>)_(process_all)/1048576             757728.47       766470.14       8741.67           1.15%
rng::find_end(forward_list<int>)_(process_all)/8192                  5821.05         5817.80         -3.25          -0.06%
rng::find_end(forward_list<int>)_(same_length)/1000                  1458.99         1454.50         -4.49          -0.31%
rng::find_end(forward_list<int>)_(same_length)/1024                  1507.73         1515.78          8.05           0.53%
rng::find_end(forward_list<int>)_(same_length)/8192                 20432.32        18658.93      -1773.39          -8.68%
rng::find_end(forward_list<int>)_(single_element)/1000                712.41          708.41         -4.00          -0.56%
rng::find_end(forward_list<int>)_(single_element)/1024                728.05          728.78          0.73           0.10%
rng::find_end(forward_list<int>)_(single_element)/8192               5795.48         6332.88        537.40           9.27%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/1000          843.67          846.77          3.10           0.37%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/1024         5267.90         5343.84         75.94           1.44%
rng::find_end(forward_list<int>,_pred)_(match_near_end)/8192       280912.75       286141.10       5228.35           1.86%
rng::find_end(forward_list<int>,_pred)_(near_matches)/1000          43386.35        44489.38       1103.03           2.54%
rng::find_end(forward_list<int>,_pred)_(near_matches)/1024          44929.84        45608.55        678.71           1.51%
rng::find_end(forward_list<int>,_pred)_(near_matches)/8192        2723281.29      2765369.43      42088.14           1.55%
rng::find_end(forward_list<int>,_pred)_(process_all)/1000             763.13          763.85          0.72           0.09%
rng::find_end(forward_list<int>,_pred)_(process_all)/1024             796.98          773.40        -23.58          -2.96%
rng::find_end(forward_list<int>,_pred)_(process_all)/1048576       858071.76       846166.06     -11905.69          -1.39%
rng::find_end(forward_list<int>,_pred)_(process_all)/8192            6282.19         6244.95        -37.24          -0.59%
rng::find_end(forward_list<int>,_pred)_(same_length)/1000            1560.18         1583.03         22.86           1.47%
rng::find_end(forward_list<int>,_pred)_(same_length)/1024            1603.94         1612.22          8.28           0.52%
rng::find_end(forward_list<int>,_pred)_(same_length)/8192           16907.98        15638.35      -1269.63          -7.51%
rng::find_end(forward_list<int>,_pred)_(single_element)/1000          746.72          754.08          7.36           0.99%
rng::find_end(forward_list<int>,_pred)_(single_element)/1024          761.27          771.75         10.48           1.38%
rng::find_end(forward_list<int>,_pred)_(single_element)/8192         6166.83         6687.87        521.04           8.45%
rng::find_end(list<int>)_(match_near_end)/1000                        793.99           67.06       -726.93         -91.55%
rng::find_end(list<int>)_(match_near_end)/1024                       4682.12           79.82      -4602.31         -98.30%
rng::find_end(list<int>)_(match_near_end)/8192                     263187.10          582.64    -262604.46         -99.78%
rng::find_end(list<int>)_(near_matches)/1000                        38066.70        34687.59      -3379.11          -8.88%
rng::find_end(list<int>)_(near_matches)/1024                        39721.77        36150.04      -3571.73          -8.99%
rng::find_end(list<int>)_(near_matches)/8192                      2543369.85      2247297.03    -296072.82         -11.64%
rng::find_end(list<int>)_(process_all)/1000                           716.89          726.65          9.76           1.36%
rng::find_end(list<int>)_(process_all)/1024                           742.41          744.05          1.64           0.22%
rng::find_end(list<int>)_(process_all)/1048576                     822449.08       873801.46      51352.38           6.24%
rng::find_end(list<int>)_(process_all)/8192                          7704.49         9766.50       2062.02          26.76%
rng::find_end(list<int>)_(same_length)/1000                          1508.19          710.90       -797.28         -52.86%
rng::find_end(list<int>)_(same_length)/1024                          1540.23          735.35       -804.88         -52.26%
rng::find_end(list<int>)_(same_length)/8192                         22786.44        10752.45     -12033.98         -52.81%
rng::find_end(list<int>)_(single_element)/1000                        699.16          734.76         35.60           5.09%
rng::find_end(list<int>)_(single_element)/1024                        717.09          750.91         33.82           4.72%
rng::find_end(list<int>)_(single_element)/8192                       9502.45        10289.21        786.76           8.28%
rng::find_end(list<int>,_pred)_(match_near_end)/1000                  841.98           83.86       -758.12         -90.04%
rng::find_end(list<int>,_pred)_(match_near_end)/1024                 5463.71           76.95      -5386.76         -98.59%
rng::find_end(list<int>,_pred)_(match_near_end)/8192               287070.76          647.14    -286423.62         -99.77%
rng::find_end(list<int>,_pred)_(near_matches)/1000                  43878.61        38899.00      -4979.61         -11.35%
rng::find_end(list<int>,_pred)_(near_matches)/1024                  45672.50        40520.68      -5151.82         -11.28%
rng::find_end(list<int>,_pred)_(near_matches)/8192                2764800.76      2495879.89    -268920.87          -9.73%
rng::find_end(list<int>,_pred)_(process_all)/1000                     764.46          774.78         10.32           1.35%
rng::find_end(list<int>,_pred)_(process_all)/1024                     786.81          793.05          6.24           0.79%
rng::find_end(list<int>,_pred)_(process_all)/1048576               934166.34       954637.60      20471.26           2.19%
rng::find_end(list<int>,_pred)_(process_all)/8192                    9509.24        10209.73        700.49           7.37%
rng::find_end(list<int>,_pred)_(same_length)/1000                    1545.67          782.96       -762.71         -49.34%
rng::find_end(list<int>,_pred)_(same_length)/1024                    1580.94          796.87       -784.08         -49.60%
rng::find_end(list<int>,_pred)_(same_length)/8192                   21558.41        13370.92      -8187.49         -37.98%
rng::find_end(list<int>,_pred)_(single_element)/1000                  766.49          762.81         -3.68          -0.48%
rng::find_end(list<int>,_pred)_(single_element)/1024                  784.75          781.47         -3.28          -0.42%
rng::find_end(list<int>,_pred)_(single_element)/8192                 9722.26        10399.11        676.85           6.96%
rng::find_end(vector<int>)_(match_near_end)/1000                      267.82           25.34       -242.48         -90.54%
rng::find_end(vector<int>)_(match_near_end)/1024                     2259.46           25.78      -2233.68         -98.86%
rng::find_end(vector<int>)_(match_near_end)/8192                   119747.92          214.53    -119533.39         -99.82%
rng::find_end(vector<int>)_(near_matches)/1000                      16913.73        14102.20      -2811.53         -16.62%
rng::find_end(vector<int>)_(near_matches)/1024                      16097.97        14767.26      -1330.71          -8.27%
rng::find_end(vector<int>)_(near_matches)/8192                    1102803.07       823463.30    -279339.78         -25.33%
rng::find_end(vector<int>)_(process_all)/1000                         233.43          380.28        146.85          62.91%
rng::find_end(vector<int>)_(process_all)/1024                         238.86          389.32        150.46          62.99%
rng::find_end(vector<int>)_(process_all)/1048576                   269619.36       391698.75     122079.39          45.28%
rng::find_end(vector<int>)_(process_all)/8192                        2011.46         3061.40       1049.94          52.20%
rng::find_end(vector<int>)_(same_length)/1000                         632.19          253.50       -378.69         -59.90%
rng::find_end(vector<int>)_(same_length)/1024                         556.53          254.87       -301.66         -54.20%
rng::find_end(vector<int>)_(same_length)/8192                        4597.26         2095.57      -2501.68         -54.42%
rng::find_end(vector<int>)_(single_element)/1000                      231.57          417.64        186.06          80.35%
rng::find_end(vector<int>)_(single_element)/1024                      236.41          427.03        190.62          80.63%
rng::find_end(vector<int>)_(single_element)/8192                     1918.95         3367.29       1448.33          75.48%
rng::find_end(vector<int>,_pred)_(match_near_end)/1000                581.49           52.67       -528.82         -90.94%
rng::find_end(vector<int>,_pred)_(match_near_end)/1024               3545.40           53.74      -3491.65         -98.48%
rng::find_end(vector<int>,_pred)_(match_near_end)/8192             190482.78          432.30    -190050.48         -99.77%
rng::find_end(vector<int>,_pred)_(near_matches)/1000                28878.24        24723.01      -4155.23         -14.39%
rng::find_end(vector<int>,_pred)_(near_matches)/1024                30035.85        25597.45      -4438.40         -14.78%
rng::find_end(vector<int>,_pred)_(near_matches)/8192              1858596.45      1584796.11    -273800.34         -14.73%
rng::find_end(vector<int>,_pred)_(process_all)/1000                   518.92          813.46        294.53          56.76%
rng::find_end(vector<int>,_pred)_(process_all)/1024                   531.17          710.20        179.03          33.70%
rng::find_end(vector<int>,_pred)_(process_all)/1048576             674064.13       905070.15     231006.01          34.27%
rng::find_end(vector<int>,_pred)_(process_all)/8192                  4254.34         6372.76       2118.43          49.79%
rng::find_end(vector<int>,_pred)_(same_length)/1000                  1106.96          526.23       -580.73         -52.46%
rng::find_end(vector<int>,_pred)_(same_length)/1024                  1133.60          539.70       -593.90         -52.39%
rng::find_end(vector<int>,_pred)_(same_length)/8192                  8988.10         4302.83      -4685.27         -52.13%
rng::find_end(vector<int>,_pred)_(single_element)/1000                528.11          523.69         -4.42          -0.84%
rng::find_end(vector<int>,_pred)_(single_element)/1024                539.58          838.49        298.91          55.40%
rng::find_end(vector<int>,_pred)_(single_element)/8192               4301.43         7313.22       3011.79          70.02%
std::find_end(deque<int>)_(match_near_end)/1000                       347.82           38.56       -309.26         -88.91%
std::find_end(deque<int>)_(match_near_end)/1024                      3340.80           34.54      -3306.27         -98.97%
std::find_end(deque<int>)_(match_near_end)/8192                    171599.83          281.87    -171317.96         -99.84%
std::find_end(deque<int>)_(near_matches)/1000                       29703.68        19712.27      -9991.41         -33.64%
std::find_end(deque<int>)_(near_matches)/1024                       32312.41        20008.21     -12304.20         -38.08%
std::find_end(deque<int>)_(near_matches)/8192                     1851286.99      1216112.34    -635174.65         -34.31%
std::find_end(deque<int>)_(process_all)/1000                          256.69          315.96         59.27          23.09%
std::find_end(deque<int>)_(process_all)/1024                          260.97          305.42         44.45          17.03%
std::find_end(deque<int>)_(process_all)/1048576                    273310.08       309499.13      36189.05          13.24%
std::find_end(deque<int>)_(process_all)/8192                         2071.33         2606.57        535.25          25.84%
std::find_end(deque<int>)_(same_length)/1000                         1422.58          441.07       -981.51         -68.99%
std::find_end(deque<int>)_(same_length)/1024                         1844.27          350.75      -1493.52         -80.98%
std::find_end(deque<int>)_(same_length)/8192                        14681.69         2839.26     -11842.43         -80.66%
std::find_end(deque<int>)_(single_element)/1000                       291.63          344.82         53.19          18.24%
std::find_end(deque<int>)_(single_element)/1024                       257.97          330.19         72.21          27.99%
std::find_end(deque<int>)_(single_element)/8192                      2220.10         2505.02        284.92          12.83%
std::find_end(deque<int>,_pred)_(match_near_end)/1000                 694.70           69.60       -625.11         -89.98%
std::find_end(deque<int>,_pred)_(match_near_end)/1024                4735.45           71.12      -4664.33         -98.50%
std::find_end(deque<int>,_pred)_(match_near_end)/8192              267417.02          561.03    -266855.99         -99.79%
std::find_end(deque<int>,_pred)_(near_matches)/1000                 42199.71        31597.49     -10602.22         -25.12%
std::find_end(deque<int>,_pred)_(near_matches)/1024                 38007.49        32362.16      -5645.33         -14.85%
std::find_end(deque<int>,_pred)_(near_matches)/8192               2607708.49      1935799.88    -671908.60         -25.77%
std::find_end(deque<int>,_pred)_(process_all)/1000                    599.65          552.71        -46.94          -7.83%
std::find_end(deque<int>,_pred)_(process_all)/1024                    615.88          554.17        -61.71         -10.02%
std::find_end(deque<int>,_pred)_(process_all)/1048576              598471.63       599441.79        970.16           0.16%
std::find_end(deque<int>,_pred)_(process_all)/8192                   4853.45         4394.20       -459.25          -9.46%
std::find_end(deque<int>,_pred)_(same_length)/1000                   1511.68          797.64       -714.04         -47.23%
std::find_end(deque<int>,_pred)_(same_length)/1024                   1568.63          810.85       -757.78         -48.31%
std::find_end(deque<int>,_pred)_(same_length)/8192                  12609.34         5092.02      -7517.32         -59.62%
std::find_end(deque<int>,_pred)_(single_element)/1000                 601.22          628.80         27.58           4.59%
std::find_end(deque<int>,_pred)_(single_element)/1024                 613.25          627.15         13.89           2.27%
std::find_end(deque<int>,_pred)_(single_element)/8192                4823.85         4795.25        -28.60          -0.59%
std::find_end(forward_list<int>)_(match_near_end)/1000                762.64          769.74          7.10           0.93%
std::find_end(forward_list<int>)_(match_near_end)/1024               4767.93         4840.87         72.94           1.53%
std::find_end(forward_list<int>)_(match_near_end)/8192             260275.68       260835.21        559.53           0.21%
std::find_end(forward_list<int>)_(near_matches)/1000                38020.76        38197.53        176.77           0.46%
std::find_end(forward_list<int>)_(near_matches)/1024                39028.86        39333.38        304.51           0.78%
std::find_end(forward_list<int>)_(near_matches)/8192              2524921.48      2523470.32      -1451.16          -0.06%
std::find_end(forward_list<int>)_(process_all)/1000                   699.95          699.93         -0.02          -0.00%
std::find_end(forward_list<int>)_(process_all)/1024                   715.24          712.07         -3.17          -0.44%
std::find_end(forward_list<int>)_(process_all)/1048576             755926.33       756976.31       1049.98           0.14%
std::find_end(forward_list<int>)_(process_all)/8192                  5696.72         5672.92        -23.81          -0.42%
std::find_end(forward_list<int>)_(same_length)/1000                  1485.84         1480.19         -5.65          -0.38%
std::find_end(forward_list<int>)_(same_length)/1024                  1493.62         1516.95         23.33           1.56%
std::find_end(forward_list<int>)_(same_length)/8192                 16833.75        13551.42      -3282.33         -19.50%
std::find_end(forward_list<int>)_(single_element)/1000                688.87          675.02        -13.85          -2.01%
std::find_end(forward_list<int>)_(single_element)/1024                688.89          691.59          2.69           0.39%
std::find_end(forward_list<int>)_(single_element)/8192               5735.87         6748.85       1012.98          17.66%
std::find_end(forward_list<int>,_pred)_(match_near_end)/1000          836.01          853.28         17.27           2.07%
std::find_end(forward_list<int>,_pred)_(match_near_end)/1024         5259.92         5299.30         39.39           0.75%
std::find_end(forward_list<int>,_pred)_(match_near_end)/8192       279479.85       285593.49       6113.65           2.19%
std::find_end(forward_list<int>,_pred)_(near_matches)/1000          42577.60        44550.54       1972.94           4.63%
std::find_end(forward_list<int>,_pred)_(near_matches)/1024          44374.19        45697.95       1323.76           2.98%
std::find_end(forward_list<int>,_pred)_(near_matches)/8192        2711138.03      2742988.33      31850.30           1.17%
std::find_end(forward_list<int>,_pred)_(process_all)/1000             752.03          762.75         10.72           1.43%
std::find_end(forward_list<int>,_pred)_(process_all)/1024             767.04          781.48         14.44           1.88%
std::find_end(forward_list<int>,_pred)_(process_all)/1048576       843453.35       861838.82      18385.47           2.18%
std::find_end(forward_list<int>,_pred)_(process_all)/8192            6241.65         6308.05         66.40           1.06%
std::find_end(forward_list<int>,_pred)_(same_length)/1000            2384.18         1589.21       -794.97         -33.34%
std::find_end(forward_list<int>,_pred)_(same_length)/1024            2428.97         1617.17       -811.80         -33.42%
std::find_end(forward_list<int>,_pred)_(same_length)/8192           16961.22        14972.86      -1988.36         -11.72%
std::find_end(forward_list<int>,_pred)_(single_element)/1000          743.31          752.77          9.47           1.27%
std::find_end(forward_list<int>,_pred)_(single_element)/1024          763.62          768.70          5.08           0.67%
std::find_end(forward_list<int>,_pred)_(single_element)/8192         6189.73         6934.04        744.31          12.02%
std::find_end(list<int>)_(match_near_end)/1000                        773.76           76.41       -697.35         -90.12%
std::find_end(list<int>)_(match_near_end)/1024                       4715.36           69.09      -4646.27         -98.53%
std::find_end(list<int>)_(match_near_end)/8192                     264864.51          584.19    -264280.32         -99.78%
std::find_end(list<int>)_(near_matches)/1000                        37650.69        35233.45      -2417.24          -6.42%
std::find_end(list<int>)_(near_matches)/1024                        39239.25        36699.13      -2540.13          -6.47%
std::find_end(list<int>)_(near_matches)/8192                      2543446.71      2252625.27    -290821.44         -11.43%
std::find_end(list<int>)_(process_all)/1000                           718.00          724.59          6.59           0.92%
std::find_end(list<int>)_(process_all)/1024                           735.14          746.70         11.57           1.57%
std::find_end(list<int>)_(process_all)/1048576                     812620.48       869606.78      56986.30           7.01%
std::find_end(list<int>)_(process_all)/8192                          8217.98         8462.53        244.55           2.98%
std::find_end(list<int>)_(same_length)/1000                          1500.85          716.45       -784.39         -52.26%
std::find_end(list<int>)_(same_length)/1024                          1534.13          736.62       -797.51         -51.98%
std::find_end(list<int>)_(same_length)/8192                         20274.06        10621.82      -9652.24         -47.61%
std::find_end(list<int>)_(single_element)/1000                        717.05          725.64          8.60           1.20%
std::find_end(list<int>)_(single_element)/1024                        732.87          742.44          9.57           1.31%
std::find_end(list<int>)_(single_element)/8192                       9835.11        11896.39       2061.28          20.96%
std::find_end(list<int>,_pred)_(match_near_end)/1000                  845.46           75.09       -770.37         -91.12%
std::find_end(list<int>,_pred)_(match_near_end)/1024                 5301.60           77.14      -5224.46         -98.54%
std::find_end(list<int>,_pred)_(match_near_end)/8192               281976.13          648.87    -281327.25         -99.77%
std::find_end(list<int>,_pred)_(near_matches)/1000                  44076.98        39576.32      -4500.67         -10.21%
std::find_end(list<int>,_pred)_(near_matches)/1024                  45531.64        41020.11      -4511.54          -9.91%
std::find_end(list<int>,_pred)_(near_matches)/8192                2756383.66      2503085.29    -253298.37          -9.19%
std::find_end(list<int>,_pred)_(process_all)/1000                     766.06          764.48         -1.58          -0.21%
std::find_end(list<int>,_pred)_(process_all)/1024                     780.35          799.51         19.15           2.45%
std::find_end(list<int>,_pred)_(process_all)/1048576               894643.71       898947.94       4304.24           0.48%
std::find_end(list<int>,_pred)_(process_all)/8192                    8436.41         9977.74       1541.33          18.27%
std::find_end(list<int>,_pred)_(same_length)/1000                    1545.22          784.29       -760.92         -49.24%
std::find_end(list<int>,_pred)_(same_length)/1024                    1583.27          808.52       -774.74         -48.93%
std::find_end(list<int>,_pred)_(same_length)/8192                   21850.99        10896.50     -10954.48         -50.13%
std::find_end(list<int>,_pred)_(single_element)/1000                  752.03          755.00          2.97           0.39%
std::find_end(list<int>,_pred)_(single_element)/1024                  774.22          784.14          9.92           1.28%
std::find_end(list<int>,_pred)_(single_element)/8192                10219.43        10396.49        177.05           1.73%
std::find_end(vector<int>)_(match_near_end)/1000                      277.37           28.45       -248.91         -89.74%
std::find_end(vector<int>)_(match_near_end)/1024                     2247.56           25.80      -2221.76         -98.85%
std::find_end(vector<int>)_(match_near_end)/8192                   119785.10          212.44    -119572.66         -99.82%
std::find_end(vector<int>)_(near_matches)/1000                      16351.34        14073.13      -2278.21         -13.93%
std::find_end(vector<int>)_(near_matches)/1024                      16656.33        14654.36      -2001.97         -12.02%
std::find_end(vector<int>)_(near_matches)/8192                    1181392.88       828918.96    -352473.91         -29.84%
std::find_end(vector<int>)_(process_all)/1000                         231.14          235.80          4.66           2.01%
std::find_end(vector<int>)_(process_all)/1024                         235.87          232.06         -3.81          -1.61%
std::find_end(vector<int>)_(process_all)/1048576                   239922.25       238229.38      -1692.87          -0.71%
std::find_end(vector<int>)_(process_all)/8192                        1837.43         1802.25        -35.19          -1.91%
std::find_end(vector<int>)_(same_length)/1000                         632.59          252.80       -379.79         -60.04%
std::find_end(vector<int>)_(same_length)/1024                         524.51          257.58       -266.94         -50.89%
std::find_end(vector<int>)_(same_length)/8192                        5159.01         2090.12      -3068.89         -59.49%
std::find_end(vector<int>)_(single_element)/1000                      229.56          250.47         20.91           9.11%
std::find_end(vector<int>)_(single_element)/1024                      234.86          252.18         17.32           7.37%
std::find_end(vector<int>)_(single_element)/8192                     1825.74         1981.90        156.16           8.55%
std::find_end(vector<int>,_pred)_(match_near_end)/1000                574.17           52.98       -521.19         -90.77%
std::find_end(vector<int>,_pred)_(match_near_end)/1024               3525.35           54.03      -3471.32         -98.47%
std::find_end(vector<int>,_pred)_(match_near_end)/8192             190155.81          423.41    -189732.40         -99.78%
std::find_end(vector<int>,_pred)_(near_matches)/1000                28541.98        24598.37      -3943.61         -13.82%
std::find_end(vector<int>,_pred)_(near_matches)/1024                29696.55        25675.27      -4021.28         -13.54%
std::find_end(vector<int>,_pred)_(near_matches)/8192              1846970.41      1596191.84    -250778.57         -13.58%
std::find_end(vector<int>,_pred)_(process_all)/1000                   519.71          592.14         72.43          13.94%
std::find_end(vector<int>,_pred)_(process_all)/1024                   529.74          491.07        -38.67          -7.30%
std::find_end(vector<int>,_pred)_(process_all)/1048576             631923.41       643729.57      11806.16           1.87%
std::find_end(vector<int>,_pred)_(process_all)/8192                  4215.05         3909.30       -305.75          -7.25%
std::find_end(vector<int>,_pred)_(same_length)/1000                  1095.46          524.99       -570.47         -52.08%
std::find_end(vector<int>,_pred)_(same_length)/1024                  1117.95          537.65       -580.31         -51.91%
std::find_end(vector<int>,_pred)_(same_length)/8192                  8923.95         4307.13      -4616.83         -51.74%
std::find_end(vector<int>,_pred)_(single_element)/1000                516.52          656.32        139.80          27.07%
std::find_end(vector<int>,_pred)_(single_element)/1024                528.82          673.72        144.90          27.40%
std::find_end(vector<int>,_pred)_(single_element)/8192               4210.37         5529.52       1319.15          31.33%
Geomean                                                              6995.43         3440.97      -3554.46         -50.81%
```
…iginally legal f64 values that we can store directly. (#171602)

Based off feedback from #171478
…1637)

They were using the wrong scheduler resource. They're also missing from
the optimisation guides, but WriteLD should be closer at least.
…169914)

This is technically ABI breaking, since `is_trivial` and
`is_trivially_default_constructible` now return different results.
However, I don't think that's a significant issue, since `allocator` is
almost always used in classes which own memory, making them non-trivial
anyways.
…9413)

We've seen in quite a few cases while optimizing `__tree`'s copy
construction that `_DetachedTreeCache` is actually quite slow and not
necessarily an optimization at all. This patch removes the code, since
it's now only used by `operator=(initializer_list)`, which should be
quite cold code. We might look into actually optimizing it again in the
future, but I doubt an optimization will be small enough compared to the
likely speedup in real-world code this would give.
…165160)

This removes a bit of code duplication and might simplify future
segmented iterator optimizations.
Adding Annotation Inference in Lifetime Analysis.

This PR implicitly adds lifetime bound annotations to the AST. These are
then used when analyzing functions that are parsed later, to detect
use-after-return (UAR) and similar errors.
Example:

```cpp
std::string_view f1(std::string_view a) {
  return a;
}

std::string_view f2(std::string_view a) {
  return f1(a);
}

std::string_view ff(std::string_view a) {
  std::string stack = "something on stack";
  return f2(stack); // warning: address of stack memory is returned
}
```

Note:

1. We only add lifetime bound annotations to the functions being
analyzed currently.
2. Currently, both annotation suggestion and inference work
simultaneously. This can be modified based on requirements.
3. The current approach works given that functions are already present
in the correct order (callee-before-caller). For not so ideal cases, we
can create a CallGraph prior to calling the analysis. This can be done
in the next PR.
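
For comparison, the inferred annotations correspond roughly to writing Clang's `[[clang::lifetimebound]]` attribute by hand. The following is a minimal sketch, assuming the `f1`/`f2` functions from the example above. The `LIFETIMEBOUND` macro is a hypothetical helper, not part of this PR, added so the snippet also compiles on compilers that lack the attribute:

```cpp
#include <string>
#include <string_view>

// Hypothetical portability macro (not part of the PR): expands to the Clang
// attribute when it is available and to nothing otherwise.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::lifetimebound)
#    define LIFETIMEBOUND [[clang::lifetimebound]]
#  endif
#endif
#ifndef LIFETIMEBOUND
#  define LIFETIMEBOUND
#endif

// Explicit form of what the analysis is described as inferring for f1 and
// f2: the returned view refers to the same storage as the parameter.
std::string_view f1(std::string_view a LIFETIMEBOUND) { return a; }
std::string_view f2(std::string_view a LIFETIMEBOUND) { return f1(a); }
```

The inference described in this PR aims to make such hand-written annotations unnecessary for functions that were analyzed earlier.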
Depends upon #170900

Re-land #169544

Previously we were less specific for POINTER/TARGET: encoding that they
could alias with (almost) anything.

In the new system, the "target data" tree is now a sibling of the other
trees (e.g. "global data"). POINTER variables go at the root of the
"target data" tree, whereas TARGET variables get their own nodes under
that tree. For example,

```
integer, pointer :: ip
real, pointer :: rp
integer, target :: it
integer, target :: it2(:)
real, target :: rt
integer :: i
real :: r
```
- `ip` and `rp` may alias with any variable except `i` and `r`.
- `it`, `it2`, and `rt` may alias only with `ip` or `rp`.
- `i` and `r` cannot alias with any other variable.
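
The three rules above can be reproduced with a small tree model. This is a simplified sketch under stated assumptions, not flang's actual data structures: every name here (`Node`, `mayAlias`, `buildExample`) is hypothetical, and two accesses are assumed to alias when one node is an ancestor of, or the same as, the other.

```cpp
#include <vector>

// Simplified model (an assumption for illustration): each variable is
// attached to a node, and nodes form a tree through parent indices.
struct Node {
  int parent; // index of the parent node, or -1 for the root
};

// True if `anc` is `n` itself or lies on the path from `n` to the root.
bool isAncestorOrSelf(const std::vector<Node> &nodes, int anc, int n) {
  for (; n != -1; n = nodes[n].parent)
    if (n == anc)
      return true;
  return false;
}

// Two accesses may alias when their nodes are on the same root-to-leaf path.
bool mayAlias(const std::vector<Node> &nodes, int a, int b) {
  return isAncestorOrSelf(nodes, a, b) || isAncestorOrSelf(nodes, b, a);
}

// Node layout for the Fortran declarations above:
//   0: common root (no variable attached)
//   1: "target data" root, used by the POINTER variables ip and rp
//   2: it, 3: it2, 4: rt   (TARGET variables, children of node 1)
//   5: "global data" root  (sibling of node 1)
//   6: i, 7: r             (children of node 5)
std::vector<Node> buildExample() {
  return {{-1}, {0}, {1}, {1}, {1}, {0}, {5}, {5}};
}
```

In this model `ip` and `rp` share node 1, so they alias with each other and with every TARGET node below it, while `i` and `r` sit in a sibling subtree and alias with nothing else.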

Fortran 2023 15.5.2.14 gives restrictions on entities associated with
dummy arguments. These do not allow non-target globals to be modified
through dummy arguments and therefore I don't think we need to make all
globals alias with dummy arguments.

I haven't implemented it in this patch, but I wonder whether it is ever
possible for `ip` to alias with `rt`.

While I was updating the tests I fixed up some tests that still assumed
that local alloc tbaa wasn't the default.

Cray pointers/pointees are (optionally) modelled as aliasing with all
non-descriptor data. This is not enabled by default.

I found no functional regressions in the gfortran test suite.
…0323) (#171787)

```
Step 7 (test-check-all) failure: Test just built components: check-all completed (failure)
******************** TEST 'LLVM :: CodeGen/AMDGPU/insert_vector_dynelt.ll' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 2
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -mtriple=amdgcn -mcpu=fiji
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck -enable-var-scope -check-prefixes=GCN /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# RUN: at line 3
/home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji < /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll | /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/FileCheck --check-prefixes=GCN-O0 /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/llvm-project/llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
# executed command: /home/buildbot/worker/as-builder-4/ramdisk/expensive-checks/build/bin/llc -O0 -mtriple=amdgcn -mcpu=fiji
# .---command stderr------------
# |
# | # After Instruction Selection
# | # Machine code for function insert_dyn_i32_6: IsSSA, TracksLiveness
# | Function Live Ins: $sgpr16 in %8, $sgpr17 in %9, $sgpr18 in %10, $sgpr19 in %11, $sgpr20 in %12, $sgpr21 in %13, $vgpr0 in %14, $vgpr1 in %15
# |
# | bb.0 (%ir-block.0):
# |   successors: %bb.1(0x80000000); %bb.1(100.00%)
# |   liveins: $sgpr16, $sgpr17, $sgpr18, $sgpr19, $sgpr20, $sgpr21, $vgpr0, $vgpr1
# |   %15:vgpr_32 = COPY $vgpr1
# |   %14:vgpr_32 = COPY $vgpr0
# |   %13:sgpr_32 = COPY $sgpr21
# |   %12:sgpr_32 = COPY $sgpr20
# |   %11:sgpr_32 = COPY $sgpr19
# |   %10:sgpr_32 = COPY $sgpr18
# |   %9:sgpr_32 = COPY $sgpr17
# |   %8:sgpr_32 = COPY $sgpr16
# |   %17:sgpr_192 = REG_SEQUENCE %8:sgpr_32, %subreg.sub0, %9:sgpr_32, %subreg.sub1, %10:sgpr_32, %subreg.sub2, %11:sgpr_32, %subreg.sub3, %12:sgpr_32, %subreg.sub4, %13:sgpr_32, %subreg.sub5
# |   %16:sgpr_192 = COPY %17:sgpr_192
# |   %19:vreg_192 = COPY %17:sgpr_192
# |   %28:sreg_64_xexec = IMPLICIT_DEF
# |   %27:sreg_64_xexec = S_MOV_B64 $exec
# |
# | bb.1:
# | ; predecessors: %bb.1, %bb.0
# |   successors: %bb.1(0x40000000), %bb.3(0x40000000); %bb.1(50.00%), %bb.3(50.00%)
# |
# |   %26:vreg_192 = PHI %19:vreg_192, %bb.0, %18:vreg_192, %bb.1
# |   %29:sreg_64 = PHI %28:sreg_64_xexec, %bb.0, %30:sreg_64, %bb.1
# |   %31:sreg_32_xm0 = V_READFIRSTLANE_B32 %14:vgpr_32, implicit $exec
# |   %32:sreg_64 = V_CMP_EQ_U32_e64 %31:sreg_32_xm0, %14:vgpr_32, implicit $exec
# |   %30:sreg_64 = S_AND_SAVEEXEC_B64 killed %32:sreg_64, implicit-def $exec, implicit-def $scc, implicit $exec
# |   $m0 = COPY killed %31:sreg_32_xm0
# |   %18:vreg_192 = V_INDIRECT_REG_WRITE_MOVREL_B32_V8 %26:vreg_192(tied-def 0), %15:vgpr_32, 3, implicit $m0, implicit $exec
# |   $exec = S_XOR_B64_term $exec, %30:sreg_64, implicit-def $scc
# |   S_CBRANCH_EXECNZ %bb.1, implicit $exec
# |
# | bb.3:
```

This reverts commit 15df9e7.
Currently fmul is not reassociated unless it has nsz, although
this should be unnecessary.
…171158)

Add an additional bound for the induction variable of the scf.forall
such that:
%iv <= %lower_bound + (%trip_count - 1) * step

Same as #126426, but for the scf.forall loop.
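
As a worked illustration of the bound above, the following sketch computes the trip count and the largest induction value for a loop with a positive step. The function names are hypothetical, and the ceiling-division trip count is an assumption about how the range is iterated, not code from the patch:

```cpp
#include <cstdint>

// Number of iterations of `for (iv = lb; iv < ub; iv += step)`,
// assuming step > 0 (ceiling division of the range by the step).
int64_t tripCount(int64_t lb, int64_t ub, int64_t step) {
  if (ub <= lb)
    return 0;
  return (ub - lb + step - 1) / step;
}

// Largest value taken by the induction variable, i.e. the right-hand side
// of the bound: lb + (tripCount - 1) * step.
// Only meaningful when tripCount(lb, ub, step) > 0.
int64_t maxInductionValue(int64_t lb, int64_t ub, int64_t step) {
  return lb + (tripCount(lb, ub, step) - 1) * step;
}
```

For `lb = 0`, `ub = 10`, `step = 3`, the loop visits 0, 3, 6, 9, so the bound is 9 rather than the looser `ub - 1`, which only coincides with it when the step divides the range evenly.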
This patch updates the lowering of the `id`-based pmevent so that it
also goes through intrinsics. The mask is simply (1 << event-id).
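
The mask computation can be sketched as follows. `pmeventMask` is a hypothetical helper, and the 0..15 id range with a 16-bit result is an assumption based on the mask form of `pmevent` taking a 16-bit operand:

```cpp
#include <cstdint>

// Mask with only the bit for `eventId` set, i.e. (1 << event-id).
// Assumes eventId is in the range 0..15 so the result fits in 16 bits.
constexpr uint16_t pmeventMask(unsigned eventId) {
  return static_cast<uint16_t>(1u << eventId);
}
```

For example, event id 3 maps to the mask `0x0008`.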

Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This function contains most of the logic for BTI:
- It takes the BasicBlock and the instruction used to jump to it.
- It then checks whether the first non-pseudo instruction is a
sufficient landing pad for the used call.
- If not, it generates the correct BTI instruction.

Also introduce the isCallCoveredByBTI helper to simplify the logic.
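
The check described above can be sketched with a deliberately simplified model. All names below are hypothetical and do not mirror the patch's types or the signature of `isCallCoveredByBTI`. The model ignores implicit landing pads and the special treatment of indirect jumps through x16/x17, and upgrading an existing pad to `BTI jc` is an assumption about what "the correct BTI instruction" means:

```cpp
// Simplified model for illustration only.
enum class BranchKind { IndirectCall, IndirectJump };
enum class LandingPad { None, BTI_C, BTI_J, BTI_JC };

// True if `pad` already accepts a branch of kind `kind`.
bool isCovered(LandingPad pad, BranchKind kind) {
  switch (pad) {
  case LandingPad::BTI_JC:
    return true;
  case LandingPad::BTI_C:
    return kind == BranchKind::IndirectCall;
  case LandingPad::BTI_J:
    return kind == BranchKind::IndirectJump;
  case LandingPad::None:
    return false;
  }
  return false;
}

// Landing pad the block needs so that it accepts `kind` in addition to
// whatever `existing` already accepts.
LandingPad requiredPad(LandingPad existing, BranchKind kind) {
  if (isCovered(existing, kind))
    return existing;
  if (existing == LandingPad::None)
    return kind == BranchKind::IndirectCall ? LandingPad::BTI_C
                                            : LandingPad::BTI_J;
  return LandingPad::BTI_JC;
}
```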
nsz can only change the behavior of the sign bit.
The sign bit for fmul can be implemented as xor,
which is associative. DAGCombiner already reassociates
multiplication by two constants without nsz.

Fixes #64967
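
The sign-bit argument for fmul can be checked with a small sketch. The helper names are hypothetical, and the reasoning only covers non-NaN results, since the sign of a NaN carries no meaning here:

```cpp
#include <cmath>

// Sign bit of a product predicted from the operands' sign bits: the XOR of
// the two, written here as an inequality of booleans.
bool productSign(double a, double b) {
  return std::signbit(a) != std::signbit(b);
}

// True if both association orders of a * b * c give the same sign bit.
// Because XOR is associative, this is expected to hold for signed zeros too.
bool sameSignEitherWay(double a, double b, double c) {
  return std::signbit((a * b) * c) == std::signbit(a * (b * c));
}
```

The rounded magnitude may still differ between the two orders in general. The point is only that the sign bit, which is all nsz can affect, does not depend on the association.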
This patch adds TLS support for SystemZ on top of orc-runtime support.
The orc-runtime support has been split out separately as #171062 from
the earlier TLS support in [#170706](#170706).

See conversations in
[#170706](#170706)

---------

Co-authored-by: anoopkg6 <anoopkg6@github.com>
#171797)

This patch fixes toolchain-msvc.test on Windows ARM64 hosts running
under a native ARM64 environment via vcvarsarm64.bat. Our lab buildbot
recently switched from the cross vcvarsamd64_arm64.bat environment to
the native vcvarsarm64.bat one. This patch updates the FileCheck
patterns to also allow HostARM64 and arm64 PATH entries.

Changes:
-> Extend host regex to match HostARM64 (case-insensitive)
-> Allow arm64 in PATH tail.
-> Apply same fix in both 32-bit and 64-bit sections.
This patch adds `ReduceOp::verifyRegions` to ensure that the number of
reduction regions equals the number of operands (`getReductions().size()
== getOperands().size()`).

Additionally, `ParallelOp::verify` is updated to gracefully handle cases
where the number of reduce operands differs from the initial values,
preventing verification logic crashes and relying on `ReduceOp` to
report structural inconsistencies.

Fixes: #118768
…171616)

If the scalar integer sources are freely transferable to the FPU, then
perform the bitlogic op as a SSE/AVX operation.

Uses the mayFoldIntoVector helper added at #171589