Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pull] main from llvm:main #790

Merged
merged 62 commits into from
Oct 10, 2022
Merged

[pull] main from llvm:main #790

merged 62 commits into from
Oct 10, 2022

Conversation

pull[bot]
Copy link

@pull pull bot commented Oct 10, 2022

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

luxufan and others added 30 commits October 6, 2022 03:01
If the location ptr to be killed is in no loop and the Function does not
have irreducible loops, then we can regard it as loop invariant.

Differential Revision: https://reviews.llvm.org/D135369
The current decomposition for GEPs did not correctly handle cases where
GEPs access different source types. Adjust the constraints by including
the indexed type-size as coefficients.

Further generalization to allow GEPs with more than one index is a
needed general follow-up improvement.
…d(lsl(val1,small-shift), lsl(val2,large-shift)).

Ideally, add operand with smaller shift should be RHS. In that way, smaller-shift is folded into ADD.
- Also add another test case when 'lsl(val1,small-shift)' has one than one use, to show the (planned) optimization won't regress this case.
…val1,small-shmt), lsl(val2,large-shmt))

On many aarch64 processors (Cortex A78, Neoverse N1/N2/V1, etc), ADD with LSL shift (shift-amount <= 4) has smaller latency and higher
throughput than ADD with larger shift (shift-amunt > 4). This is at least no-op for the rest of the processors.

Differential Revision: https://reviews.llvm.org/D135208
Fixed error
```
compiler-rt/lib/tsan/go/buildgo.sh: 62: [: unexpected operator
```

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D135537
Zve32* does not support SEW=64. Or any LMUL smaller than 32/SEW.

Reviewed By: eopXD

Differential Revision: https://reviews.llvm.org/D135519
…gistrar.

Now that ExecutionSession objects alway have ExecutorProcessControl (EPC)
objects attached we can use EPCEHFrameRegistrar by default, rather than
InProcessEHFrameRegistrar. This allows LLJIT to work out-of-the-box with remote
EPCs on platforms that use JITLink, without requiring a custom
ObjectLinkingLayerCreator to override the eh-frame registrar.
Null source/destination pointers are ok for zero-sized messages.
reference 0fbe71e.

Also add testcase for addi.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135538
This patch updates the fir.convert operation's verifier to allow
conversion from !fir.box<!fir.type<T>> to !fir.class<!fir.type<T>>.

Other conversion involving fir.class are likely needed but will
be added when lowering needs them.

Reviewed By: PeteSteinfeld

Differential Revision: https://reviews.llvm.org/D135445
Implements rule ES.75 of C++ Core Guidelines.

Differential Revision: https://reviews.llvm.org/D132461
We can use the new DenseArrayStrictlySorted constraint.

Differential Revision: https://reviews.llvm.org/D135246
Remove the unmaintained Go bindings per
https://discourse.llvm.org/t/rfc-remove-the-go-bindings/65725.

The tinygo project provides maintained Go bindings at
https://github.com/tinygo-org/go-llvm.

Differential Revision: https://reviews.llvm.org/D135436
When determining the initial value of the object, use the constant
folding API to load a given type at a given offset in the global
initializer. This makes it work for cases where the load doesn't
directly correspond to an aggregate member.

Differential Revision: https://reviews.llvm.org/D135435
This fixes the case where callees with SVE arguments outside of the z0-z7
range were incorrectly deduced as SVE calling convention functions
Previously we had a bit of a mix of "signed char" "unsigned char" and
"char".

This adds seperate min and max checks for all three types.

Depends on D135170

Reviewed By: Michael137

Differential Revision: https://reviews.llvm.org/D135352
…with auto return types

**Summary**

The primary motivation for this patch is to make sure we handle
the step-in behaviour for functions in the `std` namespace which
have an `auto` return type. Currently the default `step-avoid-regex`
setting is `^std::` but LLDB will still step into template functions
with `auto` return types in the `std` namespace.

**Details**
When we hit a breakpoint and check whether we should stop, we call
into `ThreadPlanStepInRange::FrameMatchesAvoidCriteria`. We then ask
for the frame function name via `SymbolContext::GetFunctionName(Mangled::ePreferDemangledWithoutArguments)`.
This ends up trying to parse the function name using `CPlusPlusLanguage::MethodName::GetBasename` which
parses the raw demangled name string.

`CPlusPlusNameParser::ParseFunctionImpl` calls `ConsumeTypename` to skip
the (in our case auto) return type of the demangled name (according to the
Itanium ABI this is a valid thing to encode into the mangled name). However,
`ConsumeTypename` doesn't strip out a plain `auto` identifier
(it will strip a `decltype(auto) return type though). So we are now left with
a basename that still has the return type in it, thus failing to match the `^std::`
regex.

Example frame where the return type is still part of the function name:
```
Process 1234 stopped
* thread #1, stop reason = step in
    frame #0: 0x12345678 repro`auto std::test_return_auto<int>() at main.cpp:12:5
   9
   10   template <class>
   11   auto test_return_auto() {
-> 12       return 42;
   13   }
```

This is another case where the `CPlusPlusNameParser` breaks us in subtle ways
due to evolving C++ syntax. There are longer-term plans of replacing the hand-rolled
C++ parser with an alternative that uses the mangle tree API to do the parsing for us.

**Testing**

* Added API and unit-tests
* Adding support for ABI tags into the parser is a larger undertaking
  which we would rather solve properly by using libcxxabi's mangle tree
  parser

Differential Revision: https://reviews.llvm.org/D135413
…VectorInst's mask

This commit fixes #57326.

Currently we would take a Mask out and directly use it by doing
auto Mask = SVI->getShuffleMask();
However, if the mask is undef, this Mask is not initialized. It might be
a vector of -1 or random integers.
This would cause an Out-of-bound read later when trying to find a
StartMask.

This change checks if all indices in the Mask is in the allowed range,
and fixes the out-of-bound accesses.

Differential Revision: https://reviews.llvm.org/D132634
gflegar and others added 27 commits October 10, 2022 16:28
The return type is two u8 packed into a 16 bit VGPR, so this instruction
should be True16.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D135478
This reverts commit 4ea1a64.

This breaks on Darwin which tries to export these symbols
https://github.com/llvm/llvm-project/blob/ebb258d3b0785f6dcc65e1f277d0690891ddc94d/clang/lib/Driver/ToolChains/Darwin.cpp#L1363

I'll try to reland which that removed and approval from
Apple folks.
If the single-thread model is used, or the
-licm-force-thread-model-single flag is specified, skip checks
related to thread-safety. This means that store promotion for
conditionally executed stores only requires proof of
dereferenceability and writability, but not of thread-safety. For
example, this enables promotion of stores to (non-constant) globals,
as well as captured allocas.

Fixes #50537.

Differential Revision: https://reviews.llvm.org/D130466
This fixes a crash with scalable vectors, thanks @nikic for spotting
this!
This is to prepare for adding SparseTensorWriter.

Reviewed By: wrengr

Differential Revision: https://reviews.llvm.org/D135477
…nd declarations

rG04ba1856 introduced a call to FilterLookupForScope that is expensive
for very large translation units where it was observed to cause a 6%
compile time degradation. As the comment states, the only effect is to
"remove declarations found in inline namespaces for friend declarations
with unqualified names." This change limits the call to that scenario.
The test that was added by rG04ba1856 continues to pass, but the
observed degradation is cut in half.

Differential Revision: https://reviews.llvm.org/D135370
Support constexpr version of __builtin_fmin and its variations.

Reviewed By: jcranmer-intel

Differential Revision: https://reviews.llvm.org/D135493
Add a pattern to be able to collapse dimensions in a linalg generic op.

Differential Revision: https://reviews.llvm.org/D135503
When removing loops & blocks we also need to clear the SCEV dispositions
as they may now contain incorrect values.

Fixes #58262.
In https://reviews.llvm.org/D133534 I made a little cleanup
to DynamicLoaderDarwinKernel::CreateInstance and unintentionally
changed the logic.  Previously it would not create an instance
if there was a binary given to lldb and it was not a kernel.
With my change, the absence of any binary would also cause it
to not create.  So connecting to a kernel without any binaries
would fail.

rdar://100985097
Reviewed By: michaelrj, sivachandra

Differential Revision: https://reviews.llvm.org/D134665
…onstant.

If the divisor is even, we can first shift the dividend and divisor
right by the number of trailing zeros. Now the divisor is odd and we
can do the original algorithm to calculate a remainder. Then we shift
that remainder left by the number of trailing zeros and add the bits
that were shifted out of the dividend.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D135541
There is no 6.9 in C++11, the quote actually lives in
[intro.multithread] for that revision. However, the words moved in
C++17 to [intro.progress] so I added that information as well.
…sor/Enums.h

This differential attempts to resolve certain issues on Windows (e.g., https://reviews.llvm.org/D134933#3843372 and https://reviews.llvm.org/D133462#3844195).

Reviewed By: stella.stamenova, aganea

Differential Revision: https://reviews.llvm.org/D135502
sence -> sense

Also re-flowed to the usual 80-col limit and added a comma.
… available_externally"

This reverts commit 4fbe335. It causes linking errors, with details provided internally. (Hopefully the author/reviewers will be able to upstream the internal repro).
Some higher level operations such as torch.max generates linalg generic
that returns both the index and the value of the max operation. However
sometimes not all information is being used. This however blocks
vectorization for certain cases which causes performance degradation.
This patch aims to fix this issue.

Reviewed By: mravishankar

Differential Revision: https://reviews.llvm.org/D135388
Update DWARF Extensions For Heterogeneous Debugging proposal for the
DW_OP_LLVM_overlay operation:

1. Add an example.
2. Correct typo in definition of rbss.
3. Correct definition to specify both operands of the
   DW_OP_bit_piece operations.

Reviewed By: zoran.zaric

Differential Revision: https://reviews.llvm.org/D135394
insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index --> shuffle DestVec, (fneg SrcVec), Mask

This is a specialized form of what could be a more general fold for a binop.
It's also possible that fneg is overlooked by SLP in this kind of
insert/extract pattern since it's a unary op.

This shows up in the motivating example from #issue 58139, but it won't solve
it (that probably requires some x86-specific backend changes). There are also
some small enhancements (see TODO comments) that can be done as follow-up
patches.

Differential Revision: https://reviews.llvm.org/D135278
@pull pull bot added the ⤵️ pull label Oct 10, 2022
@pull pull bot merged commit baab4aa into MaxMood96:main Oct 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.