Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/merge upstream 20210511 #54

Merged
merged 650 commits into from
Jul 14, 2021
Merged

Conversation

kaz7
Copy link
Collaborator

@kaz7 kaz7 commented Jul 14, 2021

Merge up to 2021/5/11, d7086af. There are several conflicts. I solved them except f2f88f3 which abandons out-of-tree libomptarget compilation. This patch doesn't work with our libomptarget for VE because ours requires out-of-tree compilation. I've reverted f2f88f3 at the moment. This patch should be re-applied again when libomptarget for VE adapts f2f88f3.

Pass regression tests.

jhuber6 and others added 30 commits May 7, 2021 10:27
Summary:
The allocator interface added in D97883 allows the RTL to allocate shared and
host-pinned memory from the cuda plugin. This patch adds support for these to
the runtime.

Reviewed By: grokos

Differential Revision: https://reviews.llvm.org/D102000
Pointers escape when converted to integers, so a pointer produced by
converting an integer to a pointer must not be a local non-escaping
object.

Reviewed By: nikic, nlopes, aqjune

Differential Revision: https://reviews.llvm.org/D101541
Address sanitizer can detect stack exhaustion via its SEGV handler, which is
executed on a separate stack using the sigaltstack mechanism. When libFuzzer is
used with address sanitizer, it installs its own signal handlers which defer to
those put in place by the sanitizer before performing additional actions. In the
particular case of a stack overflow, the current setup fails because libFuzzer
doesn't preserve the flag for executing the signal handler on a separate stack:
when we run out of stack space, the operating system can't run the SEGV handler,
so address sanitizer never reports the issue. See the included test for an
example.

This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag
when installing its signal handlers; the dedicated signal-handler stack set up
by the sanitizer runtime appears to be large enough to support the additional
frames from the fuzzer.

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D101824
…les (PR50261)

Sometimes disassembler picks _REV variants of instructions
over the plain ones, which in this case exposed an issue
that the _REV variants aren't being modelled as optimizable moves.
Ensure we don't try to fold when one might be an opaque constant - the constant fold will fail and then the reverse fold will happen in DAGCombine.....
induction variable to be perfect

This patch allow more conditional branches to be considered as loop
guard, and so more loop nests can be considered perfect.

Reviewed By: bmahjour, sidbav

Differential Revision: https://reviews.llvm.org/D94717
Fix function return type and remove check for SUMMARY, since it doesn't
seem to be output in Windows.
Similar to X86 D73230 & 46788a2

With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode,
for default visibility external linkage non-ifunc-non-COMDAT definitions.

For such dso_local definitions, variable access/taking the address of a
function/calling a function will go through a local alias to avoid GOT/PLT.

Note: the 'S' inline assembly constraint refers to an absolute symbolic address
or a label reference (D46745).

Differential Revision: https://reviews.llvm.org/D101872
It measures as such, and the reference docs agree.

I can't easily add a MCA test, because there's no mnemonic for it,
it can only be disassembled or created as a MCInst.
Revert the 32-process cap on Windows.  When testing with Swift, we found
that there was a time reduction for testing with the higher load.  This
should hopefully not matter much in practice.  In the case that the
original problem with python remains with a high subprocess count, we
can easily revert this change.
Jobs that test with a more recent standard version run more tests, so
they take longer. We'll decrease the average latency by running them
first instead of last.
This will allow writing
  propagateMetadata(Inst, collectInterestingValues(...))
without concern about empty lists. In case of an empty list,
Inst is returned without any changes.
To improve hygiene, consistency, and usability, it would be good to replace all
the macro intrinsics in wasm_simd128.h with functions. The reason for using
macros in the first place was to enforce the use of constants for some arguments
using `_Static_assert` with `__builtin_constant_p`. This commit switches to
using functions and uses the `__diagnose_if__` attribute rather than
`_Static_assert` to enforce constantness.

The remaining macro intrinsics cannot be made into functions until the builtin
functions they are implemented with can be replaced with normal code patterns
because the builtin functions themselves require that their arguments are
constants.

This commit also fixes a bug with the const_splat intrinsics in which the f32x4
and f64x2 variants were incorrectly producing integer vectors.

Differential Revision: https://reviews.llvm.org/D102018
I think currently isImpliedViaMerge can incorrectly return true for phis
in a loop/cycle, if the found condition involves the previous value of

Consider the case in exit_cond_depends_on_inner_loop.

At some point, we call (modulo simplifications)
isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1).

The existing code tries to prove IncV <= -1 for all incoming values
InvV using the found condition (%call <= -1). At the moment this succeeds,
but only because it does not compare the same runtime value. The found
condition checks the value of the last iteration, but the incoming value
is from the *previous* iteration.

Hence we incorrectly determine that the *previous* value was <= -1,
which may not be true.

I think we need to be more careful when looking at the incoming values
here. In particular, we need to rule out that a found condition refers to
any value that may refer to one of the previous iterations. I'm not sure
there's a reliable way to do so (that also works of irreducible control
flow).

So for now this patch adds an additional requirement that the incoming
value must properly dominate the phi block. This should ensure the
values do not change in a cycle. I am not entirely sure if will catch
all cases and I appreciate a through second look in that regard.

Alternatively we could also unconditionally bail out in this case,
instead of checking the incoming values

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D101829
b614ada ("[mlir] add support for index type in vectors.") removed
this limitation.

Differential Revision: https://reviews.llvm.org/D102081
I want to start using LLVM component libraries in libomptarget
to stop duplicating implementations already available in LLVM
(e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM
in all libomptarget builds one has to provide fallback implementation
for each used LLVM feature.

This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget.

I understand that I may need to revert this,
if this affects downstream projects in a bad way.

Differential Revision: https://reviews.llvm.org/D101509
We have vector operations on double vector and float scalar. For
example, vfwadd.wf is such a instruction.

vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2);

We should specify F and D extensions for it.

Differential Revision: https://reviews.llvm.org/D102051
This addresses an issue introduced in D91559. We would invoke the
compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both
locations contain libraries with the same name, but we expect linker
to pick up the library in path/to/lib since that version is more
specialized. This was the case before D91559 where the sysroot path
would be ignored, but after that change linker would now pick up the
library from the sysroot which resulted in unexpected behavior.

The sysroot path should always come after any user provided library
paths, followed by compiler runtime paths. We want for libraries in user
provided library paths to always take precedence over sysroot libraries.
This matches the behavior of other toolchains used with other targets.

Differential Revision: https://reviews.llvm.org/D102049
Implement the reduction transformational intrinsic function NORM2 in
the runtime, using infrastructure already in place for MAXVAL & al.

Differential Revision: https://reviews.llvm.org/D102024
Adjust the name to make it clearer this is the region containing the
target recipe, similar to SinkRegion below.

Suggested post-commit for ccebf7a.
Tobias Gysi and others added 27 commits May 11, 2021 05:53
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102176
This is to prevent server from being DOS'd by possible malicious
parties issuing requests that can yield huge responses.

One possible drawback is on rename workflow. As it really requests all
occurences, but it has an internal limit on 50 files currently.
We are putting the limit on 10000 elements per response So for rename to regress
one should have 10k refs to a symbol in less than 50 files. This seems unlikely
and we fix it if there are complaints by giving up on the response based on the
number of files covered instead.

Differential Revision: https://reviews.llvm.org/D101914
Currently client was setting the HasMore to true iff stream said so.
Hence if we had a broken stream for whatever reason (e.g. hitting deadline for a
huge response), HasMore would be false, which is semantically incorrect (e.g.
will throw rename off).

Differential Revision: https://reviews.llvm.org/D101915
…operator

* `operator!=` isn't in the spec
* `<compare>` is designed to work with `operator<=>` so it doesn't
  really make sense to have `operator<=>`-less friendly sections.

Depends on D100283.

Differential Revision: https://reviews.llvm.org/D100342
…n friends

The standard leaves it up to the implementation to decide whether or not
these operators are hidden friends. There are several (well-documented)
reasons to prefer hidden friends, as well as an argument for improved
readability.

Depends on D100342.

Differential Revision: https://reviews.llvm.org/D101707
C++17 deprecates `std::raw_storage_iterator` and C++20 removes it.

Implements part of:
  * P0174R2 'Deprecating Vestigial Library Parts in C++17'
  * P0619R4 'Reviewing Deprecated Facilities of C++17 for C++20'

Differential Revision: https://reviews.llvm.org/D101730
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102174
after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102187
There are cases where a concrete DIE with DW_TAG_subprogram can have
abstract_origin attribute, so we handle that situation as well.

Differential Revision: https://reviews.llvm.org/D101025
…regular mode test

Turn this test into a normal mode as it contains well-formed code and
checks for defined behavior. It still can be run in debug mode as of D100866.

Differential Revision: https://reviews.llvm.org/D102192
VectorTransfer split previously only split read xfer ops. This adds
the same logic to write ops. The resulting code involves 2
conditionals for write ops while read ops only needed 1, but the created
ops are built upon the same patterns, so pattern matching/expectations
are all consistent other than in regards to the if/else ops.

Differential Revision: https://reviews.llvm.org/D102157
This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D96883
With this patch, `FLANG_BUILD_NEW_DRIVER` is set to `On` by default
(i.e. the new driver is enabled). Note that the new driver depends on
Clang and hence with this change you will need to add `clang` to
`LLVM_ENABLE_PROJECTS`.

If you don't want to build the new driver, set `FLANG_BUILD_NEW_DRIVER`
to `Off`. This way you won't be required to include `clang` in
`LLVM_ENABLE_PROJECTS`.

Differential Revision: https://reviews.llvm.org/D101842
This patch adds support for WebAssembly globals in LLVM IR, representing
them as pointers to global values, in a non-default, non-integral
address space.  Instruction selection legalizes loads and stores to
these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET.
Once the lowering creates the new nodes, tablegen pattern matches those
and converts them to Wasm global.get/set of the appropriate type.

Based on work by Paulo Matos in https://reviews.llvm.org/D95425.

Reviewed By: pmatos

Differential Revision: https://reviews.llvm.org/D101608
The implementation of subword atomics does not actually
guarantee the result is zero-extended, which now caused
failures after https://reviews.llvm.org/D101342 was landed.
Or at least the sibling call cases which the DAG already handles.
This extends any frame record created in the function to include that
parameter, passed in X22.

The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001
in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect
of this is that tools walking the stack should expect to see one of three
values there:

  * 0b0000 => a normal, non-extended record with just [FP, LR]
  * 0b0001 => the extended record [X22, FP, LR]
  * 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the
discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing
front-ends access to this slot (and forcing its creation initialized to nullptr
if necessary).
Or at least the sibling call cases which the DAG already handles.
This reverts commit f2f88f3.
Because our libomptarget implementation for VE requries out-of-tree
builds at the moemnt.  TODO: need to apply this patch with
updates of libomptarget for VE.
@kaz7 kaz7 merged commit 6dc0edd into develop Jul 14, 2021
@kaz7 kaz7 deleted the feature/merge-upstream-20210511 branch July 14, 2021 09:45
kaz7 pushed a commit that referenced this pull request May 22, 2023
…callback

The `TypeSystemMap::m_mutex` guards against concurrent modifications
of members of `TypeSystemMap`. In particular, `m_map`.

`TypeSystemMap::ForEach` iterates through the entire `m_map` calling
a user-specified callback for each entry. This is all done while
`m_mutex` is locked. However, there's nothing that guarantees that
the callback itself won't call back into `TypeSystemMap` APIs on the
same thread. This lead to double-locking `m_mutex`, which is undefined
behaviour. We've seen this cause a deadlock in the swift plugin with
following backtrace:

```

int main() {
    std::unique_ptr<int> up = std::make_unique<int>(5);

    volatile int val = *up;
    return val;
}

clang++ -std=c++2a -g -O1 main.cpp

./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b
```

```
frame #4: std::lock_guard<std::mutex>::lock_guard
frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame #26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame #27: lldb_private::SwiftASTContext::LoadModule
frame #30: swift::ModuleDecl::collectLinkLibraries
frame #31: lldb_private::SwiftASTContext::LoadModule
frame #34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame #35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame #36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame #37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame #38: lldb_private::Target::GetPersistentSymbol
frame #41: lldb_private::TypeSystemMap::ForEach                 <<<< Lock #1
frame #42: lldb_private::Target::GetPersistentSymbol
frame #43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame #44: lldb_private::IRExecutionUnit::FindSymbol
frame #45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame #46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #48: llvm::LinkingSymbolResolver::findSymbol
frame #49: llvm::LegacyJITSymbolResolver::lookup
frame #50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame #51: llvm::RuntimeDyldImpl::resolveRelocations
frame #52: llvm::MCJIT::finalizeLoadedModules
frame #53: llvm::MCJIT::finalizeObject
frame #54: lldb_private::IRExecutionUnit::ReportAllocations
frame #55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame #56: lldb_private::ClangExpressionParser::PrepareForExecution
frame #57: lldb_private::ClangUserExpression::TryParse
frame #58: lldb_private::ClangUserExpression::Parse
```

Our solution is to simply iterate over a local copy of `m_map`.

**Testing**

* Confirmed on manual reproducer (would reproduce 100% of the time
  before the patch)

Differential Revision: https://reviews.llvm.org/D149949
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.