Feature/merge upstream 20210511 #54

kaz7 · 2021-07-14T01:28:14Z

Merge upstream up to 2021/5/11 (d7086af). There were several conflicts. I resolved all of them except the one caused by f2f88f3, which abandons out-of-tree libomptarget compilation. That patch doesn't work with our libomptarget for VE because ours requires out-of-tree compilation, so I've reverted f2f88f3 for now. It should be re-applied once libomptarget for VE adapts to f2f88f3.

The regression tests pass.

Summary: The allocator interface added in D97883 allows the RTL to allocate shared and host-pinned memory from the cuda plugin. This patch adds support for these to the runtime. Reviewed By: grokos Differential Revision: https://reviews.llvm.org/D102000

Address sanitizer can detect stack exhaustion via its SEGV handler, which is executed on a separate stack using the sigaltstack mechanism. When libFuzzer is used with address sanitizer, it installs its own signal handlers which defer to those put in place by the sanitizer before performing additional actions. In the particular case of a stack overflow, the current setup fails because libFuzzer doesn't preserve the flag for executing the signal handler on a separate stack: when we run out of stack space, the operating system can't run the SEGV handler, so address sanitizer never reports the issue. See the included test for an example. This commit fixes the issue by making libFuzzer preserve the SA_ONSTACK flag when installing its signal handlers; the dedicated signal-handler stack set up by the sanitizer runtime appears to be large enough to support the additional frames from the fuzzer. Reviewed By: morehouse Differential Revision: https://reviews.llvm.org/D101824
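The flag-preservation idea described above can be sketched roughly as follows. This is a minimal illustration, not libFuzzer's actual code: the helper name `InstallHandlerPreservingOnStack` is hypothetical, and the sketch assumes a POSIX system where `sigaction` and `SA_ONSTACK` are available.

```cpp
#include <signal.h>

#include <cstring>

// Hypothetical helper (not part of libFuzzer): installs `handler` for `signum`
// while keeping the SA_ONSTACK bit of whatever action was registered before.
// Returns 0 on success and -1 on failure, mirroring sigaction().
int InstallHandlerPreservingOnStack(int signum,
                                    void (*handler)(int, siginfo_t *, void *)) {
  // Query the currently installed action without changing it.
  struct sigaction old_action;
  if (sigaction(signum, nullptr, &old_action) != 0)
    return -1;

  struct sigaction new_action;
  std::memset(&new_action, 0, sizeof(new_action));
  sigemptyset(&new_action.sa_mask);
  new_action.sa_sigaction = handler;
  // Carry over SA_ONSTACK so the new handler still runs on the alternate
  // stack if the previous one (e.g. a sanitizer's) asked for that.
  new_action.sa_flags = SA_SIGINFO | (old_action.sa_flags & SA_ONSTACK);
  return sigaction(signum, &new_action, nullptr);
}
```

Without the carried-over flag, a handler installed over a sanitizer's SEGV handler would be invoked on the already exhausted thread stack, which is the failure mode the commit message describes.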

induction variable to be perfect This patch allows more conditional branches to be considered as loop guards, so more loop nests can be considered perfect. Reviewed By: bmahjour, sidbav Differential Revision: https://reviews.llvm.org/D94717

Similar to X86 D73230 & 46788a2. With this change, we can set dso_local in clang's -fpic -fno-semantic-interposition mode for default-visibility, external-linkage, non-ifunc, non-COMDAT definitions. For such dso_local definitions, accessing a variable, taking the address of a function, or calling a function will go through a local alias to avoid the GOT/PLT. Note: the 'S' inline assembly constraint refers to an absolute symbolic address or a label reference (D46745). Differential Revision: https://reviews.llvm.org/D101872

Revert the 32-process cap on Windows. When testing with Swift, we found that there was a time reduction for testing with the higher load. This should hopefully not matter much in practice. In the case that the original problem with python remains with a high subprocess count, we can easily revert this change.

To improve hygiene, consistency, and usability, it would be good to replace all the macro intrinsics in wasm_simd128.h with functions. The reason for using macros in the first place was to enforce the use of constants for some arguments using `_Static_assert` with `__builtin_constant_p`. This commit switches to using functions and uses the `__diagnose_if__` attribute rather than `_Static_assert` to enforce constantness. The remaining macro intrinsics cannot be made into functions until the builtin functions they are implemented with can be replaced with normal code patterns because the builtin functions themselves require that their arguments are constants. This commit also fixes a bug with the const_splat intrinsics in which the f32x4 and f64x2 variants were incorrectly producing integer vectors. Differential Revision: https://reviews.llvm.org/D102018

I think currently isImpliedViaMerge can incorrectly return true for phis in a loop/cycle, if the found condition involves the previous value of Consider the case in exit_cond_depends_on_inner_loop. At some point, we call (modulo simplifications) isImpliedViaMerge(<=, %x.lcssa, -1, %call, -1). The existing code tries to prove IncV <= -1 for all incoming values IncV using the found condition (%call <= -1). At the moment this succeeds, but only because it does not compare the same runtime value. The found condition checks the value of the last iteration, but the incoming value is from the *previous* iteration. Hence we incorrectly determine that the *previous* value was <= -1, which may not be true. I think we need to be more careful when looking at the incoming values here. In particular, we need to rule out that a found condition refers to any value that may refer to one of the previous iterations. I'm not sure there's a reliable way to do so (that also works for irreducible control flow). So for now this patch adds an additional requirement that the incoming value must properly dominate the phi block. This should ensure the values do not change in a cycle. I am not entirely sure if this will catch all cases and I appreciate a thorough second look in that regard. Alternatively we could also unconditionally bail out in this case, instead of checking the incoming values. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D101829

I want to start using LLVM component libraries in libomptarget to stop duplicating implementations already available in LLVM (e.g. LLVMObject, LLVMSupport, etc.). Without relying on LLVM in all libomptarget builds one has to provide fallback implementation for each used LLVM feature. This is an attempt to stop supporting out-of-llvm-tree builds of libomptarget. I understand that I may need to revert this, if this affects downstream projects in a bad way. Differential Revision: https://reviews.llvm.org/D101509

We have vector operations that take a double vector and a float scalar. For example, vfwadd.wf is such an instruction: `vfloat64m1_t vfwadd_wf(vfloat64m1_t op0, float op1, size_t op2);` We should specify the F and D extensions for it. Differential Revision: https://reviews.llvm.org/D102051

This addresses an issue introduced in D91559. We would invoke the compiler with -Lpath/to/lib --sysroot=path/to/sysroot where both locations contain libraries with the same name, but we expect the linker to pick up the library in path/to/lib since that version is more specialized. This was the case before D91559, where the sysroot path would be ignored, but after that change the linker would pick up the library from the sysroot, which resulted in unexpected behavior. The sysroot path should always come after any user-provided library paths, followed by compiler runtime paths. We want libraries in user-provided library paths to always take precedence over sysroot libraries. This matches the behavior of other toolchains used with other targets. Differential Revision: https://reviews.llvm.org/D102049

This is to prevent the server from being DoS'd by possibly malicious parties issuing requests that can yield huge responses. One possible drawback is on the rename workflow, as it really requests all occurrences, but it currently has an internal limit of 50 files. We are putting the limit at 10000 elements per response, so for rename to regress one would need 10k refs to a symbol in fewer than 50 files. This seems unlikely, and if there are complaints we can fix it by giving up on the response based on the number of files covered instead. Differential Revision: https://reviews.llvm.org/D101914

Currently, the client sets HasMore to true iff the stream says so. Hence, if we had a broken stream for whatever reason (e.g. hitting the deadline for a huge response), HasMore would be false, which is semantically incorrect (e.g. it will throw rename off). Differential Revision: https://reviews.llvm.org/D101915
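The two ideas above, capping the number of elements per response and treating a broken stream as "there may be more", can be sketched as follows. This is a simplified illustration only: `RefsResult`, `CollectWithLimit`, and `ClientHasMore` are hypothetical names, not clangd's actual API, and the real server streams results rather than filling a vector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type used only for this sketch.
struct RefsResult {
  std::vector<std::string> refs;
  bool has_more = false;
};

// Server side: stop after `limit` elements and flag the truncation so the
// caller knows the answer is incomplete.
RefsResult CollectWithLimit(const std::vector<std::string> &all,
                            std::size_t limit) {
  RefsResult result;
  for (const std::string &ref : all) {
    if (result.refs.size() == limit) {
      result.has_more = true;
      break;
    }
    result.refs.push_back(ref);
  }
  return result;
}

// Client side: a stream that did not finish cleanly may have dropped
// elements, so report "more" even if the server never said so.
bool ClientHasMore(bool server_said_more, bool stream_finished_ok) {
  return server_said_more || !stream_finished_ok;
}
```

With this shape, a consumer such as rename can refuse to proceed whenever the flag is set, regardless of whether the truncation came from the limit or from a failed stream.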

…operator

* `operator!=` isn't in the spec
* `<compare>` is designed to work with `operator<=>` so it doesn't really make sense to have `operator<=>`-less friendly sections.

Depends on D100283.

Differential Revision: https://reviews.llvm.org/D100342

…n friends The standard leaves it up to the implementation to decide whether or not these operators are hidden friends. There are several (well-documented) reasons to prefer hidden friends, as well as an argument for improved readability. Depends on D100342. Differential Revision: https://reviews.llvm.org/D101707

C++17 deprecates `std::raw_storage_iterator` and C++20 removes it.

Implements part of:

* P0174R2 'Deprecating Vestigial Library Parts in C++17'
* P0619R4 'Reviewing Deprecated Facilities of C++17 for C++20'

Differential Revision: https://reviews.llvm.org/D101730

…regular mode test Turn this test into a normal mode as it contains well-formed code and checks for defined behavior. It still can be run in debug mode as of D100866. Differential Revision: https://reviews.llvm.org/D102192

VectorTransfer split previously only split read xfer ops. This adds the same logic to write ops. The resulting code involves 2 conditionals for write ops while read ops only needed 1, but the created ops are built upon the same patterns, so pattern matching/expectations are all consistent other than in regards to the if/else ops. Differential Revision: https://reviews.llvm.org/D102157

With this patch, `FLANG_BUILD_NEW_DRIVER` is set to `On` by default (i.e. the new driver is enabled). Note that the new driver depends on Clang and hence with this change you will need to add `clang` to `LLVM_ENABLE_PROJECTS`. If you don't want to build the new driver, set `FLANG_BUILD_NEW_DRIVER` to `Off`. This way you won't be required to include `clang` in `LLVM_ENABLE_PROJECTS`. Differential Revision: https://reviews.llvm.org/D101842

This patch adds support for WebAssembly globals in LLVM IR, representing them as pointers to global values, in a non-default, non-integral address space. Instruction selection legalizes loads and stores to these pointers to new WebAssemblyISD nodes GLOBAL_GET and GLOBAL_SET. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set of the appropriate type. Based on work by Paulo Matos in https://reviews.llvm.org/D95425. Reviewed By: pmatos Differential Revision: https://reviews.llvm.org/D101608

This extends any frame record created in the function to include that parameter, passed in X22. The new record looks like [X22, FP, LR] in memory, and FP is stored with 0b0001 in bits 63:60 (CodeGen assumes they are 0b0000 in normal operation). The effect of this is that tools walking the stack should expect to see one of three values there:

* 0b0000 => a normal, non-extended record with just [FP, LR]
* 0b0001 => the extended record [X22, FP, LR]
* 0b1111 => kernel space, and a non-extended record.

All other values are currently reserved.

If compiling for arm64e this context pointer is address-discriminated with the discriminator 0xc31a and the DB (process-specific) key.

There is also an "i8** @llvm.swift.async.context.addr()" intrinsic providing front-ends access to this slot (and forcing its creation initialized to nullptr if necessary).
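For a tool that walks the stack, the tag check described above might look like the sketch below. The enum and function names are hypothetical and chosen for illustration; only the bit positions (63:60) and the three tag values come from the description.

```cpp
#include <cstdint>

// Hypothetical classification of a frame record, based on bits 63:60 of the
// saved FP value.
enum class FrameRecordKind { Normal, Extended, Kernel, Reserved };

// Inspects the top four bits of the saved frame pointer.
inline FrameRecordKind ClassifyFrameRecord(std::uint64_t saved_fp) {
  switch (saved_fp >> 60) {
  case 0x0:
    return FrameRecordKind::Normal;   // [FP, LR]
  case 0x1:
    return FrameRecordKind::Extended; // [X22, FP, LR]
  case 0xF:
    return FrameRecordKind::Kernel;   // kernel space, non-extended record
  default:
    return FrameRecordKind::Reserved; // all other values are reserved
  }
}
```

A stack walker would presumably use the result to decide whether an extra slot precedes the [FP, LR] pair; how it then reads that slot is outside what the description specifies.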

…callback

The `TypeSystemMap::m_mutex` guards against concurrent modifications of members of `TypeSystemMap`, in particular `m_map`.

`TypeSystemMap::ForEach` iterates through the entire `m_map`, calling a user-specified callback for each entry. This is all done while `m_mutex` is locked. However, there's nothing that guarantees that the callback itself won't call back into `TypeSystemMap` APIs on the same thread. This led to double-locking `m_mutex`, which is undefined behaviour. We've seen this cause a deadlock in the swift plugin with the following backtrace:

```
#include <memory>

int main() {
    std::unique_ptr<int> up = std::make_unique<int>(5);
    volatile int val = *up;
    return val;
}

clang++ -std=c++2a -g -O1 main.cpp

./bin/lldb -o "br se -p return" -o run -o "v *up" -o "expr *up" -b
```

```
frame #4: std::lock_guard<std::mutex>::lock_guard
frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame #26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame #27: lldb_private::SwiftASTContext::LoadModule
frame #30: swift::ModuleDecl::collectLinkLibraries
frame #31: lldb_private::SwiftASTContext::LoadModule
frame #34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame #35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame #36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame #37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame #38: lldb_private::Target::GetPersistentSymbol
frame #41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1
frame #42: lldb_private::Target::GetPersistentSymbol
frame #43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame #44: lldb_private::IRExecutionUnit::FindSymbol
frame #45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame #46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #48: llvm::LinkingSymbolResolver::findSymbol
frame #49: llvm::LegacyJITSymbolResolver::lookup
frame #50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame #51: llvm::RuntimeDyldImpl::resolveRelocations
frame #52: llvm::MCJIT::finalizeLoadedModules
frame #53: llvm::MCJIT::finalizeObject
frame #54: lldb_private::IRExecutionUnit::ReportAllocations
frame #55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame #56: lldb_private::ClangExpressionParser::PrepareForExecution
frame #57: lldb_private::ClangUserExpression::TryParse
frame #58: lldb_private::ClangUserExpression::Parse
```

Our solution is to simply iterate over a local copy of `m_map`.

**Testing**

* Confirmed on manual reproducer (would reproduce 100% of the time before the patch)

Differential Revision: https://reviews.llvm.org/D149949
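The "iterate over a local copy" approach can be illustrated with a small stand-alone class. This is a sketch under assumptions, not LLDB's code: `GuardedMap` and its members are hypothetical stand-ins for `TypeSystemMap`, with `int` keys and `std::string` values chosen only to keep the example short.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a mutex-guarded map; not LLDB's TypeSystemMap.
class GuardedMap {
public:
  void Insert(int key, std::string value) {
    std::lock_guard<std::mutex> guard(m_mutex);
    m_map[key] = std::move(value);
  }

  std::size_t Size() {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_map.size();
  }

  // Copies the map while holding the lock, then releases the lock before
  // running any callback. A callback may therefore call Insert() or Size()
  // on the same thread without locking m_mutex a second time.
  void ForEach(const std::function<bool(int, const std::string &)> &callback) {
    std::map<int, std::string> snapshot;
    {
      std::lock_guard<std::mutex> guard(m_mutex);
      snapshot = m_map;
    }
    for (const auto &entry : snapshot) {
      if (!callback(entry.first, entry.second))
        break; // the callback asked to stop iterating
    }
  }

private:
  std::mutex m_mutex;
  std::map<int, std::string> m_map;
};
```

The trade-off of this design is that callbacks see a snapshot: entries inserted during iteration are not visited in that pass, and the copy costs time proportional to the map size.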

jhuber6 and others added 30 commits May 7, 2021 10:27

[AArch64] add test for missed vectorization; NFC

0a6f11a

This is a reduction of the example in: https://llvm.org/PR50256

BasicAA: Recognize inttoptr as isEscapeSource

bc302bf

Pointers escape when converted to integers, so a pointer produced by converting an integer to a pointer must not be a local non-escaping object. Reviewed By: nikic, nlopes, aqjune Differential Revision: https://reviews.llvm.org/D101541

[mlir][spirv] add support lowering of extract_slice to scalar type

565ee6a

Differential Revision: https://reviews.llvm.org/D102041

[mlir][vector] add pattern to cast away leading unit dim for elementw…

a970e69

…ise op Differential Revision: https://reviews.llvm.org/D102034

[NFC][X86][MCA] AMD Zen3: add test for zero-cycle X87 move

a8e30e6

[X86] AMD Zen 3: _REV variants of zero-cycles moves are also zero-cyc…

2819009

…les (PR50261) Sometimes the disassembler picks _REV variants of instructions over the plain ones, which in this case exposed an issue: the _REV variants aren't being modelled as optimizable moves.

[X86] combineXor - limit fold to non-opaque constants (PR50254)

f744723

Ensure we don't try to fold when one might be an opaque constant - the constant fold will fail and then the reverse fold will happen in DAGCombine.....

[libFuzzer] Fix stack-overflow-with-asan.test.

f094144

Fix function return type and remove check for SUMMARY, since it doesn't seem to be output in Windows.

[X86] AMD Zen 3: MOVSX32rr32 is a zero-cycle move

5b1610a

It measures as such, and the reference docs agree. I can't easily add an MCA test, because there's no mnemonic for it; it can only be disassembled or created as an MCInst.

[X86] AMD Zen 3: mark XMM/YMM (but not MMX!) reg moves as eliminatibl…

b8701dc

…e in RegisterFile

[libc++][ci] Run longer CI jobs first

8002c5d

Jobs that test with a more recent standard version run more tests, so they take longer. We'll decrease the average latency by running them first instead of last.

Internalize some cl::opt global variables or move them under namespac…

d8aba75

…e llvm

Allow empty value list in propagateMetadata(Inst, ArrayOf...)

50cf0a1

This will allow writing propagateMetadata(Inst, collectInterestingValues(...)) without concern about empty lists. In case of an empty list, Inst is returned without any changes.

[unittest] Fix -Wunused-variable after D94717

7246049

Revert "[DebugInfo] Fix updateDbgUsersToReg to support DBG_VALUE_LIST"

7ca26c5

This reverts commit 0791f96. Causing crashes: https://crbug.com/1206764

[mlir][docs] remove stale statement about index type in vectors

21db1e3

b614ada ("[mlir] add support for index type in vectors.") removed this limitation. Differential Revision: https://reviews.llvm.org/D102081

[mlir] Add a pattern to bufferize linalg.tensor_reshape.

a3f22d0

Differential Revision: https://reviews.llvm.org/D102089

[mlir] Add a pattern to bufferize std.index_cast.

3444996

Differential Revision: https://reviews.llvm.org/D102088

[flang] Implement NORM2 in the runtime

01c78a0

Implement the reduction transformational intrinsic function NORM2 in the runtime, using infrastructure already in place for MAXVAL & al. Differential Revision: https://reviews.llvm.org/D102024

[LV] Rename Region to TargetRegion, similar to SinkRegion (NFC).

01c26d4

Adjust the name to make it clearer this is the region containing the target recipe, similar to SinkRegion below. Suggested post-commit for ccebf7a.

Tobias Gysi and others added 27 commits May 11, 2021 05:53

[mlir][linalg] Remove IndexedGenericOp support from Tiling...

d69bccf

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612). Differential Revision: https://reviews.llvm.org/D102176

[mlir][linalg] Remove IndexedGenericOp support from Fusion...

6676e09

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612). Differential Revision: https://reviews.llvm.org/D102174

[mlir][linalg] Remove IndexedGenericOp support from LinalgToLoops...

7bc6df2

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612). Differential Revision: https://reviews.llvm.org/D102187

[llvm-dwarfdump] Fix abstract origin vars location stats calculation

1ed2963

There are cases where a concrete DIE with DW_TAG_subprogram can have abstract_origin attribute, so we handle that situation as well. Differential Revision: https://reviews.llvm.org/D101025

[OpenCL] [NFC] Fixed underline being too short in rst

7d20f70

Fix -Wdocumentation warnings. NFCI.

3339940

* Add support for JSON output style to llvm-symbolizer

05d1ae4

This patch adds JSON output style to llvm-symbolizer to better support CLI automation by providing a machine readable output. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D96883

Revert "[VE] Comment out function using VPLegalization"

5b68d9e

This reverts commit 3b6f030.

Merge commit '1db4dbb' into merge/expand-vp

ffdd8a6

[VP] vp.reduce expansion

b5c7601

Merge commit 'd7086af2143d58a6535e0837c4d8789c69c6985f' into feature/…

8a600d2

…merge-upstream-20210511

[VE] Set getExtendForAtomicOps to ISD::ANY_EXTEND

882bdbf

The implementation of subword atomics does not actually guarantee the result is zero-extended, which caused failures after https://reviews.llvm.org/D101342 landed.

AMDGPU/GlobalISel: Implement tail calls

113a664

Or at least the sibling call cases which the DAG already handles.

Revert "[NFC] Use ArgListEntry indirect types more in ISel lowering"

757a9ce

This reverts commit 85af8a8.

AMDGPU/GlobalISel: Implement tail calls

6351eb0

Or at least the sibling call cases which the DAG already handles.

Revert "[TargetLowering] Only inspect attributes in the arguments for…

1f06f50

… ArgListEntry" This reverts commit 16748bd. Causes https://crbug.com/1209013

Revert "An attempt to abandon omptarget out-of-tree builds."

af9c013

This reverts commit f2f88f3 because our libomptarget implementation for VE requires out-of-tree builds at the moment. TODO: re-apply this patch once libomptarget for VE has been updated.

kaz7 merged commit 6dc0edd into develop Jul 14, 2021

kaz7 deleted the feature/merge-upstream-20210511 branch July 14, 2021 09:45
