Feature/merge upstream 20210417 #50

kaz7 · 2021-07-10T03:19:02Z

Fix #40 and #47.
Also merge upstream changes up to 2021/4/17.

I'll post the regression test results soon.

Have funcattrs expand all implied attributes into the IR.

This expands the infrastructure from D100400, but for definitions, not declarations, this time.

Somewhat subtly, this mostly isn't semantic. Because the accessors did the inference, any client which used the accessor was already getting the stronger result. Clients that directly checked for the presence of attributes (there are some) will see a stronger result now.

The old behavior can end up quite confusing for two reasons:
* Without this change, we have situations where function-attrs appears to fail when inferring an attribute (as seen by a human reading IR), but consuming code will see that it should have been implied. As a human trying to sanity check test results and study IR for optimization possibilities, this is exceedingly error-prone and confusing. (I'll note that I wasted several hours recently because of this.)
* We can have transforms which trigger without the IR appearing (on inspection) to meet the preconditions. This change doesn't prevent this from happening (as the accessors still involve multiple checks), but it should make it less frequent.

I'd argue in favor of deleting the extra checks out of the accessors after this lands, but I want that in its own review as a) it's purely stylistic, and b) I already know there's some disagreement.

Once this lands, I'm also going to do a cleanup change which will delete some now-redundant duplicate predicates in the inference code, but again, that deserves to be a change of its own.

Differential Revision: https://reviews.llvm.org/D100226

Implement the remaining GOMP_* functions to support task reductions in taskgroup, parallel, loop, and taskloop constructs. The unused mem argument to many of the work-sharing constructs has to do with the scan() directive/inscan() modifier. If mem is set, each function will call KMP_FATAL() and tell the user scan/inscan is unsupported. The GOMP reduction implementation is kept separate from our implementation because of how GOMP presents reduction data and computes the reductions. GOMP expects the privatized copies to be present even after a #pragma omp parallel reduction(task:...) region has ended, so the data is stored inside GOMP's uintptr_t* data pseudo-structure. This style is tightly coupled with GCC compiler codegen. There also aren't any init(), combiner(), or fini() functions in GOMP's codegen, so the two implementations were too disparate to try to wrap GOMP's around our own. Differential Revision: https://reviews.llvm.org/D98806

A large portion of the patterns are duplicated for HwMode on RISCV. If we expand HwMode first, we need to check nearly twice as many patterns for variants. HwModes shouldn't affect whether a variant is valid so we should be able to expand after. This also reduces the RISCV isel table by 539 bytes due to factoring working better on this pattern order. Unfortunately it increases Hexagon table size by ~50 bytes. But I think this is a reasonable trade.

If we have a nobuiltin function, we can't assume we know anything about the implementation. I noticed this when tracing through a log from an in-the-wild miscompile (emscripten-core/emscripten#9443) triggered after 8666463. We were incorrectly assuming that a custom allocator could not free. (It's not clear yet that this is the only problem in said issue.) I also noticed something similar mentioned in the commit message of ab243e when scrolling back through history. Though, from what I can tell, that commit fixed the symptom, not the root cause. The interface we have for library function detection is extremely error-prone, but given the interaction between ``nobuiltin`` decls and ``builtin`` callsites, it's really hard to imagine something much cleaner. I may iterate on that, but it'll be invasive enough that I didn't want to hold an obvious functional fix on it.

Committing this patch for Augusto Noronha who is getting set up still. This patch changes Target::ReadMemory so the default behavior when a read is in a Section that is read-only is to fetch the data from the local binary image, instead of reading it from memory. Update all callers to use their old preferences (the old prefer_file_cache bool) using the new API; we should revisit these calls and see if they really intend to read live memory, or if reading from a read-only Section would be equivalent and important for performance-sensitive cases. rdar://30634422 Differential revision: https://reviews.llvm.org/D100338

If a module contains errors (i.e. it was built with -fallow-pcm-with-compiler-errors and had errors) and was from the module cache, it is marked as out of date - see a2c1054. When a module is imported multiple times in a single compile, this caused it to be recompiled each time - removing the existing buffer from the module cache and replacing it. This results in various errors further down the line. Instead, only mark the module as out of date if it isn't already finalized in the module cache. Reviewed By: akyrtzi Differential Revision: https://reviews.llvm.org/D100619

The key here is HwMode indices. They're going to be small numbers, contiguous, and only a few different values. I don't think we need to go through the SmallDenseSet hashing. A BitVector would be even better, but we don't have the upper bound here.

Such attributes can either be unset, or set to "true" or "false" (as string). Throughout the codebase, this led to inelegant checks ranging from

  if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

to

  if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true")

Introduce a getValueAsBool that normalizes the check, with the following behavior:

no attributes or attribute set to "false" => return false
attribute set to "true" => return true

Differential Revision: https://reviews.llvm.org/D99299
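A minimal sketch of the normalization described above, assuming a simplified stand-in for a function's string attributes. `FnAttrs` and its `attrs` map are hypothetical and are not the LLVM `Attribute` API; only the name `getValueAsBool` and the true/false behavior come from the commit message.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a function's string attributes (not LLVM's API).
struct FnAttrs {
  std::map<std::string, std::string> attrs;

  // Normalized check: a missing attribute or "false" yields false,
  // "true" yields true.
  bool getValueAsBool(const std::string &kind) const {
    auto it = attrs.find(kind);
    if (it == attrs.end())
      return false;
    // Assumption of this sketch: boolean attributes only hold these strings.
    assert((it->second == "true" || it->second == "false") &&
           "expected a boolean string attribute");
    return it->second == "true";
  }
};
```

With a helper of this shape, callers no longer need to combine a presence check with a string comparison; a single call covers the unset, "false", and "true" cases.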

…gularObj; set wrap's initial binding to sym's

Fix PR49897: if `__real_foo` has the isUsedInRegularObj bit set, we need to retain `foo` in .symtab, even if `foo` is undefined. The new behavior will match GNU ld. Before the patch, we produced an R_X86_64_JUMP_SLOT relocation referencing the index 0 undefined symbol, which glibc would reject with an error (see f96ff3c). While here, fix another bug: if `__wrap_foo` does not exist, its initial binding should be `foo`'s.

It will not do anything useful for them, as we already know that they don't modref with any accessible memory. In particular, this prevents noalias metadata from being placed on noalias.scope.decl intrinsics. This reduces the amount of metadata needed, and makes it more likely that unnecessary decls can be eliminated.

Debug intrinsics are free to hoist and should be skipped when looking for terminator-only blocks. As a consequence, we have to delegate to the main hoisting loop to hoist any dbg intrinsics instead of jumping to the terminator case directly. This fixes PR49982. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D100640

The internalization pass only internalizes global variables with no users. If the global variable has some dead user, the internalization pass will not internalize it. To be able to internalize global variables with dead users, a global dce pass is needed before the internalization pass. This patch adds that. Reviewed by: Artem Belevich, Matt Arsenault Differential Revision: https://reviews.llvm.org/D98783

Add device variables to llvm.compiler.used if they are ODR-used by either host or device functions. This is necessary to prevent them from being eliminated by whole-program optimization where the compiler has no way to know a device variable is used by some host code. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D98814

GCC 8 introduced these new pragmas to control loop unrolling. We should support them for compatibility reasons, and the implementation itself requires only a few lines of code, since everything needed is already implemented for #pragma unroll/nounroll.

hipRTC compiles HIP device code at run time. Since the system may not have development tools installed, no standard C or C++ headers are available when a HIP program is compiled through hipRTC. As such, the HIP headers should not depend on standard C or C++ headers when used with hipRTC. When hipRTC is used, the HIP headers only provide definitions of HIP device API functions. This is in line with what nvRTC does. This patch adds hipRTC support to the HIP headers in clang. hipRTC defines the macro __HIPCC_RTC__ when compiling HIP code at run time. When this macro is defined, the HIP headers do not include standard C/C++ headers. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D100652

This patch updates a couple of functions that unnecessarily took the input graph by value, when it was not needed. They can take the graph by const-reference instead, which does not require GraphT to provide a copy constructor. Split off from D100169.
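To illustrate why the parameter type matters, here is a small sketch with a hypothetical non-copyable `Graph` type. `Graph`, `addEdge`, and `countEdges` are made up for this example and are not the functions touched by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph type that deliberately cannot be copied.
struct Graph {
  std::vector<std::vector<std::size_t>> succ; // succ[n]: successors of node n
  explicit Graph(std::size_t numNodes) : succ(numNodes) {}
  Graph(const Graph &) = delete;
  Graph &operator=(const Graph &) = delete;
};

void addEdge(Graph &g, std::size_t from, std::size_t to) {
  assert(from < g.succ.size() && to < g.succ.size() && "node out of range");
  g.succ[from].push_back(to);
}

// A by-value signature such as `countEdges(Graph g)` could not be called with
// an lvalue here, because that would need the deleted copy constructor.
// Taking the graph by const reference has no such requirement.
std::size_t countEdges(const Graph &g) {
  std::size_t total = 0;
  for (const auto &successors : g.succ)
    total += successors.size();
  return total;
}
```

The read-only function never needed its own copy of the graph, so the const reference changes nothing for existing callers while admitting graph types without a copy constructor.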

This class initially had args to be generic to future needs. In particular, I thought that source location introspection should show the getBeginLoc of CallExpr args and the getArgLoc of TemplateSpecializationLocInfo etc. However, that is probably best left out of source location introspection because it involves node traversal. If something like this is needed in the future, it can be added in the future. Differential Revision: https://reviews.llvm.org/D100688

This fixes argument injection in clang command lines, by adding them before "--". Previously, the arguments were injected at the end of the command line and could be added after "--", which would be wrongly interpreted as input file paths. This fix is needed for a subsequent patch, see D92191. Differential Revision: https://reviews.llvm.org/D95099
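The idea of injecting arguments before "--" can be sketched as follows. The function name `injectBeforeDoubleDash` is hypothetical and this is not the code from the patch; it only shows the placement rule under the assumption that the command line is a vector of strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Insert `extra` before the first "--" separator, or at the end of the
// command line when no separator is present.
std::vector<std::string>
injectBeforeDoubleDash(std::vector<std::string> cmd,
                       const std::vector<std::string> &extra) {
  assert(!cmd.empty() && "command line should at least name the program");
  auto pos = std::find(cmd.begin(), cmd.end(), std::string("--"));
  cmd.insert(pos, extra.begin(), extra.end());
  return cmd;
}
```

Everything after "--" is treated as an input file path, so appending an option at the very end would turn it into a bogus file name; inserting at the separator keeps it among the options.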

At the moment, ReversePostOrderTraversal performs a post-order walk on the entry node of the passed-in graph, rather than the graph type itself. If GT::NodeRef is the same as GraphT, everything works as expected and this is the case for the current uses in-tree. But it does not work as expected if GraphT != GT::NodeRef. In that case, we either fail to build (if there is no GraphTrait specialization for GT::NodeRef) or we pick the GraphTrait specialization for GT::NodeRef, instead of the specialization of GraphT. Both the depth-first and post-order iterators pick the expected specialization and this patch updates ReversePostOrderTraversal to delegate to po_begin & po_end to pick the right specialization, rather than forcing using GraphTraits<GT::NodeRef>, by first getting the entry node. This makes `ReversePostOrderTraversal<Graph<6>> RPOT(G);` build and work as expected in the test. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D100169
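The distinction between the graph type and its node reference type can be sketched with a toy traits class. `Traits`, `Graph`, `postOrder`, and `reversePostOrder` are hypothetical names loosely inspired by the description above; they do not mirror LLVM's actual `GraphTraits` interface. The point is only that the traversal is parameterized on the graph type, whose node reference (`int`) is a different type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph whose node reference type (int) differs from the graph.
struct Graph {
  std::vector<std::vector<int>> succ; // succ[n]: successors of node n
  int entry = 0;
};

// Toy traits, specialized on the graph type rather than on the node type.
template <typename GraphT> struct Traits;
template <> struct Traits<Graph> {
  using NodeRef = int;
  static NodeRef getEntryNode(const Graph &g) { return g.entry; }
  static const std::vector<int> &children(const Graph &g, NodeRef n) {
    return g.succ[n];
  }
};

// Depth-first walk that records each node after all of its children.
template <typename GraphT, typename GT = Traits<GraphT>>
void postOrder(const GraphT &g, typename GT::NodeRef n,
               std::vector<bool> &seen,
               std::vector<typename GT::NodeRef> &out) {
  assert(n >= 0 && static_cast<std::size_t>(n) < seen.size());
  seen[n] = true;
  for (auto child : GT::children(g, n))
    if (!seen[child])
      postOrder<GraphT, GT>(g, child, seen, out);
  out.push_back(n);
}

// Reverse post-order: run the post-order walk from the graph's entry node,
// looked up through the traits of the graph type, then reverse the result.
template <typename GraphT, typename GT = Traits<GraphT>>
std::vector<typename GT::NodeRef> reversePostOrder(const GraphT &g) {
  std::vector<bool> seen(g.succ.size(), false);
  std::vector<typename GT::NodeRef> order;
  postOrder<GraphT, GT>(g, GT::getEntryNode(g), seen, order);
  std::reverse(order.begin(), order.end());
  return order;
}
```

For a diamond-shaped graph with edges 0→1, 0→2, 1→3, and 2→3, this visits the entry first and the join node last. Because the traits are selected by `GraphT`, no traits specialization for `int` is ever needed.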

Extend the matchers gathering API for types to record template parameters. The TypeLoc type hierarchy has some types which are templates used in CRTP such as PointerLikeTypeLoc. Record the inherited template and template arguments of types inheriting those CRTP types in the ClassInheritance map. Because the name inherited from is now computed, the value type in that map changes from StringRef to std::string. This also causes the toJSON override signature used to serialize that map to change. Remove the logic for skipping over empty ClassData instances. Several classes such as TypeOfExprTypeLoc inherit a CRTP class which provides interesting locations though the derived class does not. Record it as a class to make the locations it inherits available. Record the typeSourceInfo accessors too as they provide access to TypeLocs in many classes. The existing unit tests use UnorderedElementsAre to compare the introspection result with the expected result. Our current implementation of Google Mock (in gmock-generated-matchers.h) only supports comparing containers of up to 10 elements. As we are now returning more than 10 results for one of the introspection tests, change it to instead compare against an ordered vector of pairs. Because a macro is used to generate API strings and API calls, disable clang-format in blocks of expected results. Otherwise clang-format would insert whitespace which would then be compared against the introspected strings and fail the test. Introduce a recursion guard in the generated code. The TypeLoc class has an IgnoreParens() API which by default returns itself, so it would otherwise recurse infinitely. Differential Revision: https://reviews.llvm.org/D100516

Revert local modifications on the following files caused by clang-format.
- clang/lib/CodeGen/CGExprScalar.cpp
- libunwind/src/Registers.hpp
- llvm/include/llvm/Analysis/DivergenceAnalysis.h
- llvm/include/llvm/Analysis/LoopAccessAnalysis.h
- llvm/lib/Analysis/TargetTransformInfo.cpp
- llvm/lib/IR/CMakeLists.txt

kaz7 · 2021-07-11T22:58:11Z

The regression tests pass.

…callback

The `TypeSystemMap::m_mutex` guards against concurrent modifications of members of `TypeSystemMap`, in particular `m_map`.

`TypeSystemMap::ForEach` iterates through the entire `m_map`, calling a user-specified callback for each entry. This is all done while `m_mutex` is locked. However, nothing guarantees that the callback itself won't call back into `TypeSystemMap` APIs on the same thread. This led to double-locking `m_mutex`, which is undefined behaviour. We've seen this cause a deadlock in the Swift plugin with the following reproducer and backtrace:

```
int main() {
    std::unique_ptr<int> up = std::make_unique<int>(5);

    volatile int val = *up;
    return val;
}

clang++ -std=c++2a -g -O1 main.cpp

./bin/lldb -o "br se -p return" -o run -o "v *up" -o "expr *up" -b
```

```
frame #4: std::lock_guard<std::mutex>::lock_guard
frame #5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock #2
frame #6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage
frame #7: lldb_private::Target::GetScratchTypeSystemForLanguage
...
frame #26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths
frame #27: lldb_private::SwiftASTContext::LoadModule
frame #30: swift::ModuleDecl::collectLinkLibraries
frame #31: lldb_private::SwiftASTContext::LoadModule
frame #34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl
frame #35: lldb_private::SwiftASTContext::PerformCompileUnitImports
frame #36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext
frame #37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState
frame #38: lldb_private::Target::GetPersistentSymbol
frame #41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1
frame #42: lldb_private::Target::GetPersistentSymbol
frame #43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols
frame #44: lldb_private::IRExecutionUnit::FindSymbol
frame #45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence
frame #46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol
frame #48: llvm::LinkingSymbolResolver::findSymbol
frame #49: llvm::LegacyJITSymbolResolver::lookup
frame #50: llvm::RuntimeDyldImpl::resolveExternalSymbols
frame #51: llvm::RuntimeDyldImpl::resolveRelocations
frame #52: llvm::MCJIT::finalizeLoadedModules
frame #53: llvm::MCJIT::finalizeObject
frame #54: lldb_private::IRExecutionUnit::ReportAllocations
frame #55: lldb_private::IRExecutionUnit::GetRunnableInfo
frame #56: lldb_private::ClangExpressionParser::PrepareForExecution
frame #57: lldb_private::ClangUserExpression::TryParse
frame #58: lldb_private::ClangUserExpression::Parse
```

Our solution is to simply iterate over a local copy of `m_map`.

**Testing**

* Confirmed on manual reproducer (would reproduce 100% of the time before the patch)

Differential Revision: https://reviews.llvm.org/D149949
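The "iterate over a local copy" approach can be sketched with a small hypothetical class. `Registry`, `Set`, and `Get` are invented for this example and are not lldb's `TypeSystemMap`; only the member names `m_mutex` and `m_map` and the `ForEach` idea come from the description above.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>

// Hypothetical mutex-guarded map, standing in for the pattern described.
class Registry {
public:
  void Set(const std::string &key, int value) {
    std::lock_guard<std::mutex> guard(m_mutex);
    m_map[key] = value;
  }

  int Get(const std::string &key) {
    std::lock_guard<std::mutex> guard(m_mutex);
    auto it = m_map.find(key);
    return it == m_map.end() ? 0 : it->second;
  }

  // Copy the map while holding the lock, then invoke the callback with the
  // lock released, so the callback may call Set/Get on the same thread.
  // Returning false from the callback stops the iteration.
  void ForEach(const std::function<bool(const std::string &, int)> &callback) {
    assert(callback && "callback must be callable");
    std::map<std::string, int> snapshot;
    {
      std::lock_guard<std::mutex> guard(m_mutex);
      snapshot = m_map;
    }
    for (const auto &entry : snapshot)
      if (!callback(entry.first, entry.second))
        break;
  }

private:
  std::mutex m_mutex;
  std::map<std::string, int> m_map;
};
```

The trade-off in this sketch is that the callback sees a snapshot: entries added or removed during the iteration are not reflected in the ongoing loop, and copying costs time proportional to the map size.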

preames and others added 30 commits April 16, 2021 14:22

[OpenMP] Allow affinity to re-detect for child processes

5ebbb36

Current atfork() handler for child processes does not reset the affinity masks array which prevents users from setting their own affinity in child processes. Differential Revision: https://reviews.llvm.org/D99218

[libc] Add endianness support

7e075ad

Add endianness detection support. This will be useful to implement `memcmp`. Differential Revision: https://reviews.llvm.org/D100571
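As a rough illustration of why endianness matters for `memcmp`, the sketch below detects byte order at run time and uses it to compare two 4-byte blocks as integers. The helper names (`isLittleEndian`, `byteSwap32`, `compare4`) are hypothetical and this is not the libc implementation; it also assumes the host is either little- or big-endian.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Detect byte order by inspecting the first byte of a known value.
inline bool isLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 1;
}

inline std::uint32_t byteSwap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) |
         (v << 24);
}

// Compare two 4-byte blocks with the same ordering as memcmp. The words are
// loaded natively and swapped on little-endian hosts, so that the first byte
// in memory becomes the most significant byte of the integer.
inline int compare4(const void *a, const void *b) {
  assert(a != nullptr && b != nullptr);
  std::uint32_t x = 0, y = 0;
  std::memcpy(&x, a, 4);
  std::memcpy(&y, b, 4);
  if (isLittleEndian()) {
    x = byteSwap32(x);
    y = byteSwap32(y);
  }
  return x < y ? -1 : (x > y ? 1 : 0);
}
```

Without the swap, a little-endian host would weight the last byte in memory most heavily, and the integer comparison would disagree with the byte-by-byte order that `memcmp` requires.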

[libc] Fix wrongly deduced type

2bfe158

[gn build] (manually) port ca67510

48cc5b0

[gn build] (manually) port ca67510 better

bc636c1

[clang][NFC] Fix a potential assert failure

06995fe

Reviewed By: MaskRay, craig.topper Differential Revision: https://reviews.llvm.org/D100616

[PowerPC] Minor improvement for insert_vector_elt codegen

ff769dd

For v2f64, all VSX subtargets can insert an element with a single XXPERMDI.

[ELF] Simplify R_386_TLS_GD computation. NFC

b2a3d31

[libc][NFC] Add common template test class for sqrt, sqrtf and sqrtl.

fb69b92

[libc][NFC] Make conversion from FPBits to the float point type expli…

fb706e0

…cit. This will help us catch errors like the ones fixed by the commit 31ed45d

[libc][NFC] Use explicit conversion in modfl_test.

bb8aa2a

[libc]NFC] Use explicit conversion on frexpl_test and logbl_test.

7db1102

[libc][NFC] Use explicit conversion in aarch64 FEnv.

bbba694

[Sanitizers] Undefined Behavior Sanitizer support for DragonFlyBSD

4583759

Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D89631

[Sanitizers] DragonFlyBSD adding support for builtins

0df0d6a

Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D89653

[Sanitizers] Fix build

61fc02d

[Support] AbsoluteDifference - add brackets to appease static analyze…

5953943

…r warning. NFCI.

[LICM] Add more tests for promotion and capture (NFC)

ae2da68

We could optimize the first case, as the pointer is captured only after the loop.

yxsamliu and others added 23 commits April 17, 2021 11:25

[libcxx][doc] Fixes typos.

f8f6029

[NFC] Fixed Typos

21bef4e

Reviewed By: xgupta Differential Revision: https://reviews.llvm.org/D100705

[TableGen] Remove local SmallSet from TypeSetByHwMode::insert.

a5e579c

This keeps track of which modes are in VVT so we can find out if a mode is missing later. But we can just ask VVT whether it has a particular mode.

[AST] Enable AST node introspection on WIN32

141945f

[Support] ThreadPool tests: silence warning unused variable 'It'

7b75a3a

[clang-scan-deps] Add support for clang-cl

bb26fa8

clang-scan-deps contains some command line parsing and modifications. This patch adds support for clang-cl command options. Differential Revision: https://reviews.llvm.org/D92191

[JITLink] Add testcase that was accidentally left out of 19e402d.

0c6ee50

[LoopUnroll] Make some tests more robust (NFC)

fe9a5a8

Replace branch on undef by branch on unknown condition.

[LoopUnroll] Regenerate test checks (NFC)

c456ab7

[libc++] Remove hard tabs, U+00AD, and U+200B from all libc++ headers…

863d5c4

…. NFCI.

[VE] Revert SelectionDAGNodes.h

1296af1

Revert local modifications on SelectionDAGNodes.h caused by clang-format.

[VP]workaround: Run ExpandVP in opt with legacy pm (default in isel)

8169b6b

[VE] Disable relative lookup table converter pass for VE

c26b4d3

VE hasn't implemented relative lookup tables in /opt/nec/ve/bin/nld, so we need to disable the new pass introduced by https://reviews.llvm.org/D94355 for VE.

Merge commit 'dd68942f1d79986267a58c9a9924522680d5c82b' into feature/…

8fcaf22

…merge-upstream-20210417

kaz7 merged commit 90b054e into develop Jul 11, 2021

kaz7 deleted the feature/merge-upstream-20210417 branch July 11, 2021 22:58
