[pull] main from llvm:main #170

pull · 2021-10-17T15:11:14Z

See Commits and Changes for more details.


Instead of creating real threads for trace tests, create a new ThreadState in the main thread. This makes the tests more unit-testy and will also help with future trace tests that need more than 1 thread. Creating more than 1 real thread and dispatching test actions across multiple threads in the required deterministic order is painful.

This is a resubmit of the reverted D110546 with 2 changes:

1. The previous version patched ImitateTlsWrite to not expect ThreadState to be allocated in TLS (the CHECK failed for the fake test threads). This added an ugly hack to production code and was still logically wrong, because we imitated a write to the main thread TLS/stack when we started the fake test thread (which has nothing to do with the main thread TLS/stack). This version uses ThreadType::Fiber instead of ThreadType::Regular for the fake threads. This naturally makes ThreadStart skip obtaining stack/tls and imitating writes to them.
2. This version still skips the tests on Darwin and PowerPC to be on the safer side. Build bots reported failures on PowerPC for the previous version.

Reviewed By: melver
Differential Revision: https://reviews.llvm.org/D111156

SVE has predicated literal forms of some instructions for specific literals, which currently are generated correctly when using ACLE but not when those instructions are generated directly. This adds the patterns to generate those instructions when generating from standard LLVM IR instructions. Differential Revision: https://reviews.llvm.org/D99074

The C and C++ standards require the argument to __has_cpp_attribute and __has_c_attribute to be macro-expanded ([cpp.cond]p5). It would make little sense to expand the argument to those operators but not the argument to __has_attribute and __has_declspec_attribute, so those were both also changed in this patch. Note that it might make sense for the other builtins to also expand their argument, but it wasn't as clear to me whether the behavior would be correct there, so they were left for a future revision.
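As a rough illustration of what the change means in practice, the sketch below queries the same attribute twice: once by name and once through a macro. The macro names `ATTR_UNDER_TEST`, `DIRECT_QUERY` and `MACRO_QUERY` are hypothetical and exist only for this example. On a compiler that expands the argument, both queries should agree; on one that does not, the macro form looks up an attribute literally called `ATTR_UNDER_TEST` and reports it as unsupported.

```cpp
#include <cassert>

// Hypothetical macro that stands in for an attribute name.
#define ATTR_UNDER_TEST nodiscard

#ifdef __has_cpp_attribute
// Query the attribute by its literal name.
#  if __has_cpp_attribute(nodiscard)
#    define DIRECT_QUERY 1
#  else
#    define DIRECT_QUERY 0
#  endif
// Query the attribute through a macro. This only matches the direct
// query if the preprocessor expands the operator's argument.
#  if __has_cpp_attribute(ATTR_UNDER_TEST)
#    define MACRO_QUERY 1
#  else
#    define MACRO_QUERY 0
#  endif
#else
// The operator is not available at all on this compiler.
#  define DIRECT_QUERY 0
#  define MACRO_QUERY 0
#endif

// True when the macro form gives the same answer as the direct form.
inline bool argumentIsExpanded() { return MACRO_QUERY == DIRECT_QUERY; }
```

If `argumentIsExpanded()` returns false while `DIRECT_QUERY` is 1, the compiler in use most likely predates this change.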

Previously, we reported the same value as for C17. Now we report 202000L, which is the value currently used by GCC. Once C23 ships, this value will be bumped to the correct date.

```
[5.1] 2.21.2 THREADPRIVATE Directive
A variable that appears in a threadprivate directive must be declared in the scope of a module or have the SAVE attribute, either explicitly or implicitly.
A variable that appears in a threadprivate directive must not be an element of a common block or appear in an EQUIVALENCE statement.
```

This patch supports the following checks for DECLARE TARGET Directive:

```
[5.1] 2.14.7 Declare Target Directive
A variable that is part of another variable (as an array, structure element or type parameter inquiry) cannot appear in a declare target directive.
A variable that appears in a declare target directive must be declared in the scope of a module or have the SAVE attribute, either explicitly or implicitly.
A variable that appears in a declare target directive must not be an element of a common block or appear in an EQUIVALENCE statement.
```

As Fortran 2018 standard [8.5.16] states, a variable, common block, or procedure pointer declared in the scoping unit of a main program, module, or submodule implicitly has the SAVE attribute, which may be confirmed by explicit specification.

Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D109864

A few more tuples are being queried after D111546. It might be good to model them. They all require a lot of manual assembly surgery.

The only sched models for CPUs that support AVX2 but not AVX512 are: haswell, broadwell, skylake, zen1-3.

For load we have: https://godbolt.org/z/YTeT9M7fW - for Intel, `Block RThroughput: <=212.0`; for Ryzen, `Block RThroughput: <=64.0`. So we could pick a cost of `212`.

For store we have: https://godbolt.org/z/vc954KEGP - for Intel, `Block RThroughput: <=90.0`; for Ryzen, `Block RThroughput: <=24.0`. So we could pick a cost of `90`.

I'm directly using the shuffling asm that llc produced, without any manual fixups that may be needed to ensure sequential execution.

Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111940

A few more tuples are being queried after D111546. It might be good to model them. They all require a lot of manual assembly surgery.

The only sched models for CPUs that support AVX2 but not AVX512 are: haswell, broadwell, skylake, zen1-3.

For load we have: https://godbolt.org/z/s5b6E6jsP - for Intel, `Block RThroughput: <=32.0`; for Ryzen, `Block RThroughput: <=24.0`. So we could pick a cost of `32`.

For store we have: https://godbolt.org/z/efh99d93b - for Intel, `Block RThroughput: <=48.0`; for Ryzen, `Block RThroughput: <=32.0`. So we could pick a cost of `48`.

I'm directly using the shuffling asm that llc produced, without any manual fixups that may be needed to ensure sequential execution.

Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111942

A few more tuples are being queried after D111546. It might be good to model them. They all require a lot of manual assembly surgery.

The only sched models for CPUs that support AVX2 but not AVX512 are: haswell, broadwell, skylake, zen1-3.

For load we have: https://godbolt.org/z/11rcvdreP - for Intel, `Block RThroughput: <=68.0`; for Ryzen, `Block RThroughput: <=48.0`. So we could pick a cost of `68`.

For store we have: https://godbolt.org/z/6aM11fWcP - for Intel, `Block RThroughput: <=64.0`; for Ryzen, `Block RThroughput: <=32.0`. So we could pick a cost of `64`.

I'm directly using the shuffling asm that llc produced, without any manual fixups that may be needed to ensure sequential execution.

Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111943

A few more tuples are being queried after D111546. It might be good to model them. They all require a lot of manual assembly surgery.

The only sched models for CPUs that support AVX2 but not AVX512 are: haswell, broadwell, skylake, zen1-3.

For load we have: https://godbolt.org/z/MTaKboejM - for Intel, `Block RThroughput: =32.0`; for Ryzen, `Block RThroughput: <=16.0`. So we could pick a cost of `32`.

For store we have: https://godbolt.org/z/v7xPj3Wd4 - for Intel, `Block RThroughput: =32.0`; for Ryzen, `Block RThroughput: <=32.0`. So we could pick a cost of `32`.

I'm directly using the shuffling asm that llc produced, without any manual fixups that may be needed to ensure sequential execution.

Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111944

A few more tuples are being queried after D111546. It might be good to model them. They all require a lot of manual assembly surgery.

The only sched models for CPUs that support AVX2 but not AVX512 are: haswell, broadwell, skylake, zen1-3.

For load we have: https://godbolt.org/z/9bnKrefcG - for Intel, `Block RThroughput: =40.0`; for Ryzen, `Block RThroughput: =16.0`. So we could pick a cost of `40`.

For store we have: https://godbolt.org/z/5s3s14dEY - for Intel, `Block RThroughput: =40.0`; for Ryzen, `Block RThroughput: =16.0`. So we could pick a cost of `40`.

I'm directly using the shuffling asm that llc produced, without any manual fixups that may be needed to ensure sequential execution.

Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111945
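The commit messages for D111940 through D111945 do not spell out how each cost was chosen, but every picked value is consistent with taking the larger of the Intel and Ryzen throughput bounds. The sketch below encodes that reading as an assumption; the `Measurement` type and the `pickCost` and `measurements` helpers are hypothetical names for illustration and are not part of LLVM's cost model code.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One load or store measurement as quoted in a commit message.
struct Measurement {
  const char *review; // Differential revision the numbers come from
  const char *op;     // "load" or "store"
  double intel;       // Block RThroughput bound reported for Intel models
  double ryzen;       // Block RThroughput bound reported for Ryzen models
};

// The throughput bounds quoted in the five commit messages.
inline std::vector<Measurement> measurements() {
  return {
      {"D111940", "load", 212.0, 64.0}, {"D111940", "store", 90.0, 24.0},
      {"D111942", "load", 32.0, 24.0},  {"D111942", "store", 48.0, 32.0},
      {"D111943", "load", 68.0, 48.0},  {"D111943", "store", 64.0, 32.0},
      {"D111944", "load", 32.0, 16.0},  {"D111944", "store", 32.0, 32.0},
      {"D111945", "load", 40.0, 16.0},  {"D111945", "store", 40.0, 16.0},
  };
}

// Assumed rule: be pessimistic and take the worst bound across vendors.
inline int pickCost(const Measurement &m) {
  assert(m.intel >= 0.0 && m.ryzen >= 0.0);
  return static_cast<int>(std::max(m.intel, m.ryzen));
}
```

Applying `pickCost` to each entry reproduces the costs named in the messages (212 and 90 for D111940, 32 and 48 for D111942, and so on), which is what suggests the worst-case reading.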

The multiply() implementation is very slow -- it performs six multiplications in double the bitwidth, which means that it will typically work on allocated APInts and bypass fast-path implementations. Add an additional implementation that doesn't try to produce anything better than a full range if overflow is possible. At least for the BasicAA use-case, we really don't care about more precise modeling of overflow behavior. The current use of multiply() is fine while the implementation is limited to a single index, but extending it to the multiple-index case makes the compile-time impact untenable.
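The idea of giving up on precision whenever overflow is possible can be sketched on a much simpler type than LLVM's ConstantRange. The example below uses a hypothetical `URange`, an inclusive, non-wrapping interval of unsigned 64-bit values, and a hypothetical `fastMultiply` helper. It is an illustration of the technique only and makes no claim about how the actual patch is written.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical inclusive interval [lo, hi] over uint64_t, with lo <= hi.
struct URange {
  std::uint64_t lo;
  std::uint64_t hi;

  static URange full() {
    return {0, std::numeric_limits<std::uint64_t>::max()};
  }
  bool isFull() const {
    return lo == 0 && hi == std::numeric_limits<std::uint64_t>::max();
  }
};

// True if a * b does not fit in 64 bits. Uses a division check so that
// no wider integer type is needed.
inline bool mulOverflows(std::uint64_t a, std::uint64_t b) {
  return a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a;
}

// Multiply two ranges in the original bit width. If the largest possible
// product can overflow, return the full range instead of modeling the
// wrapped result precisely.
inline URange fastMultiply(URange a, URange b) {
  assert(a.lo <= a.hi && b.lo <= b.hi);
  if (mulOverflows(a.hi, b.hi))
    return URange::full();
  // For unsigned, non-wrapping inputs the product is monotonic in both
  // operands, so the bounds come from the matching endpoints.
  return {a.lo * b.lo, a.hi * b.hi};
}
```

For example, `fastMultiply({2, 3}, {4, 5})` yields the interval from 8 to 15, while multiplying any range whose upper bounds can overflow returns the full range. The trade-off is the one the commit message describes: less precision in the overflow case in exchange for staying in the original bit width.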

kito-cheng and others added 14 commits October 17, 2021 16:38

[RISCV][NFC] Fix build error

8efa651

[gn build] Port ff13189

1d7aadb

[InstCombine] Add some extra tests for truncated saturates. NFC

052b77e

Bump the value of __STDC_VERSION__ in -std=c2x mode

c8be774


pull bot added the ⤵️ pull label Oct 17, 2021

pull bot merged commit 274b243 into MaxMood96:main Oct 17, 2021

MaxMood96 mentioned this pull request Feb 3, 2024

[Snyk] Fix for 14 vulnerabilities #1160

Open

MaxMood96 mentioned this pull request Mar 23, 2024

[Snyk] Security upgrade vscode from 1.1.10 to 1.1.37 #1161

Open

