Add alpha support for SVE2.1 #257
In this patch it is used for the prototype:
* svptrue_c8 (and _c16/_c32/_c64)
As described in: ARM-software/acle#257
Patch by: Sander de Smalen <sander.desmalen@arm.com>
Reviewed By: sdesmalen, david-arm
Differential Revision: https://reviews.llvm.org/D150953
commit 1496c57722c7db8db7e582b582317e15e719ceb0
Merge: f28ae00bf6a3 074276b9ae76
Author: ns_tester <ns_tester@intel.com>
Date: Wed Jun 7 22:32:59 2023 -0700
LLVM and SPIRV-LLVM-Translator pulldown (WW22)
LLVM: llvm/llvm-project@40c26ec
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@c2ff406
commit f28ae00bf6a3dc946194e6f8b543a115fe241c20
Author: Nick Sarnie <sarnex@users.noreply.github.com>
Date: Wed Jun 7 23:47:13 2023 -0400
[ESIMD] More support for 64-bit offsets with accessors in stateless mode (#9591)
This adds support for 64-bit offsets with accessors in stateless mode
for the remaining APIs. Please let me know if I missed any.
Today, all of the APIs convert to 32-bit offsets with no error if passed
a 64-bit offset, except for vector offset versions of `lsc_gather`,
`lsc_scatter`, and `lsc_prefetch`. Do not error except in these three
cases in order to preserve backward compatibility.
I manually ran all of these tests on PVC and confirmed they pass with
this change and fail without it.
In some cases, in stateful mode, the underlying intrinsic we call only
supports 32-bit offsets, so we need to convert.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
commit f34e5458aa63bb2a4362c327859f49474f873b9d
Author: Nick Sarnie <sarnex@users.noreply.github.com>
Date: Wed Jun 7 20:42:52 2023 -0400
[SYCL][ESIMD] Use SPIR-V intrinsic to cast image object to int (#9696)
We currently have a hack that relies on the type the Clang frontend
generates for images, see
[here](https://github.com/intel/llvm/blob/12dd0ad040ea61f1201fa9d82efd5079ce7dc6ca/sycl/include/sycl/ext/intel/esimd/detail/memory_intrin.hpp#L1171).
With opaque pointers, the Clang frontend generates image types as target
extension types instead of pointers, so the hack fails.
The cleanest way to fix this would be to do the cast at
reverse-translation time inside IGC, however the IGC team refused that
solution.
Instead, punt the cast to inside the SPIR-V translator when converting
to SPIR-V, where the type will be a pointer as well.
The `__spirv_ConvertPtrToU` function will be converted to
`OpConvertPtrToU` inside the SPIR-V translator.
This is definitely still a hack, but I don't think it's more hacky than
before, and I don't know of any other ways to fix this.
Note this solution works for both typed pointers and opaque pointers,
and for normal pointer accessors and image accessors.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
commit 8990c5503d47e397c837d991bf6bc5a0feda9b8a
Author: Igor Gorban <igor.gorban@intel.com>
Date: Wed Jun 7 22:19:33 2023 +0200
[SYCL] Fix handling unsupported attributes (#9756)
llvm::Attribute::ReadNone/ReadOnly/WriteOnly are no longer supported.
Calls generated by an external library (vc-intrinsics) can still carry
these attributes, so they have to be removed manually.
It is impossible to fix this on the vc-intrinsics side, because it works
with more than just the latest LLVM version and uses these attributes in
other projects.
---------
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
Co-authored-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
commit 485221047281e3d47f7376394667b85a63173991
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Wed Jun 7 08:48:04 2023 -0700
[CI] Generate test matrix on self-hosted runner (#9773)
Github's ubuntu-* runners could take multiple hours to allocate in our
organization. Switch to our self-hosted cuda runner that is sitting idle
because we perform CUDA testing in AWS.
commit 64bd50820262ded6fbd32d63bd96d5fdbf6861ac
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Wed Jun 7 07:45:29 2023 -0700
[SYCL][CI] Fuse two post-commit builds into one (#9695)
commit 074276b9ae760528d97f75d767a1744e6f2a3f2f
Merge: 0ca2be5c82ec d48a5fb2b664
Author: Artur Gainullin <artur.gainullin@intel.com>
Date: Wed Jun 7 07:11:20 2023 -0700
Merge remote-tracking branch 'origin/sycl' into llvmspirv_pulldown
commit d48a5fb2b6645819a6811a65a402d322c222dc36
Author: fineg74 <61437305+fineg74@users.noreply.github.com>
Date: Wed Jun 7 04:59:29 2023 -0700
[SYCL][ESIMD] Update the test regression/atomic_update_test.cpp to improve reliability (#9715)
commit eb7e3f032ff98fa98b4927fe9a785e75a5c51240
Author: jinz2014 <7799920+jinz2014@users.noreply.github.com>
Date: Wed Jun 7 06:58:20 2023 -0400
[SYCL] Add unit tests for the HIP plugin (#9391)
The kernel test (test_kernels.cpp) is incomplete because how to generate
binary files properly for "piProgramCreateWithBinary" for the HIP
backend is not clear to me.
Thank you for reviewing and editing the PR.
---------
Co-authored-by: Jin Z <5zj@equinox.ftpn.ornl.gov>
Co-authored-by: Dmitry Vodopyanov <dmitry.vodopyanov@intel.com>
Co-authored-by: Jin Z <5zj@cousteau.ftpn.ornl.gov>
commit 3c19581f828c54ff1037a420b4614c01628bcc56
Author: jinge90 <ge.jin@intel.com>
Date: Wed Jun 7 16:35:10 2023 +0800
[SYCL][libdevice] Move fabsf, fabs to imf_fp32/64_dl.cpp and add llabs (#9732)
fabsf, fabs and llabs are required by deep learning frameworks, so we
move fabsf and fabs to separate file imf_fp32/64_dl.cpp and add llabs to
imf_fp32_dl.cpp as well.
Signed-off-by: jinge90 <ge.jin@intel.com>
commit f96b85d002745aea35114c512aae020a0e5caaca
Author: Chris Perkins <chris.perkins@intel.com>
Date: Tue Jun 6 14:04:26 2023 -0700
disable ze_debug tests on Windows for known failures. (#9764)
Some of the ze_debug=4 memory leak tests are failing on Windows. These
are not new failures, as the ze_debug=4 memory checker was disabled on
Windows for a long time. It has recently been re-enabled, and now these
tests are failing. The shutdown() procedure on Windows is not (yet)
parallel to Linux, work is ongoing on that front. This PR disables these
tests until we reach shutdown() parity.
FWIW, the Windows OS is super aggressive about reclaiming memory, and
the BKM in complex situations like this is to just let Windows reclaim.
Signed-off-by: Chris Perkins <chris.perkins@intel.com>
commit 19b6247ed9be9e2baae2e5a0a1ddddf4f412b1e7
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Tue Jun 6 12:56:11 2023 -0700
[SYCL][Test E2E] Fix SG sizes detection in lit.cfg.py (#9761)
commit 0ca2be5c82ec6b5be0f5ef6850b3afbfbc99aba3
Author: Churina, Ksenia <ksenia.churina@intel.com>
Date: Tue Jun 6 12:45:28 2023 -0700
Disable Basic/stream/stream.cpp test for HIP until it is fixed
commit 93a487cc72a7e0c4852a41678d102a08e20192b0
Author: jinz2014 <7799920+jinz2014@users.noreply.github.com>
Date: Tue Jun 6 15:42:29 2023 -0400
[SYCL][HIP] Add the interop-buffer-hip test (#9705)
Co-authored-by: Jin Z <5zj@cousteau.ftpn.ornl.gov>
commit 4826c07e02c1df6cf4ac4f21b650efec37d583c4
Author: Pablo Reble <pablo.reble@intel.com>
Date: Tue Jun 6 14:41:05 2023 -0500
[SYCL] ABI check script improve path concatenation (#9482)
Patch fixes path concatenation issue. Script fails if the provided path
has no trailing slash.
Should work OS independently. Manually tested on Linux.
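The failure described above is typical of building paths by plain string concatenation. The following is a minimal sketch of the difference, not the actual change made to the ABI check script; the directory and file names are hypothetical. It uses `posixpath.join` so the result is the same on every host, whereas a script would normally call `os.path.join`, which picks the separator of the running platform.

```python
import posixpath


def concat_naive(directory, name):
    # Plain concatenation: silently produces a wrong path when
    # `directory` has no trailing slash.
    return directory + name


def concat_safe(directory, name):
    # join() inserts a separator only when one is missing.
    return posixpath.join(directory, name)


# Hypothetical inputs, for illustration only.
print(concat_naive("build/lib", "libsycl.so"))
print(concat_safe("build/lib", "libsycl.so"))
print(concat_safe("build/lib/", "libsycl.so"))
```

With the joined form, the caller no longer needs to care whether the provided path ends with a slash.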
commit 3350c05baf495da71222590becc6ec7e9dae50f8
Author: Srividya Sundaram <srividya.sundaram@intel.com>
Date: Tue Jun 6 12:31:13 2023 -0700
[SYCL] Add ESIMD test to check kernel arg size (#9076)
commit ca55b912d4f08e04abdb654b9f5ed7f18dd87fd8
Author: fineg74 <61437305+fineg74@users.noreply.github.com>
Date: Tue Jun 6 12:21:26 2023 -0700
[ESIMD] Make the test regression/bfloat16Constructor.cpp executable on GEN12 (#9748)
commit 4a76d213c24cac4615a8f9e57fa3dc643c931956
Author: Fedor Veselovskiy <fedor.veselovsky@intel.com>
Date: Tue Jun 6 21:19:41 2023 +0200
[SYCL][InvokeSimd][E2E] Remove XFAIL status from InvokeSimd named barrier tests (#9741)
commit db6bec7b7e31ac18c92e71776fda833707678515
Author: Fraser Cormack <frasercrmck@users.noreply.github.com>
Date: Tue Jun 6 20:18:11 2023 +0100
[SYCL][Fusion] Add missing header (#9691)
This was causing build failures with some compilers.
commit c899a93410c23b26a600159762e4dab5f240bc1f
Author: Przemyslaw-Wisniewski-Mobica <93128086+pwisniewskimobica@users.noreply.github.com>
Date: Tue Jun 6 21:17:01 2023 +0200
[SYCL] Add sycl/detail/defines_elementary.hpp to bit_cast.hpp to be self contained (#9684)
commit f19cfe6a97699a11c78ae248ffe548a8889992bc
Author: Andrey Alekseenko <al42and@gmail.com>
Date: Tue Jun 6 21:13:13 2023 +0200
[SYCL][CUDA] Fix info::device::version (#9623)
Report major.minor instead of major.major
commit f73230d8a8ba75b0b43b27ce09253e1b51e1757f
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Tue Jun 6 12:01:40 2023 -0700
[SYCL][ABI-break] Remove getOSModuleHandle usage (#9659)
test-e2e/SharedLib,SPVDumpUse show that we don't really need it.
commit f44d0133d4b0077298f034697a1f3818ff1d6134
Author: Dirk MG Seynhaeve <dirk.seynhaeve@intel.com>
Date: Tue Jun 6 11:00:34 2023 -0700
[NFC] Productize clang-offload-extract: clean up code for command line parsing and help (#9594)
* Impose the mandatory LLVM style for clang-format
* Remove any code that was trying to enhance the LLVM builtin help
functionality: the extra code only made for confusing help and error
messages.
* Don't provide any required options, but provide reasonable defaults.
* Clean up the descriptions for the help. Use easier-to-maintain
heredocs for the multiline descriptions.
* Use the more trivial `--stem` rather than `--output`. The `--output`
option is still supported, but labeled deprecated.
* Enforce double-dash long options.
* Provide more context in error diagnostics.
* Streamline the searches and predicates.
* Modernize LLVM (e.g. remove predicated makeArrayRef).
* More efficient iterators for the range-based for loops.
* Extensive comments.
commit 8364176393ad741b5dbf56ae58e2c0da1a908bad
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Tue Jun 6 10:20:44 2023 -0700
[SYCL] Add tests for SYCL_DUMP_IMAGES/SYCL_USER_KERNEL_SPV (#9725)
This required introducing an extra environment variable,
`SYCL_DUMP_IMAGES_PREFIX`, to control the location of the produced images.
commit 0267c1b237409fb5ffc28c0511a30153c29fe29f
Author: JackAKirk <jack.kirk@codeplay.com>
Date: Tue Jun 6 14:51:55 2023 +0100
[SYCL][CUDA] Enable sycl-ls-gpu-default-any on CUDA (#9372)
This is a migration of this PR
https://github.com/intel/llvm-test-suite/pull/1144/commits
---------
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
commit d46d3d68700288203a5c709a7469b6883104f335
Author: JackAKirk <jack.kirk@codeplay.com>
Date: Tue Jun 6 14:50:30 2023 +0100
[SYCL][CUDA][DOC] Added Tensor Cores supported param combinations table to joint_matrix extension doc (#9019)
This PR documents the supported joint_matrix API parameters sets when
using `ext_oneapi_cuda`, similar to the XMX, AMX tables added here:
https://github.com/intel/llvm/pull/7964
This will allow us to point people who would like to use `joint_matrix`
on a specific architecture to the extension document. E.g.
https://github.com/intel/llvm/issues/8795
---------
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
commit c0ab9f8bf0d5f6722c03cfd0aba7aca0ae9a2e81
Author: Jakub Chlanda <jakub@codeplay.com>
Date: Tue Jun 6 15:48:37 2023 +0200
[SYCL] Add native half type flag for NVPTX >= SM_53 (#8906)
LLVM will now error out if builtins operating on half types are used
without explicitly passing `-fnative-half-type` (see:
https://reviews.llvm.org/D146715). PTX supports half types since
[SM_53](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=half%20precision#half-precision-floating-point-instructions).
commit 1a283acaac3cb944746da21e6337ce6cdaea9711
Author: Christoph Bauinger <c.bauinger@gmx.at>
Date: Tue Jun 6 15:41:27 2023 +0200
[SYCL] Add proposal for append_and_shift extension (#8902)
Proposal extends the existing shift_group_left and shift_group_right
functions to first append and then shift such that all items in a sub
group can have well-defined values after the shift.
---------
Co-authored-by: Greg Lueck <gregory.m.lueck@intel.com>
commit 10d2f5b613f3f7e383a8140d38cabc35551b68ea
Author: mmoadeli <mahmoud.moadeli@codeplay.com>
Date: Tue Jun 6 14:40:15 2023 +0100
Adds explicit conversion of multi_ptr<T> to multi_ptr<const T>. (#9750)
This ctor was previously removed because it conflicted with existing
ones.
Without the ctor, some CTS tests fail to compile. An
investigation is required.
commit fa501fd21a286c5d6d760249a88c6ffddaffe2e8
Author: Georgi Mirazchiyski <georgi.mirazchiyski@codeplay.com>
Date: Tue Jun 6 12:52:19 2023 +0100
[SYCL][CUDA] Add fix for local size calculation regression (#9736)
This PR fixes a performance regression wrt work-group size selection
when only `sycl::range` is used.
The regression was reported in issue
[#5627](https://github.com/intel/llvm/issues/5627).
We want the work-groups to be uniformly distributed, but that could lead
to non-optimally sized work-groups if the global work size is not an
even number. Ideally, we want to ensure that the work-group size is a power
of two.
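To illustrate the trade-off, here is a simplified sketch that picks the largest power-of-two work-group size that still divides the global size evenly. It is an assumption-laden model, not the heuristic implemented in the CUDA plugin; the function name and the `max_local_size` limit are hypothetical.

```python
def pick_local_size(global_size, max_local_size):
    """Largest power of two that divides global_size and is <= max_local_size."""
    if global_size <= 0 or max_local_size <= 0:
        raise ValueError("sizes must be positive")
    size = 1
    # Keep doubling while the doubled size still fits the limit
    # and still divides the global size evenly.
    while size * 2 <= max_local_size and global_size % (size * 2) == 0:
        size *= 2
    return size


print(pick_local_size(1024, 256))  # power-of-two global size
print(pick_local_size(1000, 256))  # even, but only divisible by 8
print(pick_local_size(1001, 256))  # odd global size
```

Under this model an odd global size degenerates to a work-group size of 1, which is the kind of non-optimal result that strict uniform distribution can produce.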
commit 37bb6a2bab16f58d7fe8f7418688d36db9e4422a
Author: Petr Vesely <22935437+veselypeta@users.noreply.github.com>
Date: Tue Jun 6 12:16:21 2023 +0100
[SYCL][PI][UR] Fix pi2ur sampler return info (#9693)
pi2ur was missing a conversion from UnifiedRuntime sampler info values
to valid PI sampler Info values. This PR implements a valid conversion
between these values.
commit 2ab86f1149b7965bf352d2604bf9c95d98c0b350
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Tue Jun 6 04:15:13 2023 -0700
[CI] Include check-libdevice to BUILD LIT checks (#9743)
commit 835ced6c88de821f9c3d97138153828845d3e631
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Mon Jun 5 21:36:53 2023 -0700
[CI] Align installation steps between Linux/Windows (#9746)
* Use LLVM_INSTALL_UTILS=ON on Windows
* Move clang-{format,tidy} installation into its own step
* Reorder lines to match between Linux/Windows
commit 09f76e8afd2ffcd988cc490aebc775a304cd23a6
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date: Mon Jun 5 19:35:18 2023 -0700
Bump requests from 2.28.1 to 2.31.0 in /llvm/utils/git (#9560)
Bumps [requests](https://github.com/psf/requests) from 2.28.1 to 2.31.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/psf/requests/releases">requests's
releases</a>.</em></p>
<blockquote>
<h2>v2.31.0</h2>
<h2>2.31.0 (2023-05-22)</h2>
<p><strong>Security</strong></p>
<ul>
<li>
<p>Versions of Requests between v2.3.0 and v2.30.0 are vulnerable to
potential
forwarding of <code>Proxy-Authorization</code> headers to destination
servers when
following HTTPS redirects.</p>
<p>When proxies are defined with user info (<a
href="https://user:pass@proxy:8080">https://user:pass@proxy:8080</a>),
Requests
will construct a <code>Proxy-Authorization</code> header that is
attached to the request to
authenticate with the proxy.</p>
<p>In cases where Requests receives a redirect response, it previously
reattached
the <code>Proxy-Authorization</code> header incorrectly, resulting in
the value being
sent through the tunneled connection to the destination server. Users
who rely on
defining their proxy credentials in the URL are <em>strongly</em>
encouraged to upgrade
to Requests 2.31.0+ to prevent unintentional leakage and rotate their
proxy
credentials once the change has been fully deployed.</p>
<p>Users who do not use a proxy or do not supply their proxy credentials
through
the user information portion of their proxy URL are not subject to this
vulnerability.</p>
<p>Full details can be read in our <a
href="https://github.com/psf/requests/security/advisories/GHSA-j8r2-6x86-q33q">Github
Security Advisory</a>
and <a
href="https://nvd.nist.gov/vuln/detail/CVE-2023-32681">CVE-2023-32681</a>.</p>
</li>
</ul>
<h2>v2.30.0</h2>
<h2>2.30.0 (2023-05-03)</h2>
<p><strong>Dependencies</strong></p>
<ul>
<li>
<p>⚠️ Added support for urllib3 2.0. ⚠️</p>
<p>This may contain minor breaking changes so we advise careful testing
and
reviewing <a
href="https://urllib3.readthedocs.io/en/latest/v2-migration-guide.html">https://urllib3.readthedocs.io/en/latest/v2-migration-guide.html</a>
prior to upgrading.</p>
<p>Users who wish to stay on urllib3 1.x can pin to
<code>urllib3<2</code>.</p>
</li>
</ul>
<h2>v2.29.0</h2>
<h2>2.29.0 (2023-04-26)</h2>
<p><strong>Improvements</strong></p>
<ul>
<li>Requests now defers chunked requests to the urllib3 implementation
to improve
standardization. (<a
href="https://redirect.github.com/psf/requests/issues/6226">#6226</a>)</li>
<li>Requests relaxes header component requirements to support bytes/str
subclasses. (<a
href="https://redirect.github.com/psf/requests/issues/6356">#6356</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/psf/requests/commit/147c8511ddbfa5e8f71bbf5c18ede0c4ceb3bba4"><code>147c851</code></a>
v2.31.0</li>
<li><a
href="https://github.com/psf/requests/commit/74ea7cf7a6a27a4eeb2ae24e162bcc942a6706d5"><code>74ea7cf</code></a>
Merge pull request from GHSA-j8r2-6x86-q33q</li>
<li><a
href="https://github.com/psf/requests/commit/302225334678490ec66b3614a9dddb8a02c5f4fe"><code>3022253</code></a>
test on pypy 3.8 and pypy 3.9 on windows and macos (<a
href="https://redirect.github.com/psf/requests/issues/6424">#6424</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/b639e66c816514e40604d46f0088fbceec1a5149"><code>b639e66</code></a>
test on py3.12 (<a
href="https://redirect.github.com/psf/requests/issues/6448">#6448</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/d3d504436ef0c2ac7ec8af13738b04dcc8c694be"><code>d3d5044</code></a>
Fixed a small typo (<a
href="https://redirect.github.com/psf/requests/issues/6452">#6452</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/2ad18e0e10e7d7ecd5384c378f25ec8821a10a29"><code>2ad18e0</code></a>
v2.30.0</li>
<li><a
href="https://github.com/psf/requests/commit/f2629e9e3c7ce3c3c8c025bcd8db551101cbc773"><code>f2629e9</code></a>
Remove strict parameter (<a
href="https://redirect.github.com/psf/requests/issues/6434">#6434</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/87d63de8739263bbe17034fba2285c79780da7e8"><code>87d63de</code></a>
v2.29.0</li>
<li><a
href="https://github.com/psf/requests/commit/51716c4ef390136b0d4b800ec7665dd5503e64fc"><code>51716c4</code></a>
enable the warnings plugin (<a
href="https://redirect.github.com/psf/requests/issues/6416">#6416</a>)</li>
<li><a
href="https://github.com/psf/requests/commit/a7da1ab3498b10ec3a3582244c94b2845f8a8e71"><code>a7da1ab</code></a>
try on ubuntu 22.04 (<a
href="https://redirect.github.com/psf/requests/issues/6418">#6418</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/psf/requests/compare/v2.28.1...v2.31.0">compare
view</a></li>
</ul>
</details>
<br />
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
commit 9a82d283ae2bbc092e796571ec9defbc7eb9c4a6
Author: Michael Toguchi <michael.d.toguchi@intel.com>
Date: Mon Jun 5 18:34:02 2023 -0700
[Driver][SYCL] Fix optimization option processing for device options (#9703)
When using -O0, we imply -cl-opt-disable for device. This was
incorrectly being implied when we were overriding with an optimization
enabling option (-O0 -O2). Fix the logic.
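The intended rule can be modelled as "the last optimization option wins". The sketch below is a simplified illustration under that assumption and is not the Clang driver code; the function name is hypothetical.

```python
def implies_cl_opt_disable(args):
    """Return True only if the last -O<level> option on the command line is -O0."""
    last_opt = None
    for arg in args:
        # Later optimization options override earlier ones.
        if arg.startswith("-O"):
            last_opt = arg
    return last_opt == "-O0"


print(implies_cl_opt_disable(["-O0"]))
print(implies_cl_opt_disable(["-O0", "-O2"]))
print(implies_cl_opt_disable(["-O2", "-O0"]))
```

In this model, `-O0 -O2` no longer implies `-cl-opt-disable`, while `-O2 -O0` still does.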
commit 0ae900a9a6784f45833784b9f4262d622733a789
Author: Artur Gainullin <artur.gainullin@intel.com>
Date: Mon Jun 5 17:16:36 2023 -0700
[SYCL] Fix kernel-bundle-merge-options-env.cpp test
The test is supposed to check that options provided through the driver via
-Xsycl-target-linker and -Xsycl-target-frontend get overridden by options
provided through the env variables SYCL_PROGRAM_COMPILE_OPTIONS and
SYCL_PROGRAM_LINK_OPTIONS. The test author used dummy options called "-bar"
and "-bar_compile" to check that they are overridden. But those are
not treated as dummy options; they are parsed as the real option "-b",
which was not the original intent.
After the following commit in llorg, which removes the "-b" option from the driver:
commit 89d71c1efa85656b54bcd79b4278bc67690480e1
Author: Fangrui Song <i@maskray.me>
Date: Fri May 26 15:30:23 2023 -0700
[Driver] Reject AIX-specific link options on non-AIX targets
the test started to fail. So, replace those options with "-DBAR" and
"-DBAR_COMPILE" respectively.
commit 1a3e99307e4f754d300a14f4dec8111322644d85
Author: Steffen Larsen <steffen.larsen@intel.com>
Date: Mon Jun 5 16:51:57 2023 +0100
[SYCL] Add missing SYCL 2020 image is_property_of specializations (#9652)
This commit adds specializations of is_property_of for
property::image::use_host_ptr, property::image::use_mutex, and
property::image::context_bound with unsampled_image and sampled_image.
Likewise, this commit adds specializations of is_property_of for
property::no_init with unsampled_image_accessor, sampled_image_accessor,
host_unsampled_image_accessor and host_sampled_image_accessor.
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
commit c953fed97ef68a15efa82b5e211809be8695a0da
Author: jinge90 <ge.jin@intel.com>
Date: Mon Jun 5 22:45:25 2023 +0800
[SYCL][libdevice] Add libdevice lit test to check 'double' usage for fp32 spirv file on-double spirv file(#9711)
commit ef1f8462cf270a925b34b7c35da7e2f29c654355
Author: Steffen Larsen <steffen.larsen@intel.com>
Date: Mon Jun 5 13:29:14 2023 +0100
[SYCL][NFC] Remove unused parameter in preScreenAccessor (#9737)
Addresses post-commit failure after
https://github.com/intel/llvm/pull/9634
Signed-off-by: Larsen, Steffen <steffen.larsen@intel.com>
commit 0c0809590f378925415e6fd317867b8123aaf0e6
Author: mmoadeli <mahmoud.moadeli@codeplay.com>
Date: Mon Jun 5 08:35:05 2023 +0100
[SYCL] Allow accessor constructed with zero-size buffers (#9634)
* Allow accessor constructed with zero-size buffers. [Clarify behaviour
for range of zero](https://github.com/KhronosGroup/SYCL-Docs/pull/192)
* Remove existing error disallowing it.
* Add test
---------
Co-authored-by: Steffen Larsen <steffen.larsen@intel.com>
commit 2648b7c5e1a4af8e0ffded1c431b79813fb24777
Author: Artur Gainullin <artur.gainullin@intel.com>
Date: Fri Jun 2 16:10:24 2023 -0700
[SYCL] Rename win_proxy_loader to pi_win_proxy_loader (#9724)
Co-authored-by: Dale <stewart.t.dale@intel.com>
commit 4c5521c9edae675bff012c367cf53b457068f039
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Fri Jun 2 14:58:00 2023 -0700
[CI] Fix pre-commit job dependencies on Windows (#9727)
Bug-fix for https://github.com/intel/llvm/pull/9709.
commit 35171b3c360092299bd43b3ad10ba885254ed805
Author: Erich Keane <erich.keane@intel.com>
Date: Fri Jun 2 14:00:14 2023 -0700
Finish fixing 2nd SemaSYCL test due to diag change.
My previous commit for SemaSYCL seemingly missed 1 spot, this
patch fixes that one too.
commit f874ec8410fd6bf94b996df0e32ca2087addcec5
Author: Erich Keane <erich.keane@intel.com>
Date: Fri Jun 2 13:45:22 2023 -0700
Fix 2 sycl tests: SemaSYCL/loop_fusion.cpp, SemaSYCL/fpga_pipes.cpp
Two tests failed because the diagnostic message format changed,
but emission of it was not updated. This patch corrects that.
commit 12dd0ad040ea61f1201fa9d82efd5079ce7dc6ca
Author: Byoungro So <byoungro.so@intel.com>
Date: Fri Jun 2 11:39:53 2023 -0700
[SYCL] Free allocated memory to avoid memory leak (#9722)
We just need to call free() to avoid memory leak.
Signed-off-by: Byoungro So <byoungro.so@intel.com>
commit c6500e41fdc02545ae1867e9c3a868734ecc62c2
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Fri Jun 2 08:03:34 2023 -0700
Revert "[SYCL][CI] Cancel in-progress pre_commit job when PR is updated (#9706)" (#9721)
This reverts commit 1db96de9f9b394fbed0b8953849108f255dd31d7.
CI seems to be stuck after this PR has been merged.
commit 11ac7300305669b6e23bbac03c8c1fe0214cac8e
Author: Justin Cai <justin.cai@intel.com>
Date: Fri Jun 2 00:28:50 2023 -0700
[SYCL] Add support for scalar logical operators with group algorithms (#9298)
commit e33c2f666e3b5fc873c23e08963ca71c5fc39509
Author: tovinkere <vasanth.tovinkere@intel.com>
Date: Thu Jun 1 23:49:09 2023 -0700
[XPTI] CMakeFiles fix to support independent build of XPTI (#9262)
There have been requests from tools implementors to be able to
independently build the XPTI proxy
library. The existing CMakeFiles.txt has issues that prevent this and
needed to be addressed.
---------
Signed-off-by: Vasanth Tovinkere <vasanth.tovinkere@intel.com>
commit e45834c363d0c26d9c461455ea9654fb1ff947eb
Author: rdeodhar <rajiv.deodhar@intel.com>
Date: Thu Jun 1 23:47:14 2023 -0700
[SYCL] [L0] Test adjustment for Windows (#9658)
Explicitly enable a default context so that all queues use that context
and immediate command list recycling happens as expected.
commit 260182a1ad758994a652b4241bbe22f6f13cc003
Author: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Date: Thu Jun 1 20:36:03 2023 -0700
[SYCL][UR][L0] Clean up events on queue wait (#9643)
After the last command in an in-order queue has completed, clean up the
rest of the events so they are available for later reuse.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
commit a4283b33744d095743015f44a90afd003c2564ae
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Thu Jun 1 20:32:20 2023 -0700
[SYCL][CI][WIN] Skip some checks depending on what files have changed (#9709)
Follow-up for #9589, implementing the same behavior on Windows.
commit 1db96de9f9b394fbed0b8953849108f255dd31d7
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Thu Jun 1 20:30:57 2023 -0700
[SYCL][CI] Cancel in-progress pre_commit job when PR is updated (#9706)
commit 66b2e89172001c8e9bc60f402b811e7b41e43e0a
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Thu Jun 1 20:20:52 2023 -0700
[SYCL][CI] Improve compression performance (#9675)
This was originally implemented in
https://github.com/intel/llvm/pull/5678.
Start with Linux only for now. Benchmarking several compression
utilities for time/size:
| | Pack time | Upload time | Size |
| ------- | --------- | ----------- | ------ |
| xz | 5m 20s | 1m 30s | 350 MB |
| lz4 | 3s | 3m 10s | 660 MB |
| zstd -9 | 25s | 2m 4s | 467 MB |
The difference in size between xz/lz4 would result in 1m30s -> 3m
increase in artifacts upload time so the pack time gain would be
partially offset by that.
I don't see a way to get data about unpack from the CI, but locally on a
different machine (and likely with a different build) I had this:
| | Pack time | Unpack time |
| ---- | --------- | ----------- |
| xz | 28m 30s | 1m 13s |
| lz4 | 11s | 6s |
| zstd | 1m 22s | 8s |
Based on the data above we're switching to use `zstd -9` as our
compression algorithm.
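The choice can be checked with a short calculation over the CI table above. The sketch below adds pack and upload time per tool; it deliberately ignores unpack time and artifact size, and the helper name is hypothetical.

```python
def to_seconds(text):
    """Convert a duration such as '5m 20s' or '3s' to seconds."""
    total = 0
    for part in text.split():
        if part.endswith("m"):
            total += 60 * int(part[:-1])
        elif part.endswith("s"):
            total += int(part[:-1])
        else:
            raise ValueError(f"unexpected duration component: {part}")
    return total


# (pack time, upload time) copied from the CI measurements above.
CI_TIMES = {
    "xz": ("5m 20s", "1m 30s"),
    "lz4": ("3s", "3m 10s"),
    "zstd -9": ("25s", "2m 4s"),
}

totals = {
    tool: to_seconds(pack) + to_seconds(upload)
    for tool, (pack, upload) in CI_TIMES.items()
}
fastest = min(totals, key=totals.get)
print(totals, fastest)
```

On these numbers `zstd -9` has the lowest combined pack and upload time, consistent with the decision stated above.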
commit 856ad1d77927ddef77a2a8e6ec5ed43eeb4b75eb
Author: fineg74 <61437305+fineg74@users.noreply.github.com>
Date: Thu Jun 1 15:46:40 2023 -0700
[ESIMD][E2E] Temporarily disable -ffast-math option for 7 LIT tests (#9660)
This PR is a workaround for tests failing when compiled with icpx and
succeeding when compiled with clang++. The root cause of that behavior
is the fast-math option, which is enabled by default when using icpx and
disabled by default when using clang++. As a workaround, the affected
tests will be compiled with the no-fast-math option.
commit 43d20039920ee187b379781188148fd4cccf6786
Author: Nick Sarnie <sarnex@users.noreply.github.com>
Date: Thu Jun 1 18:20:09 2023 -0400
[SYCL][ESIMD][E2E] Fix ext_math_ieee_sqrt_div on emulator (#9680)
Similar to the other ext_math tests, this needs -fno-fast-math as well.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
commit 27755824d050679127580ea7a7baf28cea38d91b
Author: Nick Sarnie <sarnex@users.noreply.github.com>
Date: Thu Jun 1 17:45:43 2023 -0400
[SYCL][ESIMD] Fix gather/scatter with accessors when passing scalar (#9674)
This regressed in
https://github.com/intel/llvm/commit/d04ebb03c1c891077974622c99027a72bad34b71
when we added a template arg. Since we have a template arg, we won't
also call the constructor.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
commit ac1c91e533ebffd8f0629c9c072ea91a807fcf0d
Author: Kseniya Tikhomirova <kseniya.tikhomirova@intel.com>
Date: Thu Jun 1 21:44:31 2023 +0200
[SYCL] Fix post commit fail related to std::unique_lock CTAD in unit tests (#9698)
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
commit 57187f6f14c1a9e9ed669bcfb2432f4ebfc90dbb
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Thu Jun 1 09:36:36 2023 -0700
[SYCL][CI] Add zstd to our build image (#9681)
I will remove the unneeded package (lz4 and/or zstd) once we settle on
which one is best for our use.
commit 01d7fc097ec6b5e380db1a07b6caee475e1c695f
Author: Maksim Sabianin <maksim.sabianin@intel.com>
Date: Thu Jun 1 18:26:15 2023 +0200
[SYCL] Remove redundant sycldevice support (#9653)
commit 712138f6d84f45c14b7a6fb4dd1432a8b3aa1949
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Thu Jun 1 07:57:34 2023 -0700
[SYCL][CI] Create nightly container based on the "build" image (#9685)
I plan to use it in post commit to merge two builds
[linux_default](https://github.com/intel/llvm/blob/ac8408c4761180835fb23ccd5183efd5c5c37d95/.github/workflows/sycl_post_commit.yml#L26-L38)
and
[self_build](https://github.com/intel/llvm/blob/ac8408c4761180835fb23ccd5183efd5c5c37d95/.github/workflows/sycl_post_commit.yml#L39-L51)
into one.
I can also imagine how we can use that in place of [HIP/CUDA image for
E2E
tests](https://github.com/intel/llvm/blob/24955697d9f08c0bc7e1f2b80182c7d967f53b70/.github/workflows/sycl_gen_test_matrix.yml#L10-L17)
for PRs that only update E2E tests.
commit cebe7da1e21072d158c089e258a28ffe7e951a7a
Author: jinz2014 <7799920+jinz2014@users.noreply.github.com>
Date: Thu Jun 1 10:43:06 2023 -0400
[SYCL][HIP] Display the backend name in intel-ext-device.cpp (#9688)
commit 54fcf80f2351a75281f627f9d80b9a86e686c6fc
Author: Kseniya Tikhomirova <kseniya.tikhomirova@intel.com>
Date: Thu Jun 1 16:38:40 2023 +0200
[SYCL] Fix and reenable unit test for xpti_trace (#9587)
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
commit 1dce70f413e686bc6fe3af30f99f478c954ee35f
Author: Justin Cai <justin.cai@intel.com>
Date: Thu Jun 1 06:15:30 2023 -0700
[SYCL] Enable proper behavior of optional kernel features with SYCL_EXTERNAL (#9611)
Currently, the code generated from a translation unit with a declaration
of a `SYCL_EXTERNAL` function with a `[[sycl::device_has(...)]]`
attribute, but with no definition of that function, is a LLVM module
with a declaration of the function but with no `sycl_declared_aspects`
metadata. Because of this, `SYCLPropagateAspectsPass` does not propagate
any used aspect information to functions that (transitively) call a
`SYCL_EXTERNAL` function. This causes `sycl-post-link` to fail to split
kernels that call `SYCL_EXTERNAL` functions with different required
aspects.
With this PR, the `sycl_declared_aspects` metadata is now attached to a
`SYCL_EXTERNAL` function even if there is no definition (in the same
translation unit). Additionally, `SYCLPropagateAspectsPass` now collects
aspects information for function declarations.
commit 1bae4b76f88bdee7c37d6f11b75cefe6f1a494eb
Author: Sven van Haastregt <sven.vanhaastregt@arm.com>
Date: Wed May 31 12:47:07 2023 +0100
Use clang to generate compile_commands (#2031)
Ensure the code-formatting job uses clang to generate
compile_commands.json, to avoid passing GCC-specific flags to
clang-format or clang-tidy.
Original commit:
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/commit/c2ff406
commit 353f349fa7f689963f4cc59faa710c290522650e
Author: Nick Sarnie <sarnex@users.noreply.github.com>
Date: Tue May 30 06:42:59 2023 -0400
Skip spirv decoration metadata with --spirv-preserve-auxdata (#2013)
It's already explicitly handled for forward and reverse translation,
and it's a bit complicated to handle MDNode metadata. Just skip it so we don't assert.
If I see this come up in more cases I will add support for MDNode metadata.
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Original commit:
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/commit/89d658c
commit 23a3ea0775149b04daba041de55c150785d2f101
Author: Dmitry Sidorov <dmitry.sidorov@intel.com>
Date: Sun May 28 18:41:04 2023 +0200
Relax consumer checks for checksum info (#2011)
It's a follow up for
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/1996
since I couldn't update the PR
Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
Original commit:
https://github.com/KhronosGroup/SPIRV-LLVM-Translator/commit/8cbf726
commit e4ad410f1eeb38659f959bca24d74547e8871274
Merge: fdd609a5c724 d9a9f60248dc
Author: sys_ce_bb <sys_ce_bb@intel.com>
Date: Thu Jun 1 06:04:54 2023 -0700
Merge remote-tracking branch 'origin/sycl-web' into llvmspirv_pulldown
commit fdd609a5c724a69e24ac1a80fdea6b34714660c0
Author: Kseniya Tikhomirova <kseniya.tikhomirova@intel.com>
Date: Thu Jun 1 12:05:43 2023 +0200
[SYCL][ABI-break] Add code_location parameter to the rest of sycl::queue methods (#9603)
code_location helps improve error reporting and makes it possible to
identify the exact code lines of a failed command submission.
---------
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
commit 7618dffd78ae8456df9885c35d200604748233ec
Author: mmoadeli <mahmoud.moadeli@codeplay.com>
Date: Thu Jun 1 09:03:48 2023 +0100
[SYCL] Lost data during implicit conversion in local and host accessors. (#9669)
* Fix local_accessor and host_accessor lost data during implicit
conversion.
* Add relevant test.
commit 4eaaaa963ca2f58358ea0897d30374cf9928b80b
Author: Kseniya Tikhomirova <kseniya.tikhomirova@intel.com>
Date: Thu Jun 1 10:03:33 2023 +0200
[SYCL] Enable xpti::node_create signal emit for parallel_for that bypasses graph (#9565)
The xpti::node_create signal is emitted when a new node is created in the
graph. The code related to it is in Command::emitInstrumentationData and
in Command successors. However, there is a path where no memory
dependencies are tracked for a kernel (e.g. queue::parallel_for): to
speed up kernel enqueue and eliminate extra overhead, the node is not
added to the graph and the related Command is not created either. This
commit makes the node_create signal be emitted in this case as well.
---------
Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova@intel.com>
commit 9e5889918277e921ef8c4724fe22ab6d638fdfb4
Author: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
Date: Wed May 31 22:41:15 2023 -0500
[ESIMD][DOC] Update description of accessor-based memory APIs (#9582)
ESIMD now supports `local accessor`, the `get_pointer()` and `operator[]`
methods of the accessor class, and a new `slm_allocator` class to reserve
extra SLM for local needs.
This patch also describes some existing restrictions of the `slm_init`
function.
---------
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
commit d3aaccc7561b3664fb2a039f6a32629c65fc9d05
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Wed May 31 16:05:55 2023 -0700
[SYCL][CI] Skip some checks depending on what files have changed (#9589)
I'm using https://github.com/dorny/paths-filter to implement it.
I decided to call it from `sycl_precommit.yml` so that we can
potentially re-use its results between Linux/Windows tasks but that
might have its own drawbacks. I don't see a possibility to just pass the
result of the job between workflows (`sycl_precommit` ->
`sycl_linux_build_and_test`) which means that for every value I have to
thread it carefully via latter's inputs. That might complicate things in
future if we'd want to run just the modified end-to-end tests instead of
all of them.
Another approach would be to run the job inside
`sycl_linux_build_and_test` so that I'd have immediate access to its
output from anywhere in the workflow.
commit f110fd73f8e7e51d3b0eb0595162f129ea74cb21
Author: Byoungro So <byoungro.so@intel.com>
Date: Wed May 31 15:46:46 2023 -0700
[SYCL] Avoid unnecessary kernel retain (#9557)
We should retain the kernel only for OpenCL backend.
Signed-off-by: Byoungro So <byoungro.so@intel.com>
commit ac8408c4761180835fb23ccd5183efd5c5c37d95
Author: Joshua Cranmer <joshua.cranmer@intel.com>
Date: Wed May 31 17:47:32 2023 -0400
[SYCL][OpaquePtrs] Convert some sycl tests to opaque pointers. (#9536)
This does not fix all of the lit tests that fail with opaque pointers
enabled, but it does fix those where the test is looking for IR whose
form has changed with opaque pointers enabled.
commit 24955697d9f08c0bc7e1f2b80182c7d967f53b70
Author: Dmitry Vodopyanov <dmitry.vodopyanov@intel.com>
Date: Wed May 31 21:03:56 2023 +0200
[SYCL] Revert regression for atomic64 after #9561 (#9625)
Fixes regression introduced in https://github.com/intel/llvm/pull/9561
by reverting the affected code
commit d9a9f60248dc73b975e19c634cf6790db0473bf0
Merge: 182ec5bb2718 a88f496f8f3b
Author: Gainullin, Artur <artur.gainullin@intel.com>
Date: Wed May 31 14:30:11 2023 -0400
Merge from 'main' to 'sycl-web' (54 commits)
CONFLICT (content): Merge conflict in clang/lib/Sema/Sema.cpp
commit 182ec5bb2718e2676a616fc5a0ceaf2a339b50ff
Merge: 6532d2ee8b34 f9b489c7a88b
Author: iclsrc <ia.compiler.tools.git@intel.com>
Date: Wed May 31 10:53:04 2023 -0700
Merge from 'sycl' to 'sycl-web' (6 commits)
commit f9b489c7a88b3b130f22678de79d5cf4f00d6b2c
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Wed May 31 10:10:06 2023 -0700
[SYCL][CI] Add lz4 to our build image (#9677)
commit 6532d2ee8b347a4f1e3c4db29229822e2f2865be
Merge: 916980317aa1 33ee5c466346
Author: Gainullin, Artur <artur.gainullin@intel.com>
Date: Wed May 31 12:57:09 2023 -0400
Merge from 'main' to 'sycl-web' (82 commits)
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaDeclAttr.cpp
CONFLICT (content): Merge conflict in clang/lib/Sema/SemaType.cpp
commit b793a58559a21d89b2c6ef9a3ad2953597be3e17
Author: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Date: Wed May 31 09:31:06 2023 -0700
[SYCL][UR][L0] Fix unused parameter (#9670)
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
commit 06ed924eb112a001c7397c5fcee0b8a8f4ed08dd
Author: JackAKirk <jack.kirk@codeplay.com>
Date: Wed May 31 17:05:29 2023 +0100
[SYCL][CUDA] Check make_device doesn't create duplicate sycl::device (#9373)
Check make_device doesn't create duplicate sycl::device.
Migration of https://github.com/intel/llvm-test-suite/pull/1419
Tests https://github.com/intel/llvm/pull/7550. Checks that make_device
doesn't return a duplicate sycl::device if one already exists.
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
commit a88f496f8f3baa6c3b15532e37e3bdbb1c4ea0d0
Author: Kazu Hirata <kazu@google.com>
Date: Wed May 31 08:59:35 2023 -0700
[Sema] Remove unused function getFloat128Identifier
The last use was removed by:
commit bb1ea2d6139a72340b426e114510c46d938645a6
Author: Nemanja Ivanovic <nemanja.i.ibm@gmail.com>
Date: Mon May 9 08:52:33 2016 +0000
Differential Revision: https://reviews.llvm.org/D151608
commit 8e728adcfedd97fbc3759b5533d0cbada6b68aa6
Author: Marco Elver <elver@google.com>
Date: Wed May 31 17:57:07 2023 +0200
Revert "[compiler-rt] Avoid memintrinsic calls inserted by the compiler"
This reverts commit 4369de7af46605522bf7dbe3bc31d00b0eb4bee6.
Fails on Mac OS with "sanitizer_libc.cpp:109:5: error: aliases are not
supported on darwin".
commit fc8acb563ae019735e646f9964b254cab1efd529
Author: Caroline Concatto <caroline.concatto@arm.com>
Date: Wed May 31 14:12:08 2023 +0000
[Clang][SVE2.1] Add clang support for builtins using svcount_t
In this patch it is used for the prototype:
* svptrue_c8 (and _c16/_c32/_c64)
As described in: https://github.com/ARM-software/acle/pull/257
Patch by: Sander de Smalen <sander.desmalen@arm.com>
Reviewed By: sdesmalen, david-arm
Differential Revision: https://reviews.llvm.org/D150953
commit 71d5a94985c9569467c1ef8a62b8b326ee2036a6
Author: Peter Klausler <pklausler@nvidia.com>
Date: Thu May 25 16:01:52 2023 -0700
[flang] Don't fold SIZE()/SHAPE() into expression referencing optional dummy arguments
When computing the shape of an expression at compilation time as part of
folding an intrinsic function like SIZE(), don't create an expression that
increases a dependence on the presence of an optional dummy argument.
Differential Revision: https://reviews.llvm.org/D151737
commit 660e4530124356442ff63d61b1f6dcb9c1def7e6
Author: Nikita Popov <npopov@redhat.com>
Date: Wed May 31 10:10:47 2023 +0200
[KnownBits] Also test 1-bit values in exhaustive tests (NFC)
Similar to what we do with ConstantRanges, also test 1-bit values
in exhaustive tests, as these often expose special conditions.
This would have exposed the assertion failure fixed in D151788
earlier.
commit 6eef8d9b2bbfdb3920b6eeafc939a2d62ad5295b
Author: Kazu Hirata <kazu@google.com>
Date: Wed May 31 08:45:29 2023 -0700
[RISCV] Fix an unused variable warning
llvm-project/llvm/lib/Target/RISCV/RISCVISelLowering.cpp:3793:7:
error: unused variable 'XLenVT' [-Werror,-Wunused-variable]
commit d6a36619cec44d02a2a3526eceb2ac128d90e030
Author: Simon Pilgrim <llvm-dev@redking.me.uk>
Date: Wed May 31 15:33:44 2023 +0100
[X86] X86FixupVectorConstantsPass - use VBROADCASTSS/VBROADCASTSD for integer vector loads on AVX1-only targets
Matches behaviour in lowerBuildVectorAsBroadcast
commit f29f1c7e23d555c95a199f8e77fefe87e91664cf
Author: Mark de Wever <koraq@xs4all.nl>
Date: Sun May 28 14:23:12 2023 +0200
[libc++]{CI] Bumps clang-tidy version used.
The CI can no longer run with clang-tidy 16, so increment it to version 17.
Whether permanently moving to the latest development version is being
discussed on Discourse.
Depends on D149455
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D151628
commit cf64668b8c414c60aec12cdd7374ea053fc99411
Author: Mark de Wever <koraq@xs4all.nl>
Date: Fri Apr 28 17:38:47 2023 +0200
[libc++][test] Prefers the newer clang-tidy version.
Modules require Clang 17, since Clang 16 requires the magic # __FILE__
line. Therefore, if available, use clang-tidy 17 too. This change should
be reverted after LLVM 17 is released.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D149455
commit 5d4281d5493c7a2fc09d9ac9fc5b374676a4d8af
Author: Mark de Wever <koraq@xs4all.nl>
Date: Thu May 25 21:59:25 2023 +0200
[libc++] Gives ignore external linkage.
A slightly different fix is in D144994.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D151490
commit ac7d60f73a4a369fb4dcce734d54cb38fde80981
Author: Mark de Wever <koraq@xs4all.nl>
Date: Tue May 23 17:14:20 2023 +0200
[libc++] Fixes use-after move diagnostic.
The diagnostic is issued by clang-tidy 17.
This just suppresses the diagnostic. The move operations are non-standard extensions and the class itself is deprecated.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D151223
commit 7578672c96e18feb5982192e595459b2a65867cf
Author: Dave Lee <davelee.com@gmail.com>
Date: Sat May 20 10:05:44 2023 -0700
[lldb] Override GetVariable in ValueObjectSynthetic (NFC)
Make `GetVariable` a passthrough function to the underlying value object in `ValueObjectSynthetic`.
Differential Revision: https://reviews.llvm.org/D151384
commit 42e98c6ae875e952ee852f78234c0f8ed311472b
Author: Nikita Popov <npopov@redhat.com>
Date: Wed May 31 10:16:16 2023 +0200
[APInt] Support zero-width extract in extractBitsAsZExtValue()
D111241 added support for extractBits() with zero width. Extend this
to extractBitsAsZExtValue() as well for consistency (in which case
it will always return zero).
Differential Revision: https://reviews.llvm.org/D151788
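The zero-width case can be illustrated on a plain 64-bit integer. This is a toy stand-in, not the APInt implementation; the function name and the argument order (value, width, position) are assumptions made for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a zero-extending bit extract on a 64-bit value.
// Extracts `num_bits` bits starting at `bit_position`.
// A zero-width extract is allowed and always yields 0.
uint64_t extract_bits_as_zext(uint64_t value, unsigned num_bits,
                              unsigned bit_position) {
    assert(bit_position <= 64 && num_bits <= 64 - bit_position);
    if (num_bits == 0)
        return 0; // nothing to extract; also avoids a shift by 64 below
    const uint64_t shifted = value >> bit_position;
    if (num_bits == 64)
        return shifted; // full width: the mask computation would shift by 64
    return shifted & ((uint64_t{1} << num_bits) - 1);
}
```

Handling width 0 up front means callers do not need a special case of their own, which is the consistency argument the commit message makes.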
commit 3825910c7316cf62549bd31c503c48e7526adcc2
Author: Nico Weber <thakis@chromium.org>
Date: Wed May 31 11:12:32 2023 -0400
[gn] port 4369de7af466
commit cb463c34dd4c3ad2ac6c13f98edcf684a3fcbe38
Author: Dave Lee <davelee.com@gmail.com>
Date: Fri May 26 21:19:10 2023 -0700
[lldb] Take StringRef name in GetChildMemberWithName (NFC)
`GetChildMemberWithName` does not need a `ConstString`. This change makes the function
take a `StringRef` instead, which alleviates the need for callers to construct a
`ConstString`. I don't expect this change to improve performance, only ergonomics.
This is in support of Alex's effort to replace `ConstString` where appropriate.
There are related `ValueObject` functions that can also be changed, if this is accepted.
Differential Revision: https://reviews.llvm.org/D151615
commit e0df106818ccb90dc46c5296ed5ef2eda75564ff
Author: Paul Scoropan <1paulscoropan@gmail.com>
Date: Tue May 30 15:07:44 2023 +0000
[Flang] Move several definitions to IntrinsicCall header for code cleanliness and reusability
In the future we intend to add support for many PowerPC-specific intrinsics that ideally will exist in a separate new PPCIntrinsicCall file. But first we need to move definitions to the IntrinsicCall header file to increase code cleanliness and readability and to make code reusable for when we add PPCIntrinsicCall.
Reviewed By: vzakhari
Differential Revision: https://reviews.llvm.org/D151715
commit 572cfa3fde5433c889b339e9cfa6dfaa23e5f2ee
Author: Florian Hahn <flo@fhahn.com>
Date: Wed May 31 16:00:57 2023 +0100
[LV] Use SCEV for uniformity analysis across VF
This patch uses SCEV to check if a value is uniform across a given VF.
The basic idea is to construct SCEVs where the AddRecs of the loop are
adjusted to reflect the version in the vectorized loop (Step multiplied
by VF). We construct a SCEV for the value of vector lane 0
(offset 0) and compare it to the expressions for lanes 1 to the last vector
lane (VF - 1). If they are equal, consider the expression uniform.
While re-writing expressions, we also need to catch expressions for which
we cannot determine uniformity (e.g. SCEVUnknown).
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D148841
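The lane-0-versus-other-lanes comparison can be illustrated numerically. The sketch below is a brute-force stand-in, not the patch's method: the real change compares SCEV expressions symbolically, whereas this toy evaluates a scalar expression per lane. The function name and the assumption of an induction variable starting at 0 with step 1 are illustrative.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Toy numeric stand-in for the uniformity check. `expr` maps the scalar
// induction variable to a value. In the vectorized loop, vector iteration
// `k` covers scalar iterations k*vf + lane for lane in [0, vf), assuming a
// start of 0 and a step of 1.
bool is_uniform_across_vf(const std::function<int64_t(int64_t)> &expr,
                          int64_t vf, int64_t vector_iterations) {
    assert(vf >= 1 && vector_iterations >= 0);
    for (int64_t k = 0; k < vector_iterations; ++k) {
        const int64_t lane0 = expr(k * vf); // value seen by lane 0
        for (int64_t lane = 1; lane < vf; ++lane)
            if (expr(k * vf + lane) != lane0)
                return false; // some lane disagrees with lane 0
    }
    return true;
}
```

For example, `i / 4` is uniform for VF = 4 (every lane of one vector iteration sees the same quotient) but not for VF = 8, and the induction variable itself is uniform only for VF = 1.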
commit 4369de7af46605522bf7dbe3bc31d00b0eb4bee6
Author: Marco Elver <elver@google.com>
Date: Tue May 30 11:59:22 2023 +0200
[compiler-rt] Avoid memintrinsic calls inserted by the compiler
D135716 introduced -ftrivial-auto-var-init=pattern where supported.
Unfortunately this introduces unwanted memset() for large stack arrays,
as shown by the new tests added for asan and msan (tsan already had this
test).
In general, the problem of compiler-inserted memintrinsic calls
(memset/memcpy/memmove) is not new to compiler-rt, and has been a
problem before.
To avoid introducing unwanted memintrinsic calls, we redefine
memintrinsics as __sanitizer_internal_mem* at the assembly level for
most source files automatically (where sanitizer_common_internal_defs.h
is included).
In a few cases, redefining a symbol in this way causes issues for
interceptors, namely the memintrinsic interceptors themselves. For such
source files we have to selectively disable the redefinition.
Other alternatives have been considered, but simply do not work well in
the context of compiler-rt:
1. Linker --wrap: this does not work because --wrap only
applies to the final link, and would not apply when building
sanitizer static libraries.
2. Changing references to memset() via objcopy: this may work,
but due to the complexities of the build system, introducing
such a post-processing step for the right object files (in
particular object files defining memset cannot be touched)
seems infeasible.
The chosen solution works well (as shown by the tests). Other libraries
have chosen the same solution where nothing else works (see e.g. glibc's
"symbol-hacks.h").
v2:
- Fix ubsan_minimal build where compiler decides to insert
memset/memcpy: ubsan_minimal has to work without RTSanitizerCommonLibc,
therefore do not redefine the builtins.
- Fix definition of internal_mem* functions with compilers that want the
aliased function to already be defined before.
- Fix definition of __sanitizer_internal_mem* functions with compilers
more pedantic about attribute placement around extern "C".
Reviewed By: vitalybuka, dvyukov
Differential Revision: https://reviews.llvm.org/D151152
commit 26d7b7bb8ff982b6cdcd9bf7538405356135b724
Author: Michael Liao <michael.hliao@gmail.com>
Date: Fri May 26 12:58:12 2023 -0400
[TableGen] Add !getdagarg and !getdagname
- This patch proposes to add `!getdagarg` and `!getdagname` bang
operators as the inverse operation of `!dag`. They allow us to examine
arguments of a given dag.
Reviewed By: simon_tatham
Differential Revision: https://reviews.llvm.org/D151602
commit e69318138e6cc88becbb8d095b1d2dcf76ac45e1
Author: Philip Reames <preames@rivosinc.com>
Date: Wed May 31 07:48:17 2023 -0700
[RISCV] Use v(f)slide1down for shuffle+insert idiom
This is a follow up to D151468 which added the vslide1down case as a sub-case of vslide1down matching. This generalizes that code into generic mask matching - specifically to point out the sub-vector insert restriction in the original patch. Since the matching logic is basically the same, go ahead and support vslide1up at the same time.
Differential Revision: https://reviews.llvm.org/D151742
commit 5442264744f4e6f925bcb06ae60687ec3c2e9d7f
Author: Nikita Popov <npopov@redhat.com>
Date: Wed May 31 16:39:41 2023 +0200
[InstCombine] Name instructions in test (NFC)
commit 66b9e114326462eb4a7b67dccf36cca875b8791b
Author: myl <yanliang.mu@intel.com>
Date: Wed May 31 22:33:07 2023 +0800
Temporarily add explicit '-O2' for Basic/image/image_read*.cpp to avoid GPU hang issue with O0 optimization. (#9664)
commit 6ef3efc9c46591e94165533f461ac5a17adc527d
Author: aelovikov-intel <andrei.elovikov@intel.com>
Date: Wed May 31 07:32:48 2023 -0700
[SYCL][CI] Fuse self-build and no-asserts build (#9655)
Co-authored-by: Alexey Bader <alexey.bader@intel.com>
commit f9b523ebc367f1535bf61797383471e567b24b75
Author: Kazu Hirata <kazu@google.com>
Date: Wed May 31 07:30:14 2023 -0700
[Analysis] Remove unused class LegacyAARGetter
The last use was removed by:
commit fa6ea7a419f37befbed04368bcb8af4c718facbb
Author: Arthur Eubanks <aeubanks@google.com>
Date: Mon Mar 20 11:18:35 2023 -0700
Once we remove it, createLegacyPMAAResults and createLegacyPMAAResults
become unused, so this patch removes them as well.
Differential Revision: https://reviews.llvm.org/D151787
commit 8634b43a03945971c2939833ac686728bee5a760
Author: Fangrui Song <i@maskray.me>
Date: Wed May 31 07:19:44 2023 -0700
[ELF][RISCV] --wrap=foo: Correctly update st_value(foo)
With --wrap=foo, we may have `d->file != file` for a defined symbol `foo`.
For the object file defining `foo`, its symbol table may not contain
`foo` after `redirectSymbols` changed the `foo` entry to `__wrap_foo` (see D50569).
Therefore, skipping `foo` with the condition `if (!d || d->file != file)` may
cause `__wrap_foo` not to be updated. See `ab.o w.o --wrap=foo` in the new test
(originally reported by D150220).
We could adjust the condition to `if (!d)`, but that would leave many `anchors`
entries if a symbol is referenced by many files. Switch to iterating over
`symtab` instead.
Note: D149735 (actually not NFC) allowed duplicate `anchors` entries and fixed
`a.o bw.o --wrap=foo`.
Reviewed By: jobnoorman
Differential Revision: https://reviews.llvm.org/D151768
commit e9c9d54cf5959fa020cf76e47ced4575793f6d60
Author: Vyacheslav Klochkov <vyacheslav.n.klochkov@intel.com>
Date: Wed May 31 09:16:30 2023 -0500
[ESIMD][LIT] Fix usage of -fno-fast-math and -fno-slp-vectorize with cl (#9661)
clang-cl driver does not understand -fno-fast-math and
-fno-slp-vectorize. Usage of those options requires adding "/clang:"
before the option.
Signed-off-by: Vyacheslav N Klochkov <vyacheslav.n.klochkov@intel.com>
commit 408f4196ba4ac66328ebfcf41cb372572257c4f6
Author: Tom Eccles <tom.eccles@arm.com>
Date: Wed May 17 16:07:41 2023 +0000
[flang] use greedy mlir driver for stack arrays pass
In upstream mlir, the dialect conversion infrastructure is used for
lowering from one dialect to another: the passes are of the form
XToYPass. Whereas, transformations within the same dialect tend to use
applyPatternsAndFoldGreedily.
In this case, the full complexity of applyPatternsAndFoldGreedily isn't
needed so we can get away with the simpler applyOpPatternsAndFold.
This change was suggested by @jeanPerier
The old differential revision for this patch was
https://reviews.llvm.org/D150853
Re-applying here fixing the issue which led to the patch being reverted. The
issue was from erasing uses of the allocation operation while still iterating
over those uses (leading to a use-after-free). I have added a regression
test which catches this bug for -fsanitize=address builds, but it is
hard to reliably cause a crash from the use-after-free in normal builds.
Differential Revision: https://reviews.llvm.org/D151728
commit 543705641adb1d3533be141947264ca1b7b04479
Author: Paul Robinson <paul.robinson@sony.com>
Date: Wed May 31 06:43:27 2023 -0700
[Headers][doc] Fix typo in avx2intrin.h doc
commit f6a631d4060c5b539fd51b7221205ee05ec50ee8
Author: Jan Sjodin <jan_sjodin@yahoo.com>
Date: Tue May 30 14:28:12 2023 -0500
[MLIR] Remove dependency on omp dialect in LLVM dialect.
This fixes a buildbot failure where the dependency on the omp dialect
in the LLVM dialect caused an error. Instead of accessing the interface
defined in the omp dialect we directly access the attributes
instead. To make this work the IsDeviceAttr is removed and replaced
with a BoolAttr instead.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D151745
commit e5399f1d7cabfca90030ca03f52818e892aa389f
Author: Paul Robinson <paul.robinson@sony.com>
Date: Tue May 30 13:30:12 2023 -0700
[Headers][doc] Add shuffle-like intrinsic descriptions to avx2intrin.h
Differential Revision: https://reviews.llvm.org/D151749
commit 0a3dc73e700b4a37bc435bf7c02213161b27f54a
Author: Dmitry Makogon <d.makogon@g.nsu.ru>
Date: Wed May 31 20:23:19 2023 +0700
[Test] Move LoopStrengthReduce/pr62563.ll to X86 specific test folder (NFC)
The test case is X86 specific. Should unblock buildbots after 253e3e2.
commit 6bcbb3af059b05056c7343cafd99004d4cd4cd35
Author: Florian Hahn <flo@fhahn.com>
Date: Wed May 31 14:22:44 2023 +0100
[ConstraintElim] Move logic to remove stack entry to helper (NFC).
Preparation for follow-up patch that uses the logic in a separate place.
commit 97f0e7b06e6b76fd85fb81b8c12eba2255ff1742
Author: Nikita Popov <npopov@redhat.com>
Date: Wed May 31 14:53:44 2023 +0200
[AA] Fix comparison of AliasResults (PR63019)
Comparison between two AliasResults implicitly decayed to comparison
of AliasResult::Kind. As a result, MergeAliasResults() ended up
considering two PartialAlias results with different offsets as
equivalent.
Fix this by adding an operator== implementation. To stay
compatible with extensive use of comparisons between AliasResult
and AliasResult::Kind, add an overload for that as well, which
will ignore the offset. In the future, it would probably be a
good idea to remove these implicit decays to AliasResult::Kind
and add dedicated methods to check for specific AliasResult kinds.
Fixes https://github.com/llvm/llvm-project/issues/63019.
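The decay problem and the shape of the fix can be sketched with a simplified class. `ToyAliasResult` and `decayed_equal` are hypothetical names for illustration; this is not LLVM's actual `AliasResult`, only a model with a kind and an offset.

```cpp
#include <cassert>

// Simplified, hypothetical model of an alias result: a kind plus an offset
// that is only meaningful for PartialAlias.
class ToyAliasResult {
public:
    enum Kind { NoAlias, MayAlias, PartialAlias, MustAlias };

    ToyAliasResult(Kind kind, int offset = 0) : kind_(kind), offset_(offset) {
        assert(kind == PartialAlias || offset == 0);
    }

    // Implicit decay to Kind. Without the operator== overloads below,
    // `a == b` would convert both sides and compare kinds only.
    operator Kind() const { return kind_; }

    int offset() const { return offset_; }

    // Full comparison: kind and offset must both match.
    bool operator==(const ToyAliasResult &other) const {
        return kind_ == other.kind_ && offset_ == other.offset_;
    }
    bool operator!=(const ToyAliasResult &other) const {
        return !(*this == other);
    }

    // Comparison against a bare Kind deliberately ignores the offset.
    bool operator==(Kind kind) const { return kind_ == kind; }
    bool operator!=(Kind kind) const { return kind_ != kind; }

private:
    Kind kind_;
    int offset_;
};

// What the comparison used to amount to: both sides decay to Kind.
bool decayed_equal(const ToyAliasResult &a, const ToyAliasResult &b) {
    return static_cast<ToyAliasResult::Kind>(a) ==
           static_cast<ToyAliasResult::Kind>(b);
}
```

Two `PartialAlias` results with offsets 4 and 8 compare equal under `decayed_equal` but unequal under `operator==`, while comparisons against a bare `Kind` keep working and ignore the offset.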
commit 4d64ffa94170eadd79954e2a5f13d1f1d16e9e2c
Author: Nikita Popov <npopov@redhat.com>
Date: Wed May 31 14:55:11 2023 +0200
[GVN] Add test for PR63019 (NFC)
commit ce97312d109b21acb97d3ea243e214f20bd87cfc
Author: Arnaud Bienner <arnaud.bienner@gmail.com>
Date: Wed May 31 10:54:27 2023 +0200
Implement BufferOverlap check for sprintf/snprintf
Differential Revision: https://reviews.llvm.org/D150430
commit 916980317aa18cd55727feae689026d4bd5a23e2
Merge: 606c74d747f2 0000fa6a925e
Author: iclsrc <ia.compiler.tools.git@intel.com>
Date: Wed May 31 05:37:05 2023 -0700
Merge from 'sycl' to 'sycl-web'
commit 0b42ee46b06fb9fb396eca8b335166d8e92b70cd
Author: LLVM GN Syncbot <llvmgnsyncbot@gmail.com>
Date: Wed May 31 12:30:10 2023 +0000
[gn build] Port 26bda9e95a9d
commit dd2fea9c23e6dabd83d3f4ee7d000ceb16cace55
Author: Thorsten Schütt <schuett@gmail.com>
Date: Thu May 25 17:47:00 2023 +0200
[GlobalIsel][X86] Legalize G_CTLZ and G_CTPOP for 32-bit
Note that 32-bit support is very limited
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D151459
commit 344e91a6f00840e67fc03bcfeca6c34fa6d34b17
Author: Nico Weber <thakis@chromium.org>
Date: Wed May 31 08:17:44 2023 -0400
[gn] port 301eb6b68f3 (AttrTokenKinds.inc)
commit 64bd5bbb9bbb72de5f59755c74dae4b4881d93d5
Author: rikhuijzer <rikhuijzer@pm.me>
Date: Wed May 31 14:13:08 2023 +0200
[mlir] Avoid tensor canonicalizer crash on negative dimensions
Fixes #59703.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D151611
commit c76a3e795ef6bd5262b5860ebcc902fab3fab607
Author: Guillaume Chatelet <gchatelet@google.com>
Date: Wed May 31 12:06:45 2023 +0000
[libc][NFC] Fixing various typos
commit 0000fa6a925ef8d0fcd97c1765a7f24b85110610
Author: JackAKirk <jack.kirk@codeplay.com>
Date: Wed May 31 13:02:04 2023 +0100
[SYCL][CUDA] opportunistic_group, fixed_size_group, and ballot_group impls. (#9280)
This basic cuda support does not include any algorithm support.
Algorithm support will follow in a later PR.
S…
rsandifo-arm
left a comment
Some comments below, but LGTM otherwise. Once the SME2 stuff is in, I think we should consolidate the intrinsics that are common between SME2 and SVE2p1, rather than duplicating them. I agree the current form makes sense until then though.
main/acle.md
Outdated
// _u64base_u8, _u64base_u16, _u64base_s16, _u64base_u32, _u64base_s32,
// _u64base_u64, _u64base_s64
// _u64base_bf16, _u64base_f16, _u64base_f32, _u64base_f64
svint8_t svld1q_gather[_u64base_s8](svbool_t pg, svint64_t zn, const void *rm);
I think we should provide the same addressing modes as for LDNT1 gather:
- `svld1q_gather[_u64base]_xx(svbool_t pg, svuint64_t zn)` (note `svuint64_t` rather than `svint64_t`)
- `svld1q_gather[_u64base]_offset_xx(svbool_t pg, svuint64_t zn, int64_t offset)`
- `svld1q_gather[_u64base]_index_xx(svbool_t pg, svuint64_t zn, int64_t index)`
- `svld1q_gather_[u64]offset[_xx](svbool_t pg, const xx_t *base, svuint64_t offset)`
- `svld1q_gather_[u64]index[_xx](svbool_t pg, const xx_t *base, svuint64_t index)` for 16-bit, 32-bit and 64-bit `xx`
I imagine we should do the same for the ST1Q scatter quadword, correct?
Yeah, same thing there.
main/acle.md
Outdated
// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64
// _bf16, _f16, _f32, _f64
void svst2q[_u8](svbool_t pg, uint8_t *rn, svuint8x2_t zt);
@CarolineConcatto Is there a reason why the pointers for the structured quad-word stores use uint8_t *, instead of the int8_t * for the svld2q, etc?
The type is meant to vary with the suffix, so it's uint8_t * for the [_u8] function shown, and would be int8_t * for the [_s8] version.
Doh! Of course, silly me. :)
main/acle.md
Outdated
| #### LD1Q | ||
| Gather Load Quadword. |
There is only an unscaled variant of this instruction, so maybe don't have both offset and index?
For the other SVE load and store intrinsics, we tried to provide a consistent interface and set of addressing modes. So the deciding factor wasn't so much whether the call mapped to a single instruction, but whether the underlying instruction could easily emulate the mode. “Single instruction” is a bit of a nebulous concept anyway for loads and stores, since a single C address expression might need several operations to compute.
Since scaling is just a shift left, I think it's worth providing both index and offset variants.
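To make the scaling point concrete, here is a minimal scalar sketch in plain C (not ACLE code). It assumes the usual SVE convention that an `_offset` argument is a byte offset while an `_index` argument counts elements. The helper names `index_to_offset` and `gather_addr` are hypothetical and only model the address arithmetic, not the load itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert an element index into a byte offset.
 * elem_shift is log2 of the element size in bytes
 * (1 for 16-bit, 2 for 32-bit, 3 for 64-bit elements). */
static inline uint64_t index_to_offset(uint64_t index, unsigned elem_shift)
{
    assert(elem_shift <= 3);
    return index << elem_shift;
}

/* Hypothetical model of one gather address: base plus an unscaled byte
 * offset, the only form the instruction is said to provide. */
static inline uint64_t gather_addr(uint64_t base, uint64_t byte_offset)
{
    return base + byte_offset;
}
```

Under these assumptions, an `_index` form is the `_offset` form applied to `index_to_offset(index, shift)`, so emulating it costs one extra shift per vector of indices.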
main/acle.md
Outdated
| #### ST1Q | ||
| Scatter store quadwords. |
There is only an unscaled version of this instruction? So maybe don't have both index and offset?
main/acle.md
Outdated
| // Variants are also available for: | ||
| // _s8, _u16, _s16, _u32, _s32, _u64, _s64 | ||
| svuint8_t svpmov_lane_u8_z(svbool_t pn); |
s/ svuint8_t svpmov_lane_u8_z(svbool_t pn);/ svuint8_t svpmov_u8_z(svbool_t pn);/
ThomasBamelis
left a comment
With the increased use of x4 vectors in 2.1, would it be the right time to introduce svreinterpret variants for x4 types as well?
For data rearrangement, loads/stores and element-wise bit manipulation, changing the element size can come in quite handy.
As described in: ARM-software/acle#257 Patch by: David Sherwood <david.sherwood@arm.com> Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D150961
As described in: ARM-software/acle#257 Reviewed By: hassnaa-arm Differential Revision: https://reviews.llvm.org/D151081
As described in: ARM-software/acle#257 Patch by: Sander de Smalen <sander.desmalen@arm.com> Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D151197
As described in: ARM-software/acle#257 Patch by: Sander de Smalen <sander.desmalen@arm.com> Reviewed By: dtemirbulatov Differential Revision: https://reviews.llvm.org/D151199
As described in: ARM-software/acle#257 Patch by: David Sherwood <david.sherwood@arm.com> Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D151307
Patch by: David Sherwood <david.sherwood@arm.com> As described in: ARM-software/acle#257 Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D151433
As described in: ARM-software/acle#257 Patch by: David Sherwood <david.sherwood@arm.com> Reviewed By: dtemirbulatov Differential Revision: https://reviews.llvm.org/D151439
As described in: ARM-software/acle#257 Patch by: Kerry McLaughlin <kerry.mclaughlin@arm.com> Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D151461
As described in: ARM-software/acle#257 Patch by: Rosie Sumpter <rosie.sumpter@arm.com> Reviewed By: dtemirbulatov Differential Revision: https://reviews.llvm.org/D151709
This patch implements the builtins in Clang and the LLVM-IR intrinsics for the following:
// Variants are also available for:
// _s8, _s16, _u16, _s32, _u32, _s64, _u64,
// _f16, _f32, _f64
uint8x16_t svaddqv[_u8](svbool_t pg, svuint8_t zn);
// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64
uint8x16_t svandqv[_u8](svbool_t pg, svuint8_t zn);
uint8x16_t sveorqv[_u8](svbool_t pg, svuint8_t zn);
uint8x16_t svorqv[_u8](svbool_t pg, svuint8_t zn);
// Variants are also available for:
// _s8, _u16, _s16, _u32, _s32, _u64, _s64
uint8x16_t svmaxqv[_u8](svbool_t pg, svuint8_t zn);
uint8x16_t svminqv[_u8](svbool_t pg, svuint8_t zn);
// Variants are also available for _f32, _f64
float16x8_t svmaxnmqv[_f16](svbool_t pg, svfloat16_t zn);
float16x8_t svminnmqv[_f16](svbool_t pg, svfloat16_t zn);
According to PR #257 [1], the reduction instructions take scalable vectors as input and produce fixed vectors as output, so we changed SVEEmitter to emit fixed vector types in case the NEON header (arm_neon.h) is not present.
[1] ARM-software/acle#257
Co-author: Dinar Temirbulatov <dinar.temirbulatov@arm.com>
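As a rough illustration of why these reductions return a fixed-width vector, the sketch below models an ADDQV-style reduction on `uint8_t` data in plain C. It is an assumption based on the prototypes above, not confirmed by this thread: the scalable input is treated as a sequence of 128-bit segments, lane `i` of every segment is summed into lane `i` of a single 128-bit result, and inactive elements contribute zero. The name `addqv_u8_model` and the one-flag-per-element predicate are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUAD_BYTES 16 /* one 128-bit segment holds 16 uint8_t lanes */

/* Hypothetical scalar model of an ADDQV-style reduction on uint8_t data.
 * zn holds vl_bytes elements (a multiple of 16) and pg holds one flag per
 * element. out receives the 16 lanes of the fixed-width result. */
static void addqv_u8_model(const bool *pg, const uint8_t *zn,
                           size_t vl_bytes, uint8_t out[QUAD_BYTES])
{
    assert(vl_bytes % QUAD_BYTES == 0);

    for (size_t lane = 0; lane < QUAD_BYTES; ++lane)
        out[lane] = 0;

    /* Element i belongs to lane (i % 16) of its 128-bit segment. */
    for (size_t i = 0; i < vl_bytes; ++i) {
        if (pg[i]) {
            size_t lane = i % QUAD_BYTES;
            out[lane] = (uint8_t)(out[lane] + zn[i]); /* wraps modulo 256 */
        }
    }
}
```

In this model the result size does not depend on `vl_bytes`, which matches the commit message's description of scalable inputs and fixed outputs.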
This patch changes the following intrinsics:
```
svst1uwq[_{d}] replaced by svst1wq[_{d}]
svst1uwq_vnum[_{d}] replaced by svst1wq_vnum[_{d}]
svst1udq[_{d}] replaced by svst1dq[_{d}]
svst1udq_vnum[_{d}] replaced by svst1dq_vnum[_{d}]
```
Drops 'u' from the quadword stores because it is simply truncating the
quadwords to 32 bits
```
svextq_lane[_{d}] replaced by svextq[_{d}]
```
EXTQ follows the previously defined EXT intrinsics
```
svdot[_{d}_{2}_{3}] replaced by svdot[_{d}_{2}]
```
Introduced with the latest SME2 ACLE change
[1] ARM-software/acle#257
main/acle.md
Outdated
| // _s8, _s16, _u16, _s32, _u32, _s64, _u64 | ||
| // _bf16, _f16, _f32, _f64 | ||
| svuint8_t svextq_lane[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm); | ||
| svuint8_t svextq[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm); |
Why are we dropping the _lane part here?
Richard pointed out that the other EXT intrinsics do not have `_lane` in their names:
// Variants are also available for:
// _s8, _s16, _u16, _s32, _u32, _s64, _u64
// _bf16, _f16, _f32, _f64
svuint8_t svextq_lane[_u8](svuint8_t zdn, svuint8_t zm, uint64_t imm);
@rsandifo-arm, 3 weeks ago:
I'm not sure these should be lane intrinsics. The instructions are really a form of permutation. (FWIW, the corresponding non-Q intrinsics don't have the _lane suffix.)
OK.
Hello @CarolineConcatto, you have forgotten the DUPQ instruction for sve2p1. The prototype will look like this: This is different from the svdupq_lane intrinsic, and they have different behaviour.
I merged the SVE2.1 and SME2 intrinsics into one section, but I am not sure that is the best approach.
This patch adds new intrinsics and types for supporting SVE2.1.
rsandifo-arm
left a comment
This version seems to add the shared SVE2.1/SME intrinsics back into the SME section (with __arm_streaming attributes). Is that deliberate?
I think we should only document each intrinsic once, as in the previous version. It's just that the relationship between streaming/non-streaming/streaming-compatible and SME/SME2/SVE2/SVE2.1 can't be expressed directly using attributes (and so needs to be specified in words instead).
No, they should not be in the SME section with the streaming attribute.
rsandifo-arm
left a comment
LGTM apart from the typo below.
main/acle.md
Outdated
| non-zero or __ARM_FEATURE_SME2 are non-zero. | ||
| For convenience, these the intrinsics for these instructions are listed in | ||
| the following section. | ||
| For convenience, the intrinsics fo these instructions are listed in the |
| For convenience, the intrinsics fo these instructions are listed in the | |
| For convenience, the intrinsics for these instructions are listed in the |
sallyarmneale
left a comment
One very minor comment.
This patch adds new intrinsics and types for supporting SVE2.1. It depends on pull request #217, because some intrinsics in this specification are also in that pull request.
Depends on: #217