From 2bbf2cb0a8356ff5d969db59a7201287d8829c1b Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 12 Aug 2024 04:59:51 -0700 Subject: [PATCH 01/30] [SYCL][Doc] Release Notes for Jul'24 release --- sycl/ReleaseNotes.md | 2062 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 2062 insertions(+) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index bb592c570db9..506bd62d8ce6 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -1,3 +1,2065 @@ +# Release notes Jul'24 + +Release notes for commit range +[d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) +... +[ebb3b4a21b3b0e](https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436) + +## New Features + +### SYCL Compiler + +- Added `-fsycl-range-rounding` command line option which allows controlling + the range rounding feature. Compared with the previously available + `-fsycl-disable-range-rounding` command line option and + `__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__` macro, the new flag also allows + _forcing_ range rounding, which completely disables generation of + non-rounded kernels, thus reducing binary size. intel/llvm#12715 +- Added `-fsycl-exp-range-rounding` command line option that enables + experimental range rounding mode in which range rounding is performed across + all dimensions. intel/llvm#12690 +- Added support for the so-called [new offloading model](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/OffloadDesign.md). + It can be enabled by `--offload-new-driver` command line option and should + allow us to improve link time by reducing the number of external processes and + temporary files used by the compiler. **Do we need to list PRs here?** There + were many of them and some of them were merged in scope of a previous release. 
+- Added `-fsycl-fp64-conv-emu` command line option which allows to enable + partial (only conversion operations are supported) emulation of `double` data + type. This mode is only supported by Intel GPUs. intel/llvm#13912 +- Introduced `__PTX_VERSION__` macro that corresponds to the PTX version used + when compiling NVPTX. intel/llvm#14621 + +commit 5b3e7c8f60f0c66ec92f2a19b43ec147d40bd5ed + [SYCL][New offload driver][LLVM-SPIRV] Send all translator options to linker wrapper (#13394) +commit cd24d808382598565f9ec3d9e8faf3e83bfa9aa4 + [Driver][SYCL] Pass full set of sycl-post-link options to linker wrapper (#13648) +commit ece73ad61b49eaf9ecb6e2060e5f20e09e26def6 + [NewOffloadModel][SYCL DeviceLib] Generate SYCL device library objects using new offload model (#13579) +commit 90c659e4a04e0c74a0eb99d4c72d79c8bc0f783e + [Driver][SYCL][NewOffloadModel] Hook up -fsycl-device-only behaviors (#13672) +commit 91b840c2c44cbc10149f2c77aa4094d6c73bd73f + [Driver][SYCL][NewOffloadModel] Hook up -fsycl-device-obj support (#13688) +commit a8609a5925a3fcb2bd85636702556d15ae5574f4 + [New offload][llc] Pass -relocation-model=pic option to llc when building shared libraries (#13687) +commit 16007fa8be4292159f0b19e5fb911b90e3f84aa4 + [SYCL][Device libs][New offload] Add missing fallback SYCL device library files (#13869) +commit 5ddc6881a4f5c2ee5f0ccbbd873e57a62bceb30d + [Driver][SYCL][NewOffloadModel] Improve arch association for device (#13898) +commit 7439fb46f1469cf401d89bf203f91ea22bc7ee57 + [Driver][SYCL][NewOffloadModel] Hook up options for the offload-wrapper (#14001) +commit 3aee3dbc23d2981c049fdfa6b12b7f759062326d + [Driver][SYCL][NewOffload] Update option passing for packager and AOT (#14066) +commit c2cdfccf4aa90a6da35826c55815f680db3b3805 + [New offload driver][sycl-post-link] Move sycl-post-link target specific options generation to linker wrapper (#14101) +commit 934b46f2fff602bcbe7c99c13a1dd7ed6955ae4f + [Driver][SYCL][NewOffload] Fix duplication of device targets 
(#14143) +commit 6ecce4fdab98b31a897af23444ca4d636af89862 + [New offload driver][Device lib] Add SYCL device library files for all targets (#14102) +commit f9fd95ec6c2aa4b77b0503ae1a7a82d6747df105 + [Driver][SYCL][NewOffloadModel] Incorporate -device settings for GPU (#14151) +commit 9691782beff5456c297063223ff831d54a8cd624 + [SYCL][NFC][New offload model][llvm-spirv] Refactor llvm-spirv options generation for enabling correct use under new offload model (#14253) +commit 3e474e050206e234759115a6442cdd5fb084d3f6 + [New offload model] Cleanup the way sycl-post-link options are generated (#14177) +commit fe2b47f08bee73be8bd978c71f3946852e94d790 + [SYCL][AOT][New offload model] Add AOT support in clang-linker-wrapper for Intel CPUs/GPUs (#14252) +commit 1c13e6f5e6bae9df42b483852c60631609422043 + [SYCL][ClangLinkerWrapper] Unconditionally pass -properties to sycl-post-link (#14541) + +### SYCL Library + +- Added support for JIT-compilation for AMD and NVIDIA backends. intel/llvm#14280 +- Implemented [`sycl_ext_oneapi_prod`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_oneapi_prod.asciidoc) extension. intel/llvm#13555 +- Implemented [`sycl_ext_oneapi_profiling_tag`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_profiling_tag.asciidoc) extension. intel/llvm#12838 +- Implemented [`sycl_ext_oneapi_forward_progress`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc) extension. intel/llvm#13389 +- Implemented [`sycl_ext_oneapi_private_alloca`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc) extension. 
intel/llvm#12966 intel/llvm#13490 intel/llvm#13181 +- Implemented [`sycl_ext_oneapi_enqueue_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc) extension. intel/llvm#13512 +- Added support for `get_backend_info` API to various SYCL classes (`platform`, `context`, etc.). intel/llvm#12906 +- Implemented [`sycl_ext_oneapi_group_load_store`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_group_load_store.asciidoc). + Please note that the implementation is naive and does not expose any special + HW capabilities, so it won't provide any performance benefit over a group + load/store written without this extension using a simple `for` loop and + group barriers. intel/llvm#13043 +- Implemented [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 +- Added initial support for [dynamic linking](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/SharedLibraries.md). + The current implementation lacks support for `kernel_bundle` API and AOT mode. intel/llvm#14587 +- Added initial support for [`sycl_ext_oneapi_free_function_kernels`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc) extension. 
intel/llvm#13207 intel/llvm#13885 + Known limitations: + - free function kernels are only supported if defined at file scope + - `SYCL_EXTERNAL` has to be used alongside `SYCL_EXT_ONEAPI_FUNCTION_PROPERTY` + to define a free function kernel + - the compiler won't emit any diagnostics if some restrictions from the + extension specification are violated + - arguments of free function kernels cannot be composite data types like + structs or SYCL classes like `accessor` + - using `-fsycl-dead-args-optimization` (ON by default) can lead to failures + - `info::kernel::num_args` won't return the right result for free function + kernels + + +### SYCLcompat library + +- Added support for 2-byte and 4-byte `memset` operations. intel/llvm#13409 +- Added `compare`, `compare_both`, `compare_mask` and their `unordered_*` + counterparts. intel/llvm#12998 +- Added APIs for performing arithmetic operations on 33-bit extended values. intel/llvm#13006 +- Added APIs for performing bitwise operations on 33-bit extended values. intel/llvm#13727 +- Added `device_count` and `get_device_id` utility APIs. intel/llvm#14013 +- Added experimental `launch` API overloads that accept sub-group size. intel/llvm#13767 +- Added `wait` and `wait_and_throw` free functions. intel/llvm#13029 +- Added vectorized comparison `extend_vcompare[2|4]` APIs. intel/llvm#14079 +- Added vectorized math `extend_v*2` APIs. intel/llvm#13953 +- Added vectorized math `extend_v*4` APIs. intel/llvm#14078 +- Added bitfield manipulation APIs `bfe_safe` and `bfi_safe`. intel/llvm#14006 +- Added dot-product accumulate APIs `dp4a`, `dp2a_lo` and `dp2a_hi`. intel/llvm#14032 +- Added `wait_and_free` API. intel/llvm#14015 +- Added `filter_device` and `list_devices` APIs. intel/llvm#14016 +- Added `funnelshift_*` APIs. intel/llvm#13825 +- Added `match_[any|all]_over_sub_group` APIs. intel/llvm#12973 +- Added API to manage loading and unloading of kernel libraries. intel/llvm#13053 +- Added `cmul_add` API. 
intel/llvm#12969 +- Added experimental APIs for masked operations over sub-groups (`select`, `shift`, etc.). intel/llvm#12972 + +commit e0d020a74fee74a1fcda97b9a9854ad07bde4eae + [SYCL][COMPAT] Added utility helpers to simplify code translation (#12970) + ??? + +commit 4ade7b71db910a694e1da4d73495fd1903da1622 + [SYCL][COMPAT] Added support for multiple math ops (#13005) + ??? + +### Documentation + +- Added specification for [`sycl_ext_oneapi_group_load_store`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_group_load_store.asciidoc) extension. intel/llvm#7593 +- Added specification for [`sycl_ext_oneapi_work_group_memory`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_work_group_memory.asciidoc) extension. intel/llvm#13725 +- Added [implementation design document](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/PrivateAlloca.md) for [`sycl_ext_oneapi_private_alloca`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc) extension. intel/llvm#13514 +- Added specification for [`sycl_ext_intel_fpga_task_sequence`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_intel_fpga_task_sequence.asciidoc) extension. intel/llvm#6348 +- Added specification for [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 +- Added specification for [`SPV_INTEL_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/spirv-extensions/SPV_INTEL_bindless_images.asciidoc) extension. 
intel/llvm#12927 + + + +commit 9876e19f4ff387b35b0c98c7d62e5f50e6de187d + [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) + bugfix? + +commit 3800814750da51d6da852ce404bde91e1dbe02b8 + [SYCL] Key/Value sorting with fixed-size private array input (#14399) + +commit 7b3f21527abb904cb5c63e9ea32c7f0d65636436 + [SYCL] [ABI-Break] Partial implementation of sycl_ext_oneapi_cuda_cluster_group (#14113) + +commit 8e3b8ce77f41d85687ae3bceedf5d1dc6e0e3155 + [SYCL] Add sorting APIs for fixed-size private array input (#14185) + +commit bd97f283c9f982b89a3347754edf184a38762a4a + [Bindless][Exp] Windows & DX12 interop. Semaphore ops can take values. (#13860) + +commit 3910d0c1393247313c8987b3f68a8d540d940673 + [SYCL] Add support for key/value sorting APIs (#13942) + +commit 5e269c88bcfafd82719d1266a5b8a2bb7b90045d + [SYCL] Initial changes for the second version of sycl_ext_oneapi_group_sort extension (#13908) + +commit 55b547e59a28c4c446a797bb8c51a83156609327 + [SYCL][ESIMD] Introduce load2d/store2d/prefetch2d API that accepts compile time properties (#13046) + +commit d06724a7c304d393500b7edbb84f5c7e59f6b319 + [SYCL][Graph] Specify API for explicit update using indices (#12486) +commit 2bc8b5bc8cbc44cf8ef1deb095c10450348904d8 + [SYCL][Graph] Implementation of explicit update with indices (#12840) + +commit c8ae6c68943b9635cd9822f3c9ee7b5cc8d98acc + [ESIMD][NFC][DOC] Add load/store/prefetch_2d functions, L1/L2 hint combinations(#13218) + +## Improvements + +### SYCL Compiler + +- Improved compilation flow around integration footer when no 3rd-party host + compiler is used. The new compilation flow creates fewer temporary files and + therefore should result in a slightly faster compilation. 
intel/llvm#13607 intel/llvm#14402 +- Added support for `truncf`, `sinpif`, `rsqrtf` and `exp10f` functions in SYCL + kernels as part of [C-CXX-StandardLibrary](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/C-CXX-StandardLibrary.rst) extension. intel/llvm#14132 +- Added support for more IMF functions as part of [C-CXX-StandardLibrary](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/C-CXX-StandardLibrary.rst) extension. intel/llvm#13786 +- Added `-fsystem-debug` command line option to complement existing + `-fno-system-debug`. intel/llvm#13256 +- Improved wording of an error about implicit `this` capture in a kernel. intel/llvm#14100 +- Improved `--save-temps` to work with `-fsycl-host-compiler`. intel/llvm#114751 + +### SYCL Library + +- Added support for `sqrt` and `rsqrt` ESIMD functions for `double` data type. intel/llvm#13254 +- Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation to support cubemap images. intel/llvm#12996 +- Added ESIMD API for dynamic allocation of named barriers. intel/llvm#13826 +- Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation to support sampled image arrays. intel/llvm#14237 +- Added implementation for whole graph update (`executable_command_graph::update`). intel/llvm#13220 +- Added a warning about use of the deprecated `` header. intel/llvm#13569 +- Made `local_accessor::get_pointer` and `local_accessor::get_multi_ptr` throw + an `invalid` exception if they are called on host. intel/llvm#13747 +- Extended detection of nested `queue` operations to support shortcut methods. 
intel/llvm#13659 +- Added overloads of various ESIMD APIs (`atomic_update`, `block_[load|store]` + and some others) which allow omitting some template arguments, thus simplifying + the interface. intel/llvm#14043 intel/llvm#14065 intel/llvm#14000 + intel/llvm#14024 intel/llvm#13978 intel/llvm#13964 intel/llvm#13977 + intel/llvm#13956 intel/llvm#13941 intel/llvm#13920 +- Updated [`sycl_ext_oneapi_bfloat16_math_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bfloat16_math_functions.asciidoc) + extension implementation to support passing vectors of `bfloat16` to math + functions. intel/llvm#14002 intel/llvm#14106 +- Improved performance of `sycl::vec::as` by optimizing implementation of + `sycl::detail::memcpy`. Resolved intel/llvm#7901. intel/llvm#13751 +- Updated implementation to throw SYCL 2020 exceptions instead of legacy + SYCL 1.2.1 exception sub-classes everywhere. intel/llvm#14484 intel/llvm#14545 + intel/llvm#14520 intel/llvm#14485 intel/llvm#14510 intel/llvm#14483 + intel/llvm#14487 intel/llvm#14488 +- Added support for `sycl::vec::convert` to/from `vec`. intel/llvm#14105 +- Deprecated `marray::operator++/--`, `accessor::get_multi_ptr` for + non-device accessors. intel/llvm#13443 +- Moved ESIMD named barrier APIs out of `experimental` namespace. intel/llvm#13704 +- Implemented latest revision of [`sycl_ext_oneapi_free_function_queries`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_oneapi_free_function_queries.asciidoc) + extension. intel/llvm#13257 +- Extended `sycl-ls --verbose` to print device's UUID, information about its + sub- and sub-sub-devices and its architecture. intel/llvm#13999 intel/llvm#13976 +- Added support for compile-time properties to `copy_to` and `copy_from` ESIMD + APIs. 
intel/llvm#13586 +- Switched `experimental::printf` implementation to use non-variadic interface + by default. This should improve usability when printing `float` values on + devices that don't support `fp64` aspect by disabling `float` -> `double` + promotion in `printf` arguments. intel/llvm#13055 +- Added a diagnostic if `slm_init` ESIMD API is called more than once in + a kernel. intel/llvm#12804 +- Updated implementation of + [`sycl_ext_oneapi_device_architecture`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) + extension to return `unknown` enumerator on unsupported HW. intel/llvm#14190 +- Extended list of known Intel GPU architectures available through + [`sycl_ext_oneapi_device_architecture`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) + extension. intel/llvm#13520 +- Extended mechanism to clear in-memory cache in heavy tests to also work on + `opencl` backend. intel/llvm#14119 +- Moved bit shift and rotate ESIMD functions out of `experimental` namespace. + intel/llvm#13545 +- Added check for template argument `N` of `media_block_load` ESIMD API. intel/llvm#13668 + +commit c5b174d8507cad1328b3121e650120e85f1da213 + [SYCL] Implement latest version of sycl_ext_oneapi_free_function_queries (#13257) + +commit 398aa20350aa38d76d9e95a8b76e3858c38faae5 + [SYCL] Support shuffle algorithms for non-uniform groups (#12705) + +commit ebb3b4a21b3b0e977f44434781729df7de83e436 + [SYCL] Remove plugin interface (#14145) + + +### Documentation + +- Added more detailed description of some of ESIMD methods and functions in a + new [`sycl_ext_intel_esimd_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) document. 
intel/llvm#13071 +- Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support cubemap images. intel/llvm#12996 +- Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support sampled image arrays. intel/llvm#14237 +- Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support default-construction of `image_descriptor`. intel/llvm#13781 +- Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) to list restrictions for `atomic_update` functions. intel/llvm#13202 +- Updated [`sycl_ext_oneapi_bfloat16_math_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bfloat16_math_functions.asciidoc) + extension to support vectors of `bfloat16` to be passed to math functions. intel/llvm#14002 +- Updated [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) + extension to clarify behavior of `free_memory` query when there are multiple + processes using the same device. intel/llvm#14640 +- Promoted [`sycl_ext_oneapi_profiling_tag`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_profiling_tag.asciidoc) + extension from proposed into experimental status. 
intel/llvm#14165 +- Promoted [`sycl_ext_oneapi_group_sort`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_group_sort.asciidoc) + extension from proposed into experimental status. intel/llvm#14531 +- Moved [`sycl_ext_oneapi_enqueue_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc) + extension from proposed to experimental status. intel/llvm#14017 +- Updated [`sycl_ext_oneapi_device_architecture`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) + to return new `unknown` enumerator if device architecture cannot be properly + detected. intel/llvm#14077 + +commit ffc0de03f900da2d0262ea8ec41ac3847a1edbcc + [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) + +commit 486b3dd1a2b2924e4445f1e36e5c341a09ba784f + [SYCL][Graph][Doc] Tidy of graph extension design doc (#13065) + +commit 09c93842ffe51602e118504e4e3229d41b2a4fb2 + [SYCL][Graph] Clarify graph enable_profiling property in finalize() (#14067) + +commit ecd3b903f4ddb6b32892f03c326151faa9fa63e8 + [SYCL][Joint Matrix Spec] Add new API for out of bounds fill/load/store (#11172) + +commit f1e66f5f0f59b958ff352a558dbad8b42df63175 + [ESIMD][NFC][DOC] Add fence to the ESIMD SPEC functions (#13135) + +commit 1e2e6baaf86009f0f9067b1146a8ca7923436e60 + [SYCL][Bindless] Add image_mem_handle to image_mem_handle devices copies. (#12449) + +### SYCLcompat + +- Added non-`const` `image2d_max` and `image3d_max` getters. intel/llvm#14138 + +commit 17d2e2d8a483e7a4c33cd542a3c1381b767452bc + [SYCL][COMPAT] Add version & release process (#14457) + +commit d8c0a9342a7e71420883c8a750f89679897b9ca1 + [SYCL][COMPAT] Memory Header cleanup (#13143) + is it really user-visible? 
+ +commit 89eeb02519cfee2f1d88ffac9f07dd131099b7dd + [SYCL][COMPAT] defs.hpp update with Windows macros. SYCLCOMPAT_CHECK_ERROR added. (#13027) + +commit 0b05577790f2a81cb10a41262324ff9558614f09 + [SYCL[COMPAT][CUDA] Impl masked compat shuffles on cuda (#13363) + +commit 13c9d0ef964b17dd3e2c297b1ceb2ecb8ea2ffe9 + [SYCL][Bindless][Doc][ABI-Break] Rename external semaphore destroy to release (#14535) + +commit fb561b9f336f8f9c286a1125631dedf1b5fb1e4b + [SYCL][Bindless][Doc][ABI-Break] Add const qualifiers to copies (#14140) + + +commit 0eeae2ac96ea179099dd5d57c241260ccfe65f73 + [SYCL][Graph] Update design doc for copy optimization and add test (#13051) + +commit 4acca904c0e07fd6b504f7938f539bc1a0e94ce0 + [CLC][AMDGPU] Refactor fence helper to process order semantic explicitly (#12872) + ??? + +commit 13a7b3ad2f229099fe964016f591f17a66b0ea15 + [SYCL] [libdevice] Add vector overloads of ConvertBFloat16ToFINTEL and ConvertFToBFloat16INTEL (#14085) + +commit 0dcad16c36f27e6254e7b831faaad8c6e07f8cfb + [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) + bugfix?? + +commit af65855fa6b6df0eded078bd3dbe3bf4a6a2b2e3 + [SYCL][ESIMD]Replace use of intrinsics with spirv functions (#13553) + do we even need to mention this? 
+commit 990b1d1ba053d60a803ae5e750803ae6583119f9 + [ESIMD]Replace use of vc intrinsic with spirv extension for rdtsc API (#13536) +commit 1f1be9c642889b7c0fd045b073d411e544dc6007 + [SYCL][ESIMD] Move fmax to SPIR-V intrinsic (#14020) + this one is also problematic +commit bcca7a80adf50b04c0991ef48745353ac7829016 + [SYCL][ESIMD] Move a few math operations to SPIR-V intrinsics and support new functions (#13383) + that is a regression, not an improvement :) should be noted in known issues + +commit 1d2007ba7c661322584a60d84a40777e0e0d9567 + [SYCL][COMPAT] kernel_function and kernel_library constexpr constructors (#13932) + if those APIs were added in this release, we should squash two items into one + +commit 74602458d5583cf69ca575a9167def51dad15052 + [SYCL][Bindless] Replace 'image_channel_order' field in 'image_descriptor' with number of channels (#13745) + +commit 83db85f1964338d9ce67bb536f8e6c5eebe8893b + [SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision (#13753) + +commit d2a5e8d095c0176957f5da2c5232d8966f8ff1bf + [SYCL][Matrix] Add generation of spirv.CooperativeMatrixKHR type (#13645) + internal improvement that can be ignored? + +commit 82aaf27f6f0cf97ba89b58f88a18b09e23097afc + [SYCL][Driver] Refactor device config parsing to better match HIP and CUDA targets (#13617) + +commit b11a19b1896cc2f7ab43735aacf265182e22832c + [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) + +commit 9d1cbc51854f19f89105d502db9156b11e4507f4 + [SYCL][COMPAT] nd_range barriers seq_cst by default in supported devices (#12974) + +commit 3756fd1b778ae4ab36bd3988bfdf9ba910b779fd + [ESIMD] Enable FADD/FSUB for slm_atomic_update (#13535) + ??? 
+ +commit c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 + [SYCL][Graph] Update begin_recording and end_recording (#13480) + +commit d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 + [SYCL][Graph][DOC] add an edge case for record&replay mode (#12916) + +commit 89132855d4312536f5f40792194b6251d4cde819 + [SYCL][Joint Matrix] Add a new overload for joint_matrix_apply to be able to return result into a different matrix (#13151) + +commit 8847c110c78684a86ec7e62d7255f1bb9c6efd4f + [SYCL][NATIVECPU][libclc]Mark opencl_c_generic_address_space as unsupported on Native CPU (#13109) + +commit 07e3bcf9f3be46234deb471e25d94b5692353688 + [SYCL][ESIMD] Use LSC for unsupported surface index block stores (#13150) + +commit ed0619b4caa24af8e78053ecef2e5e808e0e2b08 + [SYCL][Joint Matrix] Support 1x64x16 bf16 combination (#13391) + +commit 03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 + [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) + +commit 6cb77fcfb37ffb445ab62ea1545422dc52128da1 + [SYCL] Add -fPIC for Intel math function host code (#13800) + +commit 434d5edfae78307969ade6764e5bafeb17ce5073 + [SYCL] Remove redundant detail::empty_properties_t (#13777) + +commit 84bae21d3f63f04ca50bfffc5203909ba3fd95a6 + Implement missing overloads for generic AS in generic target (#13938) + +commit da379ecfa649a520f49f8adfb97e73c72ff3fb06 + [SYCL] Add support for multiple missing math ops (#13714) + +commit 0f6f57b43afa7ee442744b89e7a034673d58c8d8 + [SYCL][Doc] Fix typos and formating of SYCLCompat README (#13961) + +commit 0d1dd2d2b1e8655b96940edecef84447866e87bc + [SYCL] Add a module flag for device compilations (#13880) + +commit 29b4d855fa1a378e89182795e0d368304c40c3f6 + [SYCL][CUDA] Enable support of msvc math functions for nvptx target. (#14007) + +commit 9f1cee573782772f8d062f6490128c3ee6fa6911 + [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) + +commit b49303c7e13ca0a69454eaaaeb8c3d094916218d + [SYCL][COMPAT] Add Image Max dims to device_info. 
Updated Max ND Range Size (#13973) + +commit db54535fb389331b167807a5d8f1ed16b5695474 + [AMDGPU][SYCL] Make unsafe atomic fadd opt in (#13955) + +commit dce651bd69ea12c935c70990ed3290007a00c6c5 + [SYCL][COMPAT] Migrate bug fixes & refactor of get_*version APIs (#14011) + +commit 4e36825beabb4b4a7435470ac633768dcbd7b376 + [SYCL] Record aspect names when computing device requirements (#13974) + +commit a35f862445b5666c63469cda2656b0a9946df25c + [SYCL][Graph] fix the address pointer in graph print (#13595) + +commit f204869281570959af82fff638df6b34151718f4 + [SYCL] Add sm90a Cuda target architecture support (#14075) + +commit c1b17e00f9b5c51db1f8385435d7a591224b01e0 + [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) + +commit e7defabdcc3d5b460cfc593822156836b874f092 + [SYCL] Use `std::array` as storage for `sycl::vec` on device (#14130) + +commit 0e24ac5677d8d91aed2fcc72d52d9d6b40f5985a +commit ea2111c1a022a1bd7a818ef9796d70d22f3b92d0 + [SYCL] Re-implement diagnostics about virtual calls (#14141) + +commit c2ebf84fd7ffcc8f40dd9eef2aed163437792cd5 + [SYCL] Make `vec` conversion operator to scalar non-template (#14668) + +commit 4240ef0d9db3577b057d27233c5393cc7f6b774e + [SYCL] Add check for valid SYCL triple for NVidia GPUs. (#14673) + +commit 3fdfbfed1ed0062b9f3848a100093b340183c6a3 + [SYCL][NATIVECPU] Support reqd_work_group_size on Native CPU (#13175) + +commit fe1859085b621ea901cd8da81659923122417688 + [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) + related to above? 
+ +commit 9e4768ca9849e7188221c0e2894282730e3b1bde + [SYCL][libclc] Add generic addrspace overloads of math builtins (#13015) + +commit 183832b9cebd471586c0ed251876972939442327 + [SYCL][PI] Add PI_ERROR_UNSUPPORTED_FEATURE error code (#13036) + +commit c1e2957be8db95425f1c17df258a0830c83dcf47 + [CUDA][LIBCLC] Implement RC11 seq_cst for PTX6.0 (#12516) + +commit 73be194fc27cd20968c264afdb71befc181d51ec + [SYCL] Add support for optional kernel features in AOT x86_64 compilation (#14590) +commit f51e43b2f0616934116626dc48c83282a84090ce + [SYCL] Add more aspect information for intel_gpu_* in device config file (#14188) + +commit 7a9d3b1e9483b69baa0b8c6f1097016efd52854c + [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) + +commit d42d90e52d71c16739e26de353f69930cbe1f860 + [SYCL] Change the ext_intel_device_info spec to throw a feature not supported error when a query is not supported (#14576) + +commit 3561c9bb854d35eeb9fc4da3550334faaf316a4f + [SYCL] Add support of more Intel GPU arch versions to sycl_ext_oneapi_device_architecture (#14582) + +commit e51002c81cdf32f383104907cca820e4ed3452ba + [SYCL] Enable intel joint matrix on GNR. 
(#14436) + +commit 21c2e1c2213171d12acb5e6c41a713db30a0d5d4 + [SYCL] Make swizzle mutating operators const friends (#13012) + +commit da02e023e60d89824aad440c4f7bb558e70501a4 + [SYCL] Workaround for seg fault in `vec::convert<>` for OpenCL CPU at O0 (#14498) + +commit 17ee3e24e2874690f7526dcda9d8bc4679fe7edc + [SYCL][NATIVECPU] Add device library and initial subgroup support (#13979) + +commit 0b9fc099f63feadb5e476c5862de3d8fa977a655 + [SYCL][Graph] Test WGU kernel mismatch (#14379) + +commit 0ccb0b7d3dd614707f82ea8f99790e2d3b08496d + [SYCL][ABI-Break] Improve Queue fill (#13788) + +commit 005622d177c9a17dc9defefd507921daf7affc28 + [SYCL][Doc] Update work-group-specific extension (#14271) + +commit 93fef86cd4fb8e18c126365c404eea1ed0f1a7fa + [SYCL][Graph] Permit empty & barrier nodes in WGU (#14236) + +commit 02ac8a414c1fd9b209d139c100cd1bbeae3729d2 + [SYCL][LIBCLC][NATIVECPU] Add checks for fp16 and fp64 in Native CPU libclc (#14242) +commit a25d27bc9fbb2925519e966b9e7043be04274b27 + [SYCL][NATIVECPU][LIBCLC] Implement missing builtins for half type (#13829) + +commit 4151c799ef36f2912fab3f6b9e305240ef4ff327 + [SYCL][Graph] Wait instead of flush dep events in update command (#14167) + +commit 47a03418ac74f3a5492213afc192569eae1393ec + [SYCL][LIBCLC][NATIVECPU] Add aarch64 target triple for Native CPU (#13911) + +commit f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 + [SYCL] Disable in-order queue barrier optimization while profiling (#14123) + +commit e34b7fffedbe9ff73d41b172eb48c189170f99f9 + [Doc] Document Unified Runtime update process (#14097) + +commit 2e1f14adb3bf6d9e9c55e4b0ced9e1ece2172a4a + [SYCL] Fix UB and alignment issues in the SYCL default sorter (#13975) + +commit 4222b4ccd6dc499248c8bf026bcdd0f207000b35 + [SYCL] Restrict `sycl::vec` and swizzle operations to types mentioned in the SPEC (#13947) + +commit 5b6cc5eb7bb2106ff426815702d89569e166c4f9 + [SYCL][Matrix] Enable SPV_KHR_cooperative_matrix extension (#13923) + +commit 
03b994ead80bb381d59b1390f255119b8d211a1f + [SYCL] Add code location information to enqueue free functions (#13924) + +commit 58382507f0c7bd8a5c21e3b7e1d3360f0835f26a + [ESIMD]Add support for double data type to inv API (#13838) + +commit 0ce40f46ef4e2f5e8eed75e28352a90c9b8ecbaf + [SYCL] [NATIVECPU] Implement generic atomic store for generic target (#13428) + +commit ccca3b73769bfd8a27eff9956630fe86a2e4832d + [SYCL] Optimize SG group_store via BlockWriteINTEL in simple cases (#13734) +commit 48a0ff5b4b5bc21dedab37380c4ac93676277f91 + [SYCL] Optimize SG group_load via BlockReadINTEL in simple cases (#13673) + +commit 0a1381d286f7c32a256a6dab49917870769f1238 + [SYCL][Graph] Add wording about arbitrary C++ code in CGFs (#13699) +commit ece19f298b1029121da17a423b801bc2a9267a8d + [SYCL][Graph] Clarify graph in-order and out-of-order properties (#13681) + +commit 67f3bf292ff58136adc6383a3c5a1b19779e4120 + [SYCL][COMPAT] Removed sycl/sycl.hpp include (#13108) + +commit 7d55eb8a8419dac64f065bbf84125ed1d78dc992 + [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) + +commit 771ffa4e967f3058c500c87297c2d1a7be156a9b + [SYCL] Remove get_child_group() (#13482) + +commit e1119d9d2753dc9165e10c2e8c11e222cc549ba9 + [SYCL][ESIMD] Add more compile time checks to rdregion and wrregion API (#13158) + +commit c5cf452d663b96479341daa182c7e305baf542aa + [SYCL][libdevice] Add simple rand for ease of use in device (#13506) + +commit a36e9f8969a5ad4346f84c925aa89e1a00128b7f + [SYCL] APIs cleanup (#13443) + +commit ef6d2bb3caf36eaa1149369f8aee1578d6e31a6e + [SYCL][ESIMD] Add support for transposed prefetch for 1/2 byte elements (#13452) + +commit 5a07640e1ce68584a60b1a0450526e928340d1e0 + [SYCL][ESIMD] Add native FMA function (#13366) + +commit a4fdfdad53c3f1b2e423cdf5f5f0f977ce055593 + [ESIMD][NFC][DOC] Fix misprints and punctuation in esimd-functions doc (#13239) + +commit 0557c7bb247f6b90ce9e389b9bde341f98dca667 + [NFC][SYCL] Use __SYCL2020_DEPRECATED macro for 
any/all builtins (#13237) + +commit 3f841ff1d21bac2601beaf087bc3d09170af6d35 + [SYCL] Deprecate legacy information descriptors (#13279) + +commit 902dadc476617379a8d38206d6f2183b657acf62 + [SYCL] Add alternative to deprecated barrier() function for sub-group (#13276) + +commit fc9d62f6c93a47b5b980e4c2840f349c4b2db93a + [SYCL][AMDGCN] Provide a more helpful --offload-arch error (#13078) + +commit 0c0b58686a79c8d9a8ef547a96b5c1642480e591 + [XPTI][INFRA] Sample E2E data collection timing test for XPTI (#13045) + +commit 24699750a7f816b7ad4ebe19342210693e20a9f3 + [ESIMD][NFC][DOC] Add 'restrictions' section to gather/scatter() doc (#13196) + +commit 8867d446c360048b62064828693f4d50c945a55c + [spir-v][clang] Allow spirv32/spirv64 as target triples for sycl offloading (#13083) +commit b8f394203ec4436ddd31f72193c4c1a52e3747df + [SYCL] Fix device libraries and SYCL headers with spirv64 target (#13288) +commit 8bc909e01ece4e177ae25168995be21f0d37abc6 + [SYCL][libdevice] Build for spirv64 on Linux (#13302) +commit 363fceff578dcfa5a488b89f71f259da80aad2d7 + [SYCL][ESIMD] Don't override target triple to genx64 (#13445) +commit 9bb2b343de3308994892961b0b48838ce7f2e91d + [SYCL][ClangLinkerWrapper] Fix SYCL binary creation with spirv64 triple (#14686) +commit f8926a63ce5a1634cb0533f4ab8eab2b6898caac + [SYCL][Libdevice] Build for spirv64 on Windows (#13649) +commit d0744751abe535c1470ca8833d5dd3b3d1a72c6b + [SPIR-V][Headers] Enable programs that include system headers on Windows for SPIRV32 and SPIRV64 targets (#13548) + +commit 38e663ecd37de513d8e31afdfdf245cf8c9d17f0 + [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) + +commit 66865607bb90f7a7ca7602e5e18d8314659ffba5 + [SYCL][NFC] Remove legacy SYCL_EXT_ONEAPI_MATRIX_VERSION usages (#13235) + +commit 9612159998a1c05525f08f0a6775d875d86da518 + [Driver][SYCL] Cleanup redundant dependency steps (#13217) + +commit c821dc934dc7934b0209b5d3f88a280bbaa7145c + [SYCL] Add support for multiple filtered 
outputs in sycl-post-link (#12727) + merge with other optional kernel features AOT improvements + +commit 3ea29b2a9028b485b76339e16754e3e74c9cc7a6 + [SYCL] Update root_group extension to use `this_work_item` namespace (#13304) + +commit fb66f1b83559366e541381251de4281bb554613d + [SYCL] Replace __builtin_bit_cast with sycl::bit_cast in imf headers (#13313) + is it a bugfix? + +commit 65bdffb1c9d4c474316d3e330fc3c59338e004f6 + [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) + new feature? + +commit d932fcae4aa83d12a3eb30a3003d1718429a9df1 + [SYCL][COMPAT] Extended device_info properties. (#13050) + +commit 05644a470303c2af3385b9533b8d23ebdea99eb7 + [OpenCL] Config dependent-load flag to exclude CWD from DLL search path (#13327) + do we report security issues? + +commit e9befa2d10f6c23a66ac780df7a1ddda55279230 + [SYCL][DebugInfo] Switch to nonsemantic-shader-200 for non-FPGA HW on linux (#13107) + do we need to mention it? + +commit a0d8f01c82dda1ed5227945001a179f97774474f + [SYCL][ESIMD] Move rdtsc function out of experimental namespace (#13417) + +commit 2a1002b9fac9c4b878c6625c3cfafa61dea07ea2 + [SYCL][JIT] Load SYCL JIT lazily (#13433) + +commit 4f5a5f0fba71593888f1737e0b4dbaf49c85e04b + [SYCL] Fix WA for ocl query of CL_DEVICE_PROFILE (#13584) + +commit 893059138f61aabeb0e1063549d7f4dd533fdfd1 + [SYCL][Matrix spec] Add 1x64x16 combination for Intel XMX (PVC only) (#13587) + or rather a new feature? + +commit e17632f32fcc160add43742ccdaa6cc80cc1b0c0 + [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) +commit 67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 + [Driver][SYCL][FPGA] Use bundled device libraries for FPGA targets (#13693) + +commit 1665cc0dd57266d2677c625725d38973cce3e8d9 + [SYCL][Graph] Enable in-order cmd-list (#13088) + +commit d13fdbe4ee02c39b1939bae7da61392e75ce2c78 + [Bindless][Exp] Add texture fetch functionality (#12447) + or a new feature? 
+ +commit fbd10436a5911b12b8d77ba50397a24e6905e7a3 + [Driver][SYCL]Adding 'aoc -vpfp-relaxed' with -fintelfpga and -fp-model=fast (#13651) + +commit 8993f3fc55489023603ceafa631e8f19824979b3 + [SYCL][ESIMD] Use old intrinsic for named_barrier_signal for now (#13255) + does it revert the patch below? +commit d4a9254d764a0ff0be8514a6854afda833a268ce + [SYCL][ESIMD] Use intrinsic for named_barrier_signal (#12982) + ??? + +commit 51ffc04f0f317e0395c678e1fecd654df51db955 + [SYCL][libclc] Add generic addrspace overloads of vload/vstore builtins (#13092) + ???? + +commit 75300ab1ceee835e07086925d990f74107a84a1d + [SYCL][libclc] Add generic fp16 math builtins for generic SPIR-V target (#13361) + ??? + +commit 7271d613156f2268d538f20d92ecd52b1fbc488f + [SYCL][Docs] Add deprecation notice to SPV_INTEL_global_variable_decorations (#13772) + do we really need to mention SPIR-V specs? + +commit 0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 + [SYCL][Joint matrix] clarify the range of the prefetch templated arguments (#13796) + +commit bdaf1e27310dc2218a95f05731a422a32ea5a658 + [libclc] Separate out generic AS support macros (#13792) + ???? + +commit 24a6b3b2f2d2a160a737fb1162c78f4cce9a8f1d + [SYCL] Generate imported symbol files in sycl-post-link (#14189) +commit 62ea97e34e9245fb50f5718861da06e5e4425c2e + [SYCL] Exclude SYCL_EXTERNAL functions from device image with the option -support-dynamic-linking (#14103) + +commit d4f2fe54047a1b415af2402a497f20e918094580 + [SYCL][Bindless][Exp] Remove const from non-reference and non-pointer type parameters (#14238) + +commit 9800153d373eed9bb5d23acf965541ab0a99b316 + [MATRIX][DOC][E2E] Add note on sm version nvidia device issue. 
(#14178) + +commit 2bac63f5ebd62b29c8fe916a89b8b42ae536d609 + [ESIMD] Infer address space of pointer that are passed through invoke_simd to ESIMD API to generate better code on BE (#14628) + +commit 14aabdd3d081fea4ab7f66edc42b4b53eb9c50fe + [SYCL] Throw exception when device does not support queries in sycl_ext_intel_device_info (#14788) + +commit 2442ef047a4e9e9c135beed18a92029e1aad6cad + [DeviceSanitizer] Disable handling no return calls (#14652) + // bugfix? + +commit e38dcdc8bb547f4b63c7b860c1cd9948c090ffc8 + [SYCL] Add compile target to device image properties (#14757) + Not user visible, needs to be merged with other optional kernel features AOT patches + +## Bug Fixes + +### SYCL Compiler + +- Fixed an issue where using the `-fsycl-link-targets` flag would inadvertently trigger some + additional device code linking steps. intel/llvm#13004 +- Fixed a bug where, when AOT-compiling for Intel GPUs, the compiler would pass + some PVC-specific flags even if the target device is not a PVC. intel/llvm#13794 +- Fixed a bug with incorrect file extensions being emitted in AOT compilation + when `--save-temps` is used. intel/llvm#14214 +- Made `-fsycl-add-default-spec-consts-image` available with `clang-cl`. intel/llvm#13168 + +### SYCL Library + +- Fixed a situation where querying + `sycl::ext::oneapi::experimental::info::device` could result in an exception + being thrown instead of an empty vector being returned. intel/llvm#13968 +- Fixed `esimd::atan` implementation under `-ffast-math` flag. intel/llvm#13186 +- Fixed an issue where component devices were not considered to be descendants + of their composite devices when creating a queue. intel/llvm#13513 +- Fixed an issue where querying for + `ext::oneapi::experimental::info::device::composite_device` would *not* + throw an exception if the device is not a component device. intel/llvm#13868 +- Fixed an issue where querying for composite devices could result in some devices + being returned twice.
intel/llvm#14442 +- Fixed a bug in copy-constructor of `config_2d_mem_access` ESIMD class which + would lead to compilation errors. intel/llvm#13632 +- Fixed an issue where use of `atomic_ref` would not be detected as a use + of `atomic64` aspect, leading to errors due to speculative compilation. intel/llvm#14052 +- Fixed `ctanh` and `cexp` returning incorrect values in some edge cases. intel/llvm#14329 +- Fixed a bug where values passed to `-Xs` option through `build_options` + property were not passed down to device compiler when using + [`sycl_ext_oneapi_kernel_compiler`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) + extension. intel/llvm#14522 +- Fixed a bug in + [`sycl_ext_oneapi_kernel_compiler`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) + extension implementation + TODO: add description here. @cperkinsintel . intel/llvm#14490 +- Fixed a bug where defining kernel as a named functor whilst using + `-fno-sycl-unnamed-lambda` would lead to a compilation error about unnamed + lambdas being unsupported. intel/llvm#14614 +- Fixed an issue on CUDA & AMDGPU backends where `multi_ptr` relational + operators taking `std::nullptr_t` would produce a different result compared + to standard C++ helpers like `std::less`. intel/llvm#13201 +- Fixed a compilation issue with `-fpreview-breaking-changes` flag when + `windows.h` is included, caused by a conflict with the `min`/`max` macros. intel/llvm#14260 +- Fixed strict alias violations in `sycl::vec::operator[]` + implementation that could lead to spurious errors. intel/llvm#13596 +- Fixed a bug where a barrier submitted into a command queue with host tasks + could ignore them. intel/llvm#13094 +- Fixed a compilation issue occurring when `printf` is used on CUDA backend + on Windows.
intel/llvm#13784 + TODO: was it really a compilation issue? +- Fixed an issue where compiler could emit SPIR-V instructions for reversing + bits in a variable which are not supported by device compilers. intel/llvm#13810 +- Fixed a bug where having a default-constructed `local_accessor` as an argument + could lead to runtime errors reported about being unable to set kernel + argument. The issue manifested itself on Windows and under `-O0` optimization + level on Linux as well. intel/llvm#13382 +- Fixed a hang when invalid values were passed into `ONEAPI_DEVICE_SELECTOR`. intel/llvm#13367 +- Fixed shuffles over non-uniform groups on CUDA backend. intel/llvm#13230 +- Fixed an issue with persistent cache where under certain circumstances (like + cache directory being located on a network drive on Windows) SYCL RT would + fail to create necessary directories for the cache to work. intel/llvm#13019 +- Fixed a bug where querying a kernel by name from a kernel bundle would crash + a program. intel/llvm#13155 + TODO: ask @cperkinsintel for feedback about the wording here. +- Fixed an error handling bug where non-blocking `pipe` operations would lead + to exceptions being mistakenly thrown. intel/llvm#13166 +- Fixed compilation issues happening when non-uniform group built-ins were used + with `marray` and `vec`. intel/llvm#14364 +- Fixed a bug where memory attributes applied to a struct that is used as a type + of a `device_global` variable would be silently dropped and ignored. + intel/llvm#14414 +- Added missing `value_type` and `vector_t` member type aliases to swizzles. intel/llvm#13040 +- Fixed shutdown sequence issues when SYCL RT is used from an application or + library that has its own shutdown sequence using global destructors. intel/llvm#14153 + +commit c2a4054980fd4ce4c4d6cfa425cbc71b20d5f450 + [SYCL] Fix barrier with wait list handling (#13863) + is it just a bugfix for intel/llvm#13094? 
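For the `windows.h` `min`/`max` macro conflict listed among the SYCL Library fixes above, the following is a minimal sketch of one common way to keep code robust against such macros. It is an illustration only, not a description of the actual change in intel/llvm#14260: the two `#define`s are stand-ins for the macros that `windows.h` provides when `NOMINMAX` is not set, and `clamp_to_range` is a hypothetical helper that is not part of SYCL.

```cpp
#include <algorithm>

// Stand-ins (assumption) for the function-like macros that <windows.h>
// defines when NOMINMAX is not set, so the sketch builds on any platform.
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Hypothetical helper, not part of SYCL: clamps value into [lo, hi].
template <typename T>
constexpr T clamp_to_range(T value, T lo, T hi) {
  // Wrapping the function name in parentheses keeps the preprocessor from
  // seeing `min(` / `max(`, so the std:: functions are called instead of
  // the macros.
  return (std::max)(lo, (std::min)(value, hi));
}

// Remove the stand-in macros so that later includes are unaffected.
#undef min
#undef max
```

With these definitions, `clamp_to_range(5, 0, 3)` evaluates to `3`. In user code, defining `NOMINMAX` before including `windows.h` is another widely used option.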
+commit 11046e7d8bf07afebd79f30a528c3cbe5493d8ed + [SYCL] Fix queue fields cleanup for barrier vs host task deps (#14268) + looks like bugfix for intel/llvm#13094 + +commit 1d24713c299aa16113f390c87d4444af5b83a586 + [SYCL] Fix ONEAPI_DEVICE_SELECTOR handling of discard filters. (#13927) + What was happening before this patch? + +commit 775dccb43494b1d38fb84de728446053b11bd05a + [SYCL] Allow empty and unsupported case for component_devices (#13931) + This is later modified in another commit, so those two should be squashed + +commit 33325d4af0b66c33f7a42f0bf584645972a738a8 + [SYCL] Fix enqueue functions taking both kernel and properties (#14743) + +### Documentation + +- Actualized default value for `SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR` debugging + environment variable (it is now set to enabled by default). intel/llvm#12088 +- Fixed link to range rounding + [implementation design document](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/ParallelForRangeRounding.md) + from [environment variables](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/EnvironmentVariables.md) + documentation. +- Corrected installation steps for CPU/FPGA low-level runtimes in + [Get Started Guide](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/GetStartedGuide.md). intel/llvm#14204 + + +### SYCLcompat + +commit 29230c80d117e29ba113ac8522b6fd2946ac56f9 + [SYCL][COMPAT] Fix using address of a temporary queue_ptr in util.hpp (#14440) + Was it user-visible? 
+ +commit 510965a0a098313cc19e8a68cc405098dc9e9501 + [SYCL][COMPAT] fixed byte-dot products to properly call cuda intrinsics (#14463) + +commit 3caa78ecf53644ead4f1d5fa8bc7b4a81a1f4961 + [SYCL][COMPAT] Fixes SYCLCOMPAT_PROFILING_ENABLED codepath (#14574) + +commit 7b538cdc4ecb33e88682eb1b36be33b73ac68caf + [SYCL][COMPAT] fixed atomic_compare_exchange_strong not using addressSpace template parameter (#13821) + +commit d66b0baed24483da96fa135082e7c544498ce2d9 + [SYCL][COMPAT] Add inline in max and min functions (#13708) + +commit e40283b1234e0846d1a19be537948e865a31f360 + Task sequence revert (#14359) + This reverts PR #12453 and #13080 + not sure which section this should go into + +commit 93a0ec4c465ceff1bed641422e23f13ca6b8a7cd + [SYCL] all_props_are_keys_of fix (#14433) + +commit a14c0917ad741a3a27b50040e4589b56262462bc + [SYCL][Bindless] Update spirv read/fetch from sampled image and sampled image array (#14493) + +commit c1ee064428a2d4038021dc3284a4c2f3aa897cb8 + [SYCL][Bindless] Fix OpaqueFD/Win32Handle's scope in piextImportExternalMemory/Semaphore (#14266) + +commit 493e78be6020ef436634b21d93069467fa6c69e7 + [SYCL][Graph] Fix PI Kernel leak in graph update (#14029) + +commit 9ec73a21782de1d11d08e97d63a27fa8b208c1e5 + [SYCL] Add work_group_num_dim metadata (#13600) + Fixes reqd_work_group_size for HIP + +commit 14ee7e1cca79cac97ecc41ddc15d5d724011c89a + [SYCL][Bindless][Exp] Remove unneeded function argument causing memory leak in image create functions (#13364) + +commit 4b993a7b32f7743980bce646765a1b427b0996b6 + Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) + revert commit seems to be a part of a previous release + +commit ea7ba1b965302277fc23ef48dba83b10e6c734e9 + [ESIMD] Restore the lowering of lsc_load_stateless in sycl-post-link (#13104) + +commit 2053be298d1bf2417ad0b2efaf0d9360650ed491 + [SYCL][COMPAT] Reverted nd_barrier atomic_ref to acq_rel (NVPTX) (#13641) + do we need to mention this?
do we need to drop some other item? + +commit 267a03cd1ba5eaa55db95800712f978b93842bc5 + [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) + ??? +commit 5794326b965071a69273a1f653405670b728e66b + [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) + ??? + +commit 014004cf0f7cc21195a4a0ed4f16a003ecb7be72 + [SYCL] event() fail fast (#13419) + What was the problem with this? + +commit bc9e30eff79a091bb3db3fc1a005009049734798 + [SYCL] Use 32-bit integers where it's appropriate for matrix instructions (#12867) + do we even need to mention this? + +commit 0fde69dbfa18e0c9b477a916477297a832e194a3 + [SYCL] Do not enable SPV_KHR_bit_instructions until downstream tools are ready (#13044) + Perhaps it can be fully omitted, because it may have been "reverted" later + +commit f170c63ed329c1fa5271d67e68144ec5d7808079 + [SYCL] Fix kernel shortcut path for inorder queue (#13333) + could be related to a commit made post-March release, i.e. it can probably be squashed with some other line + +commit 5332773b17efbf10e1b72cd633c1d7e2b4f75125 + [SYCL][ESIMD] atomic_update with data size less than 4 bytes should use LSC atomics (#13340) + +commit 4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 + [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) +commit 13a80f87098f4e6db25b75b46736ebd967110953 + [DeviceSanitizer] Strip off pointer casts and inbounds GEPs (#13262) + all device sanitizer PRs can probably be merged into a single line +commit 4723efc481cc18160cfa2f76d89378a84c43df64 + [SYCL][DeviceSanitizer] Checking "sycl::free" related errors (#12882) +commit 247e5e0a68b25af8d0f76855d231b9e5045b9c9a + [SYCL][DeviceSanitizer] Checking out-of-bounds error on sycl::local_accessor (#13503) +commit 7b4fbac8f29faa533b55c80bf4adbc51f5afe833 + [DeviceSanitizer] Support multiple error reports (-fsanitize-recover=address) (#13948) +commit 2a5f9137ca2a6c1004a059ed95d3bfd79cf3ad41 + [DeviceSanitizer] Support detecting 
misaligned access error (#14148) +commit a0cc14f9ff5aad889b31534a27e4a39d5b2c25c2 + [DeviceSanitizer]Change ASan shadow scale from 3 to 4 (#13857) + +commit 0939f39818225ce3e469e08f6a45711b449a8ad4 + [SYCL] Align assert ext name with libdevice implementation (#13312) + can likely be omitted + +commit 5ab1f762821abf36412b2b8d0e529285553fa472 + [SYCL][ESIMD] Fix simd_view template argument and add nested simd_view tests (#13231) + +commit 3c7f99d891cdd7c929b38b18bf6877c3c8dba163 + [SYCL][Graph] Fix potential issue with command buffer commands (#13224) + +commit c4c456bd74945b0ea2faf9ca54b28bb02f36cd49 + [SYCL] fix for kernel_compiler (#13214) +commit bcf7d4df6acf33a75c195215afad78113d14ae2d + [SYCL] kernel_compiler opencl query fix (#13448) + +commit d6340b67391cd8e9e4c7775a3c1ada8f2755bb06 + [SYCL][Graph] in-order queue barrier fix (#13193) + +commit fffe9a10d1d65d97302fd0ec88ce015ab625033d + [clang][FE][Cuda] Fix a sm90a cuda arch define check in TargetInfo (#12885) + +commit 0e892be1316ccf019688e420eaa770ec4a4a30fa + [SYCL][COMPAT] Specify proper namespace for abs with sycl::complex (#13518) + +commit 628ede6edf2448c531bea7f818dc6819d9e7393f + [libclc] Fix UB in double->int conversions (#13546) + +commit 13a06d8c6bb468165fbdd2a2fc24dc79d6110b4f + [ESIMD] Use 1-element mask for load_2d()/store_2d()/prefetch_2d() (#13613) + +commit 646db9cdb1899bbfefbbcb77d6ea256c4e9789c0 + [SYCL][Matrix] Fix checked matrix instructions (#13287) + If I understand correctly, that is a non-functional change. @MrSidims?
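The title of the `[libclc] Fix UB in double->int conversions (#13546)` entry above points at a general C++ hazard: converting a floating-point value to an integer type is undefined behavior when the truncated value does not fit in the destination type. The sketch below shows one generic way to guard such a conversion by saturating. It is not the libclc change itself; `saturating_to_int32` is a hypothetical helper, and mapping NaN to `0` is an assumption made for illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from libclc): converts a double to int32_t
// without ever performing an out-of-range conversion.
inline std::int32_t saturating_to_int32(double x) {
  // Both int32_t limits are exactly representable as double.
  constexpr double lo = std::numeric_limits<std::int32_t>::min();
  constexpr double hi = std::numeric_limits<std::int32_t>::max();

  if (std::isnan(x))
    return 0;  // assumption: NaN maps to 0
  if (x <= lo)
    return std::numeric_limits<std::int32_t>::min();
  if (x >= hi)
    return std::numeric_limits<std::int32_t>::max();

  // In range here, so the cast is well defined and truncates toward zero.
  return static_cast<std::int32_t>(x);
}
```

For example, `saturating_to_int32(1e20)` returns the largest `int32_t` value instead of invoking undefined behavior, while in-range inputs such as `3.9` still truncate to `3`.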
+ +commit 6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a + [SYCL] Fix Level-Zero's `sycl::make_device` interop (#13483) + +commit 64cb0cf96de28bfd495e577b4dd46c26dbb6b197 + [SYCL][Graph] Fix minor issues in graph update code (#13660) + +commit b4e0450207b5a85d5b985de0c0ff6fecdfebf0da + [SYCL][Graph] Export missing graph node symbols (#13744) + +commit 6934bcfb13415dc5bda85876b5cfc361678523f4 + [SYCL] Do not attach reqd_work_group_size info when multiple are detected (#13523) + +commit f90554f8ec4a9e58a9e96865afed98deb9615ef4 + [SYCL][Docs] Fix variadic properties ctor (#13676) + +commit 601f12103f2cb3bed8c61a9b25122144bf9a663c + [clang][FE] Remove duplicate preprocessor defines of HIP memory scope (#12871) + +commit 563904b2aebb791adf0e1ad955a43e226c9a6caf + [SYCL] Add aspect names to sycl_used_aspects before cleaning up (#13486) + part of optional kernel features AOT? + +commit 8eff95ca51963dc6b4ec629da0dfaf134239cefc + [SYCL] Fix FloatVecToBF16Vec build (#14161) + Is it a user-visible fix? + +commit 82f77d10dd092ea419115f61a7715655f055b7bb + [SYCL][Graph] Fix queue recording barrier to different graphs (#14212) + +commit e22cb798f8363f8e2a95a7e6df9a294b34c52fc4 + Fix Basic/image/srgba-read.cpp failure under SYCL_PREFER_UR with ONEAPI_DEVICE_SELECTOR=opencl:cpu (#14233) + +commit 1b5c5a8e96502b196c91251fa6513a6ede1257f5 + [SYCL] Fix SYCL_EXTERNAL device code when linking with a static lib (#14256) + +commit d77a348776672316f59c59dc3b11ebf5aa79f936 + [SYCL][NVPTX] Emit 'grid_constant' annotations for by-val kernel params (#14332) + +commit 5ad97902643da043233ec21ac203cca329df07b2 + [SYCL][Graph] Fix profiling info when bypassing scheduler (#14678) + +commit daaece06ce68544eaae078899c559f571297d8c0 + [SYCL][Graph] Fix access modes not being respected (#13011) + +commit c63b49ddfacf2f17135663a320e15e93be2971aa + [Driver][SYCL] Address issue with improper bundler call with -fsycl-link (#13002) + +commit 6e9a3dd987ce6f1c7384623713cea14f084cab9d + [SYCL] Fix 
'ignore-device-selectors' sycl-ls CLI option on windows (#13047) + +commit b13a3c4c39a356c47cda983350f06000330a42f1 + [libclc][hip] Fix half shuffles and reenable reduction test (#13016) + +commit 0360e6af2a353210d508633a60ff02327094f7e7 + [SYCL] Follow up fixes for group_sort extension (#14591) + +commit d39563ad1faa1d503c0396a137afd6664756b358 + [SYCL][Clang] Fix address space for virtual table support (#13629) + +## API/ABI Breaking Changes + +This release is an *ABI* breaking release, meaning that any applications which +were built using older versions of the toolchain have to be recompiled in order +to be launched using newer versions of SYCL runtime library. + +- Bumped major version of SYCL runtime library to `8`. intel/llvm#13097 +- Cleaned up list of symbols exported from SYCL runtime library: dropped some + legacy symbols, hid some symbols which shouldn't have been exported in the + first place, etc. + intel/llvm#14638 intel/llvm#14626 intel/llvm#14624 intel/llvm#14615 + intel/llvm#14585 intel/llvm#14616 intel/llvm#14494 intel/llvm#14460 + intel/llvm#14368 intel/llvm#13493 intel/llvm#13191 intel/llvm#14549 + intel/llvm#13597 intel/llvm#13271 +- Updated ABI of several functions/methods to avoid using `std::string` and + some other objects in library interface. This should allow using SYCL RT + in applications which were built with pre-C++11 ABI. + intel/llvm#13183 intel/llvm#13549 intel/llvm#13560 intel/llvm#13212 + intel/llvm#13213 intel/llvm#13447 + +Several API breaking changes were made as well, mostly completely dropping +support for previously deprecated APIs and in some cases switching implementations +of some classes to use their so-called preview implementations. + +- Removed `sycl::abs` overload taking floating-point argument. intel/llvm#13286 +- Removed `sycl::host_ptr` and `sycl::device_ptr`. intel/llvm#13240 +- Removed `queue::discard_or_return`. intel/llvm#14550 +- Removed `sycl::make_unique_ptr`.
intel/llvm#13232 +- Removed `use_primary_context` property. intel/llvm#13496 +- Removed methods and functions related to the previously removed host device, like + `platform::is_host`. intel/llvm#14258 +- Removed SYCL 1.2.1 exception subclasses: + - `runtime_error`, `nd_range_error`, `invalid_parameter_error`. intel/llvm#14546 + - `device_error`. intel/llvm#14486 + - `feature_not_supported`. intel/llvm#14423 +- Removed `queue::mem_advise` overload accepting `pi_mem_advice`. intel/llvm#14618 +- Removed a number of deprecated ESIMD APIs. intel/llvm#14415 +- Removed deprecated overloads of math built-ins accepting raw pointers. intel/llvm#13238 + Is it negated by the following commit? + - commit efed3bb04f3c43baf3373bf35d8924bbcf91f385 + [SYCL] Allow raw pointers in SYCL math builtins (#13893) +- Removed non-standard `sycl::id` -> `sycl::range` conversion operator. intel/llvm#13293 +- Removed deprecated APIs from + [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) + extension implementation. intel/llvm#14555 +- Renamed SYCLcompat function `async_free` to `enqueue_free`. intel/llvm#14015 +- Enforced restrictions on first argument of lambdas/functors passed to + `parallel_for(range)` and `parallel_for(nd_range)`. intel/llvm#13198 +- Switched `sycl::vec` implementation to use its preview version. intel/llvm#14317 intel/llvm#13182 +- Switched `sycl::exception` implementation to use its preview version. intel/llvm#14548 +- Switched math built-ins implementation to use their preview version. intel/llvm#13152 +- Switched `bfloat16` implementation to use its preview version. intel/llvm#13233 +- Switched `sycl::nd_item` implementation to use its preview version. intel/llvm#13197 +- Enforced restriction that `buffer`'s element type must be device copyable. intel/llvm#13200 +- Restructured SYCL headers so that `` and `` are not included + in there anymore.
intel/llvm#11528 +- Dropped support for `SYCL_DEVICE_FILTER` environment variable. intel/llvm#13192 +- Updated `accessor::get_pointer` interface to return `global_ptr` + which can be `const`-qualified if an accessor data type is `const`-qualified, + or if an accessor is read-only. intel/llvm#13443 +- Removed deprecated APIs related to `sycl_ext_oneapi_free_function_queries`. + intel/llvm#13257 +- Moved `slm_allocator` ESIMD APIs into `experimental` namespace. intel/llvm#13901 + +Breaking changes were also made to compiler flags: + +- Removed `-fsycl-link-huge-device-code` flag (it was deprecated in favor of + `-flink-huge-device-code`). intel/llvm#14731 +- Updated `-sycl-std` to disallow selection of SYCL 1.2.1. intel/llvm#14544 +- Removed deprecated `-fsycl-[add|link]-targets` flag. intel/llvm#13834 +- Removed deprecated `-foffload-static-lib` and `-foffload-whole-static-lib`. + Corresponding functionality is already available without the need to pass any + special options. intel/llvm#13835 +- Deprecated `-fsycl-disable-range-rounding` flag in favor of the new + `-fsycl-range-rounding`. intel/llvm#12715 + +commit 00b9b6d5db3de7257229f5c8b6aba4163a8f8977 + [SYCL][ESIMD][ABI Break] Remove predec atomic op (#14480) + Is it user visible? Is it an API break? + +commit 9457144dd784d786c8f7e994bcb804f123cfb587 + [ABI-Break][SYCL] Remove ESIMD emulator code from pi.cpp (#13234) + // is it really user-visible? 
+ +## Known Issues + +commit 33c0829f3e3389006662845784980b930faf3b38 +Author: Igor Chorążewicz +Date: Thu Jul 25 23:00:19 2024 -0700 + + [UR] Bump UR version and enable dynamic linking with UMF (#13343) + + Testing PR for: https://github.com/oneapi-src/unified-runtime/pull/1430 + + --------- + + Co-authored-by: Krzysztof Swiecicki + Co-authored-by: Steffen Larsen + +commit 450683b6fa1d1be1b9391905f43073b7a9555aa1 +Author: Yang Zhao +Date: Thu Jul 25 00:02:46 2024 +0800 + + [SYCL][DeviceSanitizer] Support GPU DG2 Device (#13450) + + UR: https://github.com/oneapi-src/unified-runtime/pull/1521 + + - Add MemToShadow_DG2 + - Enable lit tests for GPU, decrease the global workgoup size in some + tests due to the limit of GPU memory + + Although, the "_DG2" suffix might be misleading: DG2 present all 48bits + virtual address devices, and PVC present all 58bits virtual address + devices. + + --------- + + Co-authored-by: Wenju He + Co-authored-by: Kenneth Benzie (Benie) + +commit bd97280007ee79bf118fdbade3d9cb14721b9014 +Author: aarongreig +Date: Wed Jul 24 07:20:14 2024 +0100 + + [UR] Bump main tag to 9b209642 (#14553) + + * https://github.com/oneapi-src/unified-runtime/pull/1791 + * https://github.com/oneapi-src/unified-runtime/pull/1856 + * https://github.com/oneapi-src/unified-runtime/pull/1861 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 5667218ed6be6dee8877efcc2fbcfc2ecd515cff +Author: Kenneth Benzie (Benie) +Date: Tue Jul 16 18:05:26 2024 +0100 + + [UR] Bump main tag to 7e38af77 (#14552) + + * https://github.com/oneapi-src/unified-runtime/pull/1826 + * https://github.com/oneapi-src/unified-runtime/pull/1852 + * https://github.com/oneapi-src/unified-runtime/pull/1849 + * https://github.com/oneapi-src/unified-runtime/pull/1828 + * https://github.com/oneapi-src/unified-runtime/pull/1772 + * https://github.com/oneapi-src/unified-runtime/pull/1862 + +commit 44861fec406fff7a20bd4791c4288d71828912cc +Author: Callum Fare +Date: Thu Jul 11 15:15:12 2024 
+0100 + + [UR] Bump UR and implement changes to bindless image handle types (#14516) + + https://github.com/oneapi-src/unified-runtime/pull/1829 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit c30769b122d99eb4d05bcb78f15e593491fe31ae +Author: Neil R. Spruit +Date: Wed Jul 10 21:58:04 2024 -0700 + + [UR][L0] Use Intel Level Zero Driver Version String extension (#14426) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1816 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit 8ddd7291219256f9bcb78328cc85322037736171 +Author: Ross Brunton +Date: Wed Jul 10 15:12:23 2024 +0100 + + [UR] Update to new urProgramLink interface (#13085) + + Pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1458 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 13ae57f97cfb45cbcee8db6155ac8b0f7b7fbb82 +Author: Kenneth Benzie (Benie) +Date: Wed Jul 10 10:53:12 2024 +0100 + + [UR] Bump main tag to 9d3bce6a (#14499) + + https://github.com/oneapi-src/unified-runtime/pull/1822 + +commit db4d83e3969a5f7b5313aa5fb8466dd2ebbf9283 +Author: Neil R. Spruit +Date: Tue Jul 9 06:56:01 2024 -0700 + + [UR][L0] Fix Queue get info and fix Queue release decrement (#14411) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1814 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit 78ae397aab9b2040be945ee2f7f73d93404ffa06 +Author: Artur Gainullin +Date: Tue Jul 9 02:37:27 2024 -0700 + + [UR] Uplift UR tag to the fix in the L0 adapter regarding event timestamps (#14360) + + UR PR: https://github.com/oneapi-src/unified-runtime/pull/1806 + +commit ac556f9273e479c033e7dc76248fdb6861377ce7 +Author: Fábio +Date: Mon Jul 8 16:23:41 2024 +0100 + + [UR] Update main tag for L0 CommandBuffer refactor (#14240) + + Co-authored-by: Kenneth Benzie (Benie) + +commit eb03091539daa68a582ceab950379ca482e118d9 +Author: Neil R. 
Spruit +Date: Mon Jul 8 05:50:54 2024 -0700 + + [UR][L0] Fix Device Info return code to report unsupported enumeration (#14407) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1809 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit 577c349c5f3b1c893160de2470aff5ee3f87f0bc +Author: Neil R. Spruit +Date: Fri Jul 5 04:30:49 2024 -0700 + + [UR][L0] Fix immediate command list use in Command Queues (#14341) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1802 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + Co-authored-by: Aaron Greig + +commit f2bd076eb55a2cc79de2e9d4748967ed3cb13c9b +Author: Wu Yingcong +Date: Thu Jun 27 02:26:23 2024 -0700 + + [UR] fix use-after-free problems (#13855) + + UR PR: https://github.com/oneapi-src/unified-runtime/pull/1637 + + --------- + + Co-authored-by: Callum Fare + +commit c6428bee93a01009291ee704dca9db6262045aed +Author: Neil R. Spruit +Date: Tue Jun 25 07:03:05 2024 -0700 + + [UR][L0] Fix Handle used in calls to L0 Driver zex apis given multi d… (#14250) + + …rivers + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1778 + + Signed-off-by: Neil R. 
Spruit + +commit 088a9475e7c5f39ecb2b74f79a479380c9dd64be +Author: aarongreig +Date: Fri Jun 21 13:52:08 2024 +0100 + + [UR] Pull in changes from UR PR #805 (#12270) + + https://github.com/oneapi-src/unified-runtime/pull/805 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 350b56fda217ffc4677c5a3443a7844e13ac209d +Author: Hugh Delaney +Date: Fri Jun 21 10:30:11 2024 +0100 + + [UR] Update main tag to 975313cb (#14225) + + https://github.com/oneapi-src/unified-runtime/pull/1774 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit ab77ba800e6b36d0217dea053d125435f0a0b2db +Author: Kenneth Benzie (Benie) +Date: Tue Jun 18 17:51:02 2024 +0100 + + [UR] Bump L0 tag to 2a31795d (#14213) + + https://github.com/oneapi-src/unified-runtime/pull/1623 + +commit 174f7510328f49d6f24c578b226acea085489082 +Author: Steffen Larsen +Date: Mon Jun 17 15:01:45 2024 +0200 + + [UR] Bump main tag to 33eb5ea8 (#13950) + + Pull in changes from + https://github.com/oneapi-src/unified-runtime/pull/1678. + + --------- + + Signed-off-by: Larsen, Steffen + Co-authored-by: Kenneth Benzie (Benie) + + +commit 5e9e7a73ce11182af6ceafc1e91996b6c79f7180 +Author: aarongreig +Date: Mon Jun 17 10:30:32 2024 +0100 + + [UR] Pull in change to make urPlatformCreateWithNativeHandle take an adapter. (#14012) + + Co-authored-by: Kenneth Benzie (Benie) + +commit 579484f0ae9e5e30b9c9bd468799e1688d5de890 +Author: Neil R. Spruit +Date: Fri Jun 14 05:45:42 2024 -0700 + + [UR][L0] Maintain Lock of Queue while syncing the Last Command Event (#14150) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1749 + + --------- + + Signed-off-by: Neil R. 
Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit ae79b95cc07ab68fcf706d47851b93e5b299dc87 +Author: Hugh Delaney +Date: Wed Jun 12 16:46:31 2024 +0100 + + [UR] Bump main tag to b13c5e1f (#14042) + + https://github.com/oneapi-src/unified-runtime/pull/1711 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 1a885ecacc468ab324c812ab47b4af7f3b086e52 +Author: Artur Gainullin +Date: Wed Jun 12 07:30:29 2024 -0700 + + [UR] Update UR tag to include L0 loader related changes (#14109) + + Co-authored-by: Kenneth Benzie (Benie) + +commit 7c530e154021d103259c8437233e7ba13ce98146 +Author: aarongreig +Date: Wed Jun 12 13:15:51 2024 +0100 + + [UR] Bump main tag to 78d02039 (#12269) + + https://github.com/oneapi-src/unified-runtime/pull/1128 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit c168f213381645e695eaed3500a7ba7bcc655321 +Author: Andrey Alekseenko +Date: Wed Jun 12 07:01:54 2024 +0200 + + [UR] Fix size confusion for several device property queries (#12488) + + For testing https://github.com/oneapi-src/unified-runtime/pull/1282 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit c41a0562e9a4a9ace16373b819d15a38ec467c4e +Author: Omar Ahmed +Date: Mon Jun 10 15:37:57 2024 +0100 + + [UR] Remove redundant mem type (#13058) + + Testing PR for [UR + PR](https://github.com/oneapi-src/unified-runtime/pull/1409) + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 3935e06bc2e3794b7eac715c069e28c30aeaee9c +Author: Ewan Crawford +Date: Mon Jun 10 12:46:11 2024 +0100 + + [SYCL][Graph] Combined L0 Graph Update fixes (#14111) + + Bumps the L0 adapter UR commit to include several merged fixes to the L0 + adapter for implementing the SYCL-Graph update feature: + + * [Use fence rather than event for sync in L0 command-buffer + update](https://github.com/oneapi-src/unified-runtime/pull/1629) + * [Fix lifetime of pointer used in L0 + update](https://github.com/oneapi-src/unified-runtime/pull/1721) + * [Fix L0 Event leak 
without return sync + point](https://github.com/oneapi-src/unified-runtime/pull/1706) + +commit fcfe36b705fa715b4813de95565bbba9a5b88223 +Author: Kenneth Benzie (Benie) +Date: Mon Jun 10 10:16:54 2024 +0100 + + [UR] Bump main tag to f06bc02a (#14047) + + Includes the following: + + * https://github.com/oneapi-src/unified-runtime/pull/1653 + * https://github.com/oneapi-src/unified-runtime/pull/1568 + * https://github.com/oneapi-src/unified-runtime/pull/1634 + * https://github.com/oneapi-src/unified-runtime/pull/1669 + +commit 0cec12826baea60a15483081b0feece49013049f +Author: Kenneth Benzie (Benie) +Date: Wed Jun 5 11:20:25 2024 +0100 + + [UR] Bump HIP tag to 399430da (#14037) + +commit 2838f40382bedddbda0a5f20ebeeba86310044da +Author: Ewan Crawford +Date: Wed Jun 5 09:20:03 2024 +0100 + + [SYCL][Graph][L0] Correctly report when device supports update (#13987) + + Bump UR L0 commit to + https://github.com/oneapi-src/unified-runtime/pull/1694 so that the SYCL + device aspect for supporting update in graphs is correctly reported for + L0 devices. Currently, support can be incorrectly reported. 
+ + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 20991b1c2ee906148706aa1e7ae62c1084834799 +Author: Kenneth Benzie (Benie) +Date: Wed Jun 5 08:48:18 2024 +0100 + + [UR] Bump CUDA tag to 0e38fda0 (#14030) + +commit 18c4fb2c57f3b937451becda4ca25468397128f5 +Author: Pietro Ghiglio +Date: Tue Jun 4 18:37:40 2024 +0200 + + [SYCL] [NATIVECPU] Report correct memory order capabilities for Native CPU (#13469) + + Testing for https://github.com/oneapi-src/unified-runtime/pull/1527 + +commit 781b75abfd1dac36a2c68fbc13bd6f1bb845d35b +Author: Wu Yingcong +Date: Tue Jun 4 06:09:03 2024 -0700 + + [UR] Test for unified runtime PR (#12902) + + UR PR: https://github.com/oneapi-src/unified-runtime/pull/1385 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit f2a2de3b6e735ee4a54ecc212b648f370e47abbc +Author: Ewan Crawford +Date: Thu May 30 14:28:35 2024 +0100 + + [SYCL][Graph] Add debug logging for L0 Graph kernel update (#13892) + + Bumps UR to Level Zero adapter change from + https://github.com/oneapi-src/unified-runtime/pull/1654 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 1fa2ac88a1fb3a5eba0315c03faa03c2d8e3c5f7 +Author: Kenneth Benzie (Benie) +Date: Thu May 30 12:46:50 2024 +0100 + + [UR][HIP] Implement kernel set spec constant query (#13809) + + https://github.com/oneapi-src/unified-runtime/pull/1604 + +commit e147f3673c77e566a63a1d4d57d6f5da0153cbdb +Author: Konrad Kusiak +Date: Thu May 30 12:46:39 2024 +0100 + + [UR] Modify fill emulation to work for patterns which are not powers of 2 (#13779) + + https://github.com/oneapi-src/unified-runtime/pull/1603 + +commit 8086df575d7f622017521fcd2f8b2b90fdd49d39 +Author: Neil R. Spruit +Date: Thu May 30 02:57:41 2024 -0700 + + [UR][L0] Fix Multi Device Event Cache for shared Root Device (#13917) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1667 + + Signed-off-by: Neil R. 
Spruit + +commit 16e0670ab6e2425a20e13aec2c7f5896fd4eabfc +Author: Ross Brunton +Date: Fri May 24 14:25:29 2024 +0100 + + [UR][OpenCL] Bump UR OpenCL adapter for invalid kernel args (#13658) + + For UR merge request + https://github.com/oneapi-src/unified-runtime/pull/1501 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 7fa793bc7d17b9447ac0726bd01eb33680432d38 +Author: Kenneth Benzie (Benie) +Date: Fri May 24 13:33:08 2024 +0100 + + [UR] Bump L0 tag to e4287455 (#13910) + +commit f05c1c82d07a81050db4931eef6b8d02d359a325 +Author: Hugh Delaney +Date: Wed May 22 14:37:14 2024 +0100 + + [UR] CUDA multi device ctx (#13616) + + https://github.com/oneapi-src/unified-runtime/pull/1565 + + For UR multi device context, buffer interop is now deprecated since a + buffer refers to multiple device pointers. + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 5a09c6a15279484434df299d9164d94b96d3507a +Author: Kenneth Benzie (Benie) +Date: Wed May 22 10:42:06 2024 +0100 + + [UR][L0] Return device version based on DeviceIpVersion (#13812) + + https://github.com/oneapi-src/unified-runtime/pull/1401 + +commit ba19132218050c4791c5aa82316cc10e38986f75 +Author: Hugh Delaney +Date: Thu May 16 15:40:30 2024 +0100 + + [UR][HIP] Get Device From Queue (#13575) + + https://github.com/oneapi-src/unified-runtime/pull/1553 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 99d7097c10ae92805c6a100ffb544bdf0630c063 +Author: Neil R. Spruit +Date: Thu May 16 07:06:42 2024 -0700 + + [UR][L0] ensure a valid kernel handle for the device when reading max wg (#13797) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1611 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit 34292bbc89f71233ef687652c33c52b55a38839e +Author: Neil R. 
Spruit +Date: Wed May 15 07:43:11 2024 -0700 + + [UR][L0] Fix timestamp event evict after delete (#13717) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1592 + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit 9bf81044bfbe229b6846c96819a470e62065469a +Author: Ewan Crawford +Date: Wed May 15 15:05:06 2024 +0100 + + [CUDA][SYCL] Bump UR CUDA Tag (#13746) + + https://github.com/oneapi-src/unified-runtime/pull/1596 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit a5da94d1fb9a46f0a8334db500f26d30b62c1c02 +Author: Neil R. Spruit +Date: Fri May 10 02:20:00 2024 -0700 + + [UR][L0] Disable Usage of Driver In order Lists by default (#13715) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1591 + + Signed-off-by: Neil R. Spruit + +commit 8736efe32c7280335607b3d50f85692e29038097 +Author: Fábio +Date: Wed May 8 13:25:04 2024 +0100 + + [UR][EXP][Command-Buffer] Remove duplicated code from headers (#13374) + + UR PR: https://github.com/oneapi-src/unified-runtime/pull/1499 + + +commit c6be822ba3fbf1dc7c2f89805493400704ad89b5 +Author: Neil R. Spruit +Date: Tue May 7 02:47:56 2024 -0700 + + [UR][L0] Fix the repo tag for the L0 adapter to use the global variable (#13667) + + Signed-off-by: Neil R. Spruit + +commit dd183bf2a706571e29428a425d3a5f9bb6133a69 +Author: aarongreig +Date: Tue May 7 10:30:00 2024 +0100 + + [UR][L0] Pull in some minor fixes for L0 device queries. 
(#13424) + + UR PR https://github.com/oneapi-src/unified-runtime/pull/1513 + +commit 85037b20a9131400ce7cddac9c215adf563b6577 +Author: Kenneth Benzie (Benie) +Date: Fri May 3 19:22:38 2024 +0100 + + [UR] Bump L0 tag to fb342f06 (#13646) + + https://github.com/oneapi-src/unified-runtime/pull/1549 + +commit d1dddccded89ee1b34a120575726022ef8c97634 +Author: Piotr Balcer +Date: Fri May 3 12:57:22 2024 +0200 + + [UR][L0] fix queue locking behavior when creating event lists (#13564) + + Co-authored-by: Kenneth Benzie (Benie) + +commit 1a5595f8e43a12fc361c8868b04f265182259657 +Author: Neil R. Spruit +Date: Thu May 2 11:31:48 2024 -0700 + + [UR][L0] Enable Batching out of order commands without signal events (#13462) + + - pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1526 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit f34a65012c21192d6f90c10a893cffb35a250dff +Author: Konrad Kusiak +Date: Thu May 2 07:22:06 2024 +0100 + + [UR] CI for: Emulate Fill with copy when patternSize is not a power of 2 (#12912) + + https://github.com/oneapi-src/unified-runtime/pull/1412 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 4c7baa7aa553ce5a6f68eeb74851ece279efbd3d +Author: jinge90 +Date: Tue Apr 30 21:20:23 2024 +0800 + + [UR] Intercept urProgramLinkExp in ur ASAN layer (#13048) + + UR part: https://github.com/oneapi-src/unified-runtime/pull/1452 + + --------- + + Signed-off-by: jinge90 + Co-authored-by: Kenneth Benzie (Benie) + +commit 15c9c62bc171c849588fa58029f4c40dc142e80f +Author: Omar Ahmed <30423288+omarahmed1111@users.noreply.github.com> +Date: Tue Apr 30 11:04:50 2024 +0100 + + Testing add validation tests to getInfo tests (#12782) + + Testing PR for [UR + PR](https://github.com/oneapi-src/unified-runtime/pull/1346) + + After correcting hip program info returned for program device to return + context device rather than the binary associated device that have fixed + the kernel fusion 
cooperative kernels e2e test. + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 7cd48cbac6ddcb3e950748b76acd42812fab18bb +Author: Ben Tracy +Date: Mon Apr 29 08:37:48 2024 +0100 + + [SYCL][Graph] Bump UR commit for in-order L0 optimization (#13565) + + - Bumps commit only and includes minimal pi2ur changes for new + descriptor members + - In-order path not currently used, enable profiling by default (match + previous behaviour) + + UR PR: https://github.com/oneapi-src/unified-runtime/pull/1442 + +commit 95420f09ea81539d8c18fbb7d7406ec82947aeb5 +Author: Winston Zhang +Date: Fri Apr 26 06:56:11 2024 -0700 + + [UR][L0] Testing for counter-based-events implementation in URT draft (#12848) + + commit tag: 4134bfce72d33e89eebcad11186bdf00310bba83 + URT PR: https://github.com/oneapi-src/unified-runtime/pull/1370 + + --------- + + Signed-off-by: Zhang, Winston + Co-authored-by: Kenneth Benzie (Benie) + +commit fc94a16a8a97464f96ea07bf77600d6337c00f76 +Author: Neil R. Spruit +Date: Fri Apr 26 03:16:19 2024 -0700 + + [UR][L0] reset command lists on error unknown (#13522) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1539 + + --------- + + Signed-off-by: Neil R. 
Spruit + Co-authored-by: Kenneth Benzie (Benie) + +commit 719207dbac44ebb1bcae96eca992276171172120 +Author: Ewan Crawford +Date: Wed Apr 24 09:10:44 2024 +0100 + + [SYCL][Graph] Bump UR commit to OpenCL kernel update (#12724) + + Test the UR commit that enables updating kernel commands in a + command-buffer in the OpenCL adapter + https://github.com/oneapi-src/unified-runtime/pull/1358 + +commit 96b07cf9c3b8407194d0082b0b30170f4f232a39 +Author: Kenneth Benzie (Benie) +Date: Tue Apr 23 11:14:52 2024 +0100 + + [UR] Bump main tag to 31d0fe15 (#13511) + + * https://github.com/oneapi-src/unified-runtime/pull/1032 + * https://github.com/oneapi-src/unified-runtime/pull/1183 + * https://github.com/oneapi-src/unified-runtime/pull/1243 + +commit 723b7b7b043783f04b6b0ec2195971a5e95f216b +Author: aarongreig +Date: Fri Apr 19 22:15:55 2024 +0100 + + [UR][L0] Update main UR tag to 717791b (#13474) + + This pulls in fixes from: + https://github.com/oneapi-src/unified-runtime/pull/1298 + https://github.com/oneapi-src/unified-runtime/pull/1495 + https://github.com/oneapi-src/unified-runtime/pull/1517 + + Also remove now unnecessary XFAIL from Basic/kernel_max_wg_size.cpp + +commit 8cd2eb0ac2efc65cd109e0bfce02aedd69ce4cf2 +Author: Igor Chorążewicz +Date: Tue Apr 16 11:17:23 2024 -0700 + + [UR][L0] fix ze commands matching in level_zero_eager_init.cpp (#13277) + + If the test is run with UR_L0_LEAKS_DEBUG var set, UR will print ze call + count summary. This summary can cause the test to fail as it will + contain zeCommandQueueCreate, etc. + + Fix this by making CHECK-NOT only match output generated by UR_L0_DEBUG. 
+ +commit 9958a742ab498b89fb5c49ccbe94fe6f9a7a6bf6 +Author: Kenneth Benzie (Benie) +Date: Tue Apr 16 18:00:03 2024 +0100 + + [UR] Bump CUDA tag to 3f5f5688 (#13399) + + https://github.com/oneapi-src/unified-runtime/pull/1510 + +commit c959b5313c6c74f206609b526272206a4b144315 +Author: Hugh Delaney +Date: Tue Apr 16 07:56:06 2024 -0500 + + [UR] Bump HIP tag to 15233fd2 (#13020) + + https://github.com/oneapi-src/unified-runtime/pull/1437 + +commit 684cd90e22fe67d4a524be92c69e026cca262f1c +Author: aarongreig +Date: Tue Apr 16 13:17:43 2024 +0100 + + [UR][L0] Pull in a batch of L0 fixes (#13400) + + Pulls in fixes + https://github.com/oneapi-src/unified-runtime/pull/1492 + https://github.com/oneapi-src/unified-runtime/pull/1494 + https://github.com/oneapi-src/unified-runtime/pull/1507 + +commit e6d9d4c6bfabae78c29aa3b376e568974860a219 +Author: Kenneth Benzie (Benie) +Date: Tue Apr 16 10:29:06 2024 +0100 + + [UR] Bump CUDA tag to 1333d4a0 (#13398) + + https://github.com/oneapi-src/unified-runtime/pull/1342 + +commit a884a54914f9e9cf052591d70eb5cac20a25a210 +Author: Neil R. Spruit +Date: Mon Apr 15 01:58:14 2024 -0700 + + [UR][L0][Image] Set ZeImageDesc member of _ur_image in release build … (#13338) + + …for legacy image + + - reenable the image interop test with fix to image interop in release + builds + - precommit PR for + https://github.com/oneapi-src/unified-runtime/pull/1498 + + --------- + + Signed-off-by: Neil R. 
Spruit + Co-authored-by: Aaron Greig + +commit 71358f095be30b1cccd8c39a5ac2224fab9491b5 +Author: Kenneth Benzie (Benie) +Date: Mon Apr 15 09:49:02 2024 +0100 + + [UR] Bump CUDA tag to 68e525a4 (#13376) + + https://github.com/oneapi-src/unified-runtime/pull/1317 + +commit d7c5a9c6b2c9edb52f14adca5c84c3c3e3419d7b +Author: Konrad Kusiak +Date: Mon Apr 15 08:43:44 2024 +0100 + + [UR] [NATIVECPU] CI for: Extended usm fill to bigger patterns than 1 byte (#13263) + + https://github.com/oneapi-src/unified-runtime/pull/1489 + + Co-authored-by: Kenneth Benzie (Benie) + + +commit 16da4ec202cc818b9e79b75ecd9b7e301e3bea53 +Author: Konrad Kusiak +Date: Fri Apr 12 15:33:18 2024 +0100 + + [UR] Bump HIP tag to 1473ed8a (#12898) + + https://github.com/oneapi-src/unified-runtime/pull/1395 + +commit 6ba50805672c72654c8288d33960f36c09cc89bb +Author: Neil R. Spruit +Date: Fri Apr 12 02:07:48 2024 -0700 + + [UR][L0] Fix regular in order command list reuse given inorder queue (#13195) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1483 + + --------- + + Signed-off-by: Neil R. Spruit + Co-authored-by: Aaron Greig + +commit 1c89e51aa23fbd01eab1a2bba98ffc3598470e93 +Author: Ewan Crawford +Date: Fri Apr 12 10:05:29 2024 +0100 + + [UR] Bump HIP tag to 760eaa38 (#12758) + + Bump UR commit to include a bugfix for HIP UR adapter dereferencing a + nullptr https://github.com/oneapi-src/unified-runtime/pull/1357 + +commit e404d9984d1587ca130d267c342d10747bc09a1f + [SYCL][NATIVECPU] Threadpool implementation for Native CPU (#13176) + Native CPU backend improvement to be able to run work-groups in parallel? 
+ +commit 1d52f907d28edab7e23f69175a5b00d1bbe0acdc +Author: Fábio +Date: Wed Apr 10 17:56:05 2024 +0100 + + [UR] Bump CUDA tag to 6e76c98a (#12285) + +commit 7cf70ddd403d3262b51d0729cdc8a19e1bec7fab +Author: Kenneth Benzie (Benie) +Date: Wed Apr 10 17:43:57 2024 +0100 + + [UR] Bump HIP tag to 08b3e8fe (#13352) + +commit a14d0b548e96014c643b00927be128193781769c +Author: Kenneth Benzie (Benie) +Date: Wed Apr 10 16:06:52 2024 +0100 + + [UR] Bump Native CPU tag to e2b5b7fa (#13349) + +commit 60a5c90b5dc4736ff818586072b7c7a270ac40c1 +Author: Georgi Mirazchiyski +Date: Wed Apr 10 16:06:37 2024 +0100 + + [HIP][UR] Fix memory type detection in allocation info queries and USM copy2D (#13059) + + Test CI for https://github.com/oneapi-src/unified-runtime/pull/1455 + + --------- + + Co-authored-by: Aaron Greig + +commit e3b112bae042f3293d13dd64dc825809a4348dff +Author: Fábio +Date: Wed Apr 10 16:06:20 2024 +0100 + + [UR] Bump CUDA tag to cda0cd94 (#12287) + +commit 090323ea1c1007c12e184f8c990d6a45238529a0 +Author: Kenneth Benzie (Benie) +Date: Wed Apr 10 12:24:30 2024 +0100 + + [UR] Bump CUDA tag to 05b58992 (#13344) + +commit cb28e0941683b921583553d9c3c5f29add7e42c2 +Author: Kenneth Benzie (Benie) +Date: Tue Apr 9 17:47:00 2024 +0100 + + [UR][OpenCL] Revert urMemBufferCreate extension function lookup error (#13331) + + Revert https://github.com/oneapi-src/unified-runtime/pull/1448, pulls in + OpenCL adapter changes from + https://github.com/oneapi-src/unified-runtime/pull/1496. + +commit c74a14414b1ae8070421ee07b037bd8e9b1e704a +Author: Neil R. Spruit +Date: Mon Apr 8 01:54:49 2024 -0700 + + [UR][L0] Fix DeviceInfo global mem free to report unsupported given MemCount==0 (#13209) + + pre-commit PR for + https://github.com/oneapi-src/unified-runtime/pull/1486 + + Signed-off-by: Neil R. Spruit + +commit d86a50045bbbe488869991be49cbfe3213809d72 + [UR][CL] Atomic order memory capability for Intel FPGA driver (#13041) + Potentially user-visible fix. 
+ +commit 2e2010e2cc4acf1375cf88ce65d3a5cb8cbc9427 + [UR] Add DEVICE_NOT_AVAILABLE UR error code and PI translation for same. (#13206) + Does it fix any actual issues in some negative cases where we previously + reported a wrong error if device is not available? + +commit 3288a66d48d5aee7412ad12118794f28e6634550 +Author: aarongreig +Date: Mon Apr 1 10:04:11 2024 +0100 + + [UR] Pull in UR changes to add exec error status to events. (#13127) + + UR PR: https://github.com/oneapi-src/unified-runtime/pull/1467 + +commit 93a1abb42f352eff587cd1a081e90089c232339b +Author: Piotr Balcer +Date: Wed Mar 27 12:11:36 2024 +0100 + + [UR][L0] fix a deadlock on a recursive event rwlock (#13112) + +commit dd78c6e9c0dc6afc6fb5757fb88c4c5b0b0fe5b5 +Author: Raiyan Latif +Date: Fri Mar 22 09:35:09 2024 -0700 + + [UR][L0] Enable default support for L0 in-order lists (#13033) + + Signed-off-by: Raiyan Latif + Co-authored-by: Kenneth Benzie (Benie) + +commit 7c70e59db3ec813021beb970ebd21034586da53e +Author: Ewan Crawford +Date: Thu Mar 21 10:28:46 2024 +0000 + + [SYCL][Graph][HIP] Set minimum ROCm version for graphs (#13035) + + Tests UR PR https://github.com/oneapi-src/unified-runtime/pull/1447 that + only reports support for UR command-buffers on ROCm 5.5.1 and later to + work around HIP driver bugs related to HIP-Graph in earlier version. + + This requirement is also explicitly mentioned in the design doc.
+ +commit 43f096308b03fa4c5a7f6845461a133d6cfaceae +Author: Hugh Delaney +Date: Wed Mar 20 07:04:37 2024 +0000 + + [UR] CI for UR PR refactor-guess-local-worksize (#12663) + + https://github.com/oneapi-src/unified-runtime/pull/1326 + + --------- + + Co-authored-by: Kenneth Benzie (Benie) + +commit 1f9bf7a731b16d6d0d017c35245991ca95d0aef7 +Author: Artur Gainullin +Date: Tue Mar 19 14:47:58 2024 -0700 + + [SYCL][Graph][UR] Update UR to support updating kernel commands in command buffers for L0 (#12897) + +commit cf402b8473e9b3a4ee675a6154b80f0d54b198d1 + [UR][L0] Support for urUsmP2PPeerAccessGetInfoExp to query p2p access… (#12983) + Strictly speaking, this may have a visible effect for end users since some + of queries won't always return `false` anymore. + + # Mar'24 release notes Release notes for commit range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2).. [d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) From c69bc96ac884adfc2cd12d083f2570b4ee80db0d Mon Sep 17 00:00:00 2001 From: Dmitry Vodoypanov Date: Tue, 20 Aug 2024 08:42:46 -0700 Subject: [PATCH 02/30] Change commit hashes to links --- sycl/ReleaseNotes.md | 744 +++++++++++++++++++++---------------------- 1 file changed, 372 insertions(+), 372 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 506bd62d8ce6..9f6eaa7d4179 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -1,6 +1,6 @@ # Release notes Jul'24 -Release notes for commit range +Release notes for commit https://github.com/intel/llvm/commit/range [d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) ... 
[ebb3b4a21b3b0e](https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436) @@ -43,7 +43,7 @@ commit a8609a5925a3fcb2bd85636702556d15ae5574f4 [New offload][llc] Pass -relocation-model=pic option to llc when building shared libraries (#13687) commit 16007fa8be4292159f0b19e5fb911b90e3f84aa4 [SYCL][Device libs][New offload] Add missing fallback SYCL device library files (#13869) -commit 5ddc6881a4f5c2ee5f0ccbbd873e57a62bceb30d +commit /5ddc6881a4f5c2ee5f0ccbbd873e57a62bceb30d [Driver][SYCL][NewOffloadModel] Improve arch association for device (#13898) commit 7439fb46f1469cf401d89bf203f91ea22bc7ee57 [Driver][SYCL][NewOffloadModel] Hook up options for the offload-wrapper (#14001) @@ -120,11 +120,11 @@ commit 1c13e6f5e6bae9df42b483852c60631609422043 - Added `cmul_add` API. intel/llvm#12969 - Added experimental APIs for maksed operations over sub-groups (`select`, `shift`, etc.). intel/llvm#12972 -commit e0d020a74fee74a1fcda97b9a9854ad07bde4eae +commit https://github.com/intel/llvm/commit/e0d020a74fee74a1fcda97b9a9854ad07bde4eae [SYCL][COMPAT] Added utility helpers to simplify code translation (#12970) ??? -commit 4ade7b71db910a694e1da4d73495fd1903da1622 +commit https://github.com/intel/llvm/commit/4ade7b71db910a694e1da4d73495fd1903da1622 [SYCL][COMPAT] Added support for multiple math ops (#13005) ??? @@ -139,37 +139,37 @@ commit 4ade7b71db910a694e1da4d73495fd1903da1622 -commit 9876e19f4ff387b35b0c98c7d62e5f50e6de187d +commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de187d [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) bugfix? 
-commit 3800814750da51d6da852ce404bde91e1dbe02b8 +commit https://github.com/intel/llvm/commit/3800814750da51d6da852ce404bde91e1dbe02b8 [SYCL] Key/Value sorting with fixed-size private array input (#14399) -commit 7b3f21527abb904cb5c63e9ea32c7f0d65636436 +commit https://github.com/intel/llvm/commit/7b3f21527abb904cb5c63e9ea32c7f0d65636436 [SYCL] [ABI-Break] Partial implementation of sycl_ext_oneapi_cuda_cluster_group (#14113) -commit 8e3b8ce77f41d85687ae3bceedf5d1dc6e0e3155 +commit https://github.com/intel/llvm/commit/8e3b8ce77f41d85687ae3bceedf5d1dc6e0e3155 [SYCL] Add sorting APIs for fixed-size private array input (#14185) -commit bd97f283c9f982b89a3347754edf184a38762a4a +commit https://github.com/intel/llvm/commit/bd97f283c9f982b89a3347754edf184a38762a4a [Bindless][Exp] Windows & DX12 interop. Semaphore ops can take values. (#13860) -commit 3910d0c1393247313c8987b3f68a8d540d940673 +commit https://github.com/intel/llvm/commit/3910d0c1393247313c8987b3f68a8d540d940673 [SYCL] Add support for key/value sorting APIs (#13942) -commit 5e269c88bcfafd82719d1266a5b8a2bb7b90045d +commit https://github.com/intel/llvm/commit/5e269c88bcfafd82719d1266a5b8a2bb7b90045d [SYCL] Initial changes for the second version of sycl_ext_oneapi_group_sort extension (#13908) -commit 55b547e59a28c4c446a797bb8c51a83156609327 +commit https://github.com/intel/llvm/commit/55b547e59a28c4c446a797bb8c51a83156609327 [SYCL][ESIMD] Introduce load2d/store2d/prefetch2d API that accepts compile time properties (#13046) -commit d06724a7c304d393500b7edbb84f5c7e59f6b319 +commit https://github.com/intel/llvm/commit/d06724a7c304d393500b7edbb84f5c7e59f6b319 [SYCL][Graph] Specify API for explicit update using indices (#12486) -commit 2bc8b5bc8cbc44cf8ef1deb095c10450348904d8 +commit https://github.com/intel/llvm/commit/2bc8b5bc8cbc44cf8ef1deb095c10450348904d8 [SYCL][Graph] Implementation of explicit update with indices (#12840) -commit c8ae6c68943b9635cd9822f3c9ee7b5cc8d98acc +commit 
https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d98acc [ESIMD][NFC][DOC] Add load/store/prefetch_2d functions, L1/L2 hint combinations(#13218) ## Improvements @@ -240,13 +240,13 @@ commit c8ae6c68943b9635cd9822f3c9ee7b5cc8d98acc intel/llvm#13545 - Added check for template argument `N` of `media_block_load` ESIMD API. intel/llvm#13668 -commit c5b174d8507cad1328b3121e650120e85f1da213 +commit https://github.com/intel/llvm/commit/c5b174d8507cad1328b3121e650120e85f1da213 [SYCL] Implement latest version of sycl_ext_oneapi_free_function_queries (#13257) -commit 398aa20350aa38d76d9e95a8b76e3858c38faae5 +commit https://github.com/intel/llvm/commit/398aa20350aa38d76d9e95a8b76e3858c38faae5 [SYCL] Support shuffle algorithms for non-uniform groups (#12705) -commit ebb3b4a21b3b0e977f44434781729df7de83e436 +commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436 [SYCL] Remove plugin interface (#14145) @@ -273,450 +273,450 @@ commit ebb3b4a21b3b0e977f44434781729df7de83e436 to return new `unknown` enumerator if device architecture cannot be properly detected. 
intel/llvm#14077 -commit ffc0de03f900da2d0262ea8ec41ac3847a1edbcc +commit https://github.com/intel/llvm/commit/ffc0de03f900da2d0262ea8ec41ac3847a1edbcc [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) -commit 486b3dd1a2b2924e4445f1e36e5c341a09ba784f +commit https://github.com/intel/llvm/commit/486b3dd1a2b2924e4445f1e36e5c341a09ba784f [SYCL][Graph][Doc] Tidy of graph extension design doc (#13065) -commit 09c93842ffe51602e118504e4e3229d41b2a4fb2 +commit https://github.com/intel/llvm/commit/09c93842ffe51602e118504e4e3229d41b2a4fb2 [SYCL][Graph] Clarify graph enable_profiling property in finalize() (#14067) -commit ecd3b903f4ddb6b32892f03c326151faa9fa63e8 +commit https://github.com/intel/llvm/commit/ecd3b903f4ddb6b32892f03c326151faa9fa63e8 [SYCL][Joint Matrix Spec] Add new API for out of bounds fill/load/store (#11172) -commit f1e66f5f0f59b958ff352a558dbad8b42df63175 +commit https://github.com/intel/llvm/commit/f1e66f5f0f59b958ff352a558dbad8b42df63175 [ESIMD][NFC][DOC] Add fence to the ESIMD SPEC functions (#13135) -commit 1e2e6baaf86009f0f9067b1146a8ca7923436e60 +commit https://github.com/intel/llvm/commit/1e2e6baaf86009f0f9067b1146a8ca7923436e60 [SYCL][Bindless] Add image_mem_handle to image_mem_handle devices copies. (#12449) ### SYCLcompat - Added non-`const` `image2d_max` and `image3d_max` getters. intel/llvm#14138 -commit 17d2e2d8a483e7a4c33cd542a3c1381b767452bc +commit https://github.com/intel/llvm/commit/17d2e2d8a483e7a4c33cd542a3c1381b767452bc [SYCL][COMPAT] Add version & release process (#14457) -commit d8c0a9342a7e71420883c8a750f89679897b9ca1 +commit https://github.com/intel/llvm/commit/d8c0a9342a7e71420883c8a750f89679897b9ca1 [SYCL][COMPAT] Memory Header cleanup (#13143) is it really user-visible? -commit 89eeb02519cfee2f1d88ffac9f07dd131099b7dd +commit https://github.com/intel/llvm/commit/89eeb02519cfee2f1d88ffac9f07dd131099b7dd [SYCL][COMPAT] defs.hpp update with Windows macros. SYCLCOMPAT_CHECK_ERROR added. 
(#13027) -commit 0b05577790f2a81cb10a41262324ff9558614f09 +commit https://github.com/intel/llvm/commit/0b05577790f2a81cb10a41262324ff9558614f09 [SYCL[COMPAT][CUDA] Impl masked compat shuffles on cuda (#13363) -commit 13c9d0ef964b17dd3e2c297b1ceb2ecb8ea2ffe9 +commit https://github.com/intel/llvm/commit/13c9d0ef964b17dd3e2c297b1ceb2ecb8ea2ffe9 [SYCL][Bindless][Doc][ABI-Break] Rename external semaphore destroy to release (#14535) -commit fb561b9f336f8f9c286a1125631dedf1b5fb1e4b +commit https://github.com/intel/llvm/commit/fb561b9f336f8f9c286a1125631dedf1b5fb1e4b [SYCL][Bindless][Doc][ABI-Break] Add const qualifiers to copies (#14140) -commit 0eeae2ac96ea179099dd5d57c241260ccfe65f73 +commit https://github.com/intel/llvm/commit/0eeae2ac96ea179099dd5d57c241260ccfe65f73 [SYCL][Graph] Update design doc for copy optimization and add test (#13051) -commit 4acca904c0e07fd6b504f7938f539bc1a0e94ce0 +commit https://github.com/intel/llvm/commit/4acca904c0e07fd6b504f7938f539bc1a0e94ce0 [CLC][AMDGPU] Refactor fence helper to process order semantic explicitly (#12872) ??? -commit 13a7b3ad2f229099fe964016f591f17a66b0ea15 +commit https://github.com/intel/llvm/commit/13a7b3ad2f229099fe964016f591f17a66b0ea15 [SYCL] [libdevice] Add vector overloads of ConvertBFloat16ToFINTEL and ConvertFToBFloat16INTEL (#14085) -commit 0dcad16c36f27e6254e7b831faaad8c6e07f8cfb +commit https://github.com/intel/llvm/commit/0dcad16c36f27e6254e7b831faaad8c6e07f8cfb [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) bugfix?? -commit af65855fa6b6df0eded078bd3dbe3bf4a6a2b2e3 +commit https://github.com/intel/llvm/commit/af65855fa6b6df0eded078bd3dbe3bf4a6a2b2e3 [SYCL][ESIMD]Replace use of intrinsics with spirv functions (#13553) do we even need to mention this? 
-commit 990b1d1ba053d60a803ae5e750803ae6583119f9 +commit https://github.com/intel/llvm/commit/990b1d1ba053d60a803ae5e750803ae6583119f9 [ESIMD]Replace use of vc intrinsic with spirv extension for rdtsc API (#13536) -commit 1f1be9c642889b7c0fd045b073d411e544dc6007 +commit https://github.com/intel/llvm/commit/1f1be9c642889b7c0fd045b073d411e544dc6007 [SYCL][ESIMD] Move fmax to SPIR-V intrinsic (#14020) this one is also problematic -commit bcca7a80adf50b04c0991ef48745353ac7829016 +commit https://github.com/intel/llvm/commit/bcca7a80adf50b04c0991ef48745353ac7829016 [SYCL][ESIMD] Move a few math operations to SPIR-V intrinsics and support new functions (#13383) that is a regression, not an improvement :) should be noted in known issues -commit 1d2007ba7c661322584a60d84a40777e0e0d9567 +commit https://github.com/intel/llvm/commit/1d2007ba7c661322584a60d84a40777e0e0d9567 [SYCL][COMPAT] kernel_function and kernel_library constexpr constructors (#13932) if those APIs were added in this release, we should squash two items into one -commit 74602458d5583cf69ca575a9167def51dad15052 +commit https://github.com/intel/llvm/commit/74602458d5583cf69ca575a9167def51dad15052 [SYCL][Bindless] Replace 'image_channel_order' field in 'image_descriptor' with number of channels (#13745) -commit 83db85f1964338d9ce67bb536f8e6c5eebe8893b +commit https://github.com/intel/llvm/commit/83db85f1964338d9ce67bb536f8e6c5eebe8893b [SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision (#13753) -commit d2a5e8d095c0176957f5da2c5232d8966f8ff1bf +commit https://github.com/intel/llvm/commit/d2a5e8d095c0176957f5da2c5232d8966f8ff1bf [SYCL][Matrix] Add generation of spirv.CooperativeMatrixKHR type (#13645) internal improvement that can be ignored? 
-commit 82aaf27f6f0cf97ba89b58f88a18b09e23097afc +commit https://github.com/intel/llvm/commit/82aaf27f6f0cf97ba89b58f88a18b09e23097afc [SYCL][Driver] Refactor device config parsing to better match HIP and CUDA targets (#13617) -commit b11a19b1896cc2f7ab43735aacf265182e22832c +commit https://github.com/intel/llvm/commit/b11a19b1896cc2f7ab43735aacf265182e22832c [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) -commit 9d1cbc51854f19f89105d502db9156b11e4507f4 +commit https://github.com/intel/llvm/commit/9d1cbc51854f19f89105d502db9156b11e4507f4 [SYCL][COMPAT] nd_range barriers seq_cst by default in supported devices (#12974) -commit 3756fd1b778ae4ab36bd3988bfdf9ba910b779fd +commit https://github.com/intel/llvm/commit/3756fd1b778ae4ab36bd3988bfdf9ba910b779fd [ESIMD] Enable FADD/FSUB for slm_atomic_update (#13535) ??? -commit c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 +commit https://github.com/intel/llvm/commit/c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 [SYCL][Graph] Update begin_recording and end_recording (#13480) -commit d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 +commit https://github.com/intel/llvm/commit/d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 [SYCL][Graph][DOC] add an edge case for record&replay mode (#12916) -commit 89132855d4312536f5f40792194b6251d4cde819 +commit https://github.com/intel/llvm/commit/89132855d4312536f5f40792194b6251d4cde819 [SYCL][Joint Matrix] Add a new overload for joint_matrix_apply to be able to return result into a different matrix (#13151) -commit 8847c110c78684a86ec7e62d7255f1bb9c6efd4f +commit https://github.com/intel/llvm/commit/8847c110c78684a86ec7e62d7255f1bb9c6efd4f [SYCL][NATIVECPU][libclc]Mark opencl_c_generic_address_space as unsupported on Native CPU (#13109) -commit 07e3bcf9f3be46234deb471e25d94b5692353688 +commit https://github.com/intel/llvm/commit/07e3bcf9f3be46234deb471e25d94b5692353688 [SYCL][ESIMD] Use LSC for unsupported surface index block stores (#13150) -commit 
ed0619b4caa24af8e78053ecef2e5e808e0e2b08 +commit https://github.com/intel/llvm/commit/ed0619b4caa24af8e78053ecef2e5e808e0e2b08 [SYCL][Joint Matrix] Support 1x64x16 bf16 combination (#13391) -commit 03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 +commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) -commit 6cb77fcfb37ffb445ab62ea1545422dc52128da1 +commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc52128da1 [SYCL] Add -fPIC for Intel math function host code (#13800) -commit 434d5edfae78307969ade6764e5bafeb17ce5073 +commit https://github.com/intel/llvm/commit/434d5edfae78307969ade6764e5bafeb17ce5073 [SYCL] Remove redundant detail::empty_properties_t (#13777) -commit 84bae21d3f63f04ca50bfffc5203909ba3fd95a6 +commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd95a6 Implement missing overloads for generic AS in generic target (#13938) -commit da379ecfa649a520f49f8adfb97e73c72ff3fb06 +commit https://github.com/intel/llvm/commit/da379ecfa649a520f49f8adfb97e73c72ff3fb06 [SYCL] Add support for multiple missing math ops (#13714) -commit 0f6f57b43afa7ee442744b89e7a034673d58c8d8 +commit https://github.com/intel/llvm/commit/0f6f57b43afa7ee442744b89e7a034673d58c8d8 [SYCL][Doc] Fix typos and formating of SYCLCompat README (#13961) -commit 0d1dd2d2b1e8655b96940edecef84447866e87bc +commit https://github.com/intel/llvm/commit/0d1dd2d2b1e8655b96940edecef84447866e87bc [SYCL] Add a module flag for device compilations (#13880) -commit 29b4d855fa1a378e89182795e0d368304c40c3f6 +commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 [SYCL][CUDA] Enable support of msvc math functions for nvptx target. 
(#14007) -commit 9f1cee573782772f8d062f6490128c3ee6fa6911 +commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa6911 [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) -commit b49303c7e13ca0a69454eaaaeb8c3d094916218d +commit https://github.com/intel/llvm/commit/b49303c7e13ca0a69454eaaaeb8c3d094916218d [SYCL][COMPAT] Add Image Max dims to device_info. Updated Max ND Range Size (#13973) -commit db54535fb389331b167807a5d8f1ed16b5695474 +commit https://github.com/intel/llvm/commit/db54535fb389331b167807a5d8f1ed16b5695474 [AMDGPU][SYCL] Make unsafe atomic fadd opt in (#13955) -commit dce651bd69ea12c935c70990ed3290007a00c6c5 +commit https://github.com/intel/llvm/commit/dce651bd69ea12c935c70990ed3290007a00c6c5 [SYCL][COMPAT] Migrate bug fixes & refactor of get_*version APIs (#14011) -commit 4e36825beabb4b4a7435470ac633768dcbd7b376 +commit https://github.com/intel/llvm/commit/4e36825beabb4b4a7435470ac633768dcbd7b376 [SYCL] Record aspect names when computing device requirements (#13974) -commit a35f862445b5666c63469cda2656b0a9946df25c +commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c [SYCL][Graph] fix the address pointer in graph print (#13595) -commit f204869281570959af82fff638df6b34151718f4 +commit https://github.com/intel/llvm/commit/f204869281570959af82fff638df6b34151718f4 [SYCL] Add sm90a Cuda target architecture support (#14075) -commit c1b17e00f9b5c51db1f8385435d7a591224b01e0 +commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) -commit e7defabdcc3d5b460cfc593822156836b874f092 +commit https://github.com/intel/llvm/commit/e7defabdcc3d5b460cfc593822156836b874f092 [SYCL] Use `std::array` as storage for `sycl::vec` on device (#14130) -commit 0e24ac5677d8d91aed2fcc72d52d9d6b40f5985a -commit ea2111c1a022a1bd7a818ef9796d70d22f3b92d0 +commit 
https://github.com/intel/llvm/commit/0e24ac5677d8d91aed2fcc72d52d9d6b40f5985a +commit https://github.com/intel/llvm/commit/ea2111c1a022a1bd7a818ef9796d70d22f3b92d0 [SYCL] Re-implement diagnostics about virtual calls (#14141) -commit c2ebf84fd7ffcc8f40dd9eef2aed163437792cd5 +commit https://github.com/intel/llvm/commit/c2ebf84fd7ffcc8f40dd9eef2aed163437792cd5 [SYCL] Make `vec` conversion operator to scalar non-template (#14668) -commit 4240ef0d9db3577b057d27233c5393cc7f6b774e +commit https://github.com/intel/llvm/commit/4240ef0d9db3577b057d27233c5393cc7f6b774e [SYCL] Add check for valid SYCL triple for NVidia GPUs. (#14673) -commit 3fdfbfed1ed0062b9f3848a100093b340183c6a3 +commit https://github.com/intel/llvm/commit/3fdfbfed1ed0062b9f3848a100093b340183c6a3 [SYCL][NATIVECPU] Support reqd_work_group_size on Native CPU (#13175) -commit fe1859085b621ea901cd8da81659923122417688 +commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da81659923122417688 [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) related to above? 
-commit 9e4768ca9849e7188221c0e2894282730e3b1bde +commit https://github.com/intel/llvm/commit/9e4768ca9849e7188221c0e2894282730e3b1bde [SYCL][libclc] Add generic addrspace overloads of math builtins (#13015) -commit 183832b9cebd471586c0ed251876972939442327 +commit https://github.com/intel/llvm/commit/183832b9cebd471586c0ed251876972939442327 [SYCL][PI] Add PI_ERROR_UNSUPPORTED_FEATURE error code (#13036) -commit c1e2957be8db95425f1c17df258a0830c83dcf47 +commit https://github.com/intel/llvm/commit/c1e2957be8db95425f1c17df258a0830c83dcf47 [CUDA][LIBCLC] Implement RC11 seq_cst for PTX6.0 (#12516) -commit 73be194fc27cd20968c264afdb71befc181d51ec +commit https://github.com/intel/llvm/commit/73be194fc27cd20968c264afdb71befc181d51ec [SYCL] Add support for optional kernel features in AOT x86_64 compilation (#14590) -commit f51e43b2f0616934116626dc48c83282a84090ce +commit https://github.com/intel/llvm/commit/f51e43b2f0616934116626dc48c83282a84090ce [SYCL] Add more aspect information for intel_gpu_* in device config file (#14188) -commit 7a9d3b1e9483b69baa0b8c6f1097016efd52854c +commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) -commit d42d90e52d71c16739e26de353f69930cbe1f860 +commit https://github.com/intel/llvm/commit/d42d90e52d71c16739e26de353f69930cbe1f860 [SYCL] Change the ext_intel_device_info spec to throw a feature not supported error when a query is not supported (#14576) -commit 3561c9bb854d35eeb9fc4da3550334faaf316a4f +commit https://github.com/intel/llvm/commit/3561c9bb854d35eeb9fc4da3550334faaf316a4f [SYCL] Add support of more Intel GPU arch versions to sycl_ext_oneapi_device_architecture (#14582) -commit e51002c81cdf32f383104907cca820e4ed3452ba +commit https://github.com/intel/llvm/commit/e51002c81cdf32f383104907cca820e4ed3452ba [SYCL] Enable intel joint matrix on GNR. 
(#14436) -commit 21c2e1c2213171d12acb5e6c41a713db30a0d5d4 +commit https://github.com/intel/llvm/commit/21c2e1c2213171d12acb5e6c41a713db30a0d5d4 [SYCL] Make swizzle mutating operators const friends (#13012) -commit da02e023e60d89824aad440c4f7bb558e70501a4 +commit https://github.com/intel/llvm/commit/da02e023e60d89824aad440c4f7bb558e70501a4 [SYCL] Workaround for seg fault in `vec::convert<>` for OpenCL CPU at O0 (#14498) -commit 17ee3e24e2874690f7526dcda9d8bc4679fe7edc +commit https://github.com/intel/llvm/commit/17ee3e24e2874690f7526dcda9d8bc4679fe7edc [SYCL][NATIVECPU] Add device library and initial subgroup support (#13979) -commit 0b9fc099f63feadb5e476c5862de3d8fa977a655 +commit https://github.com/intel/llvm/commit/0b9fc099f63feadb5e476c5862de3d8fa977a655 [SYCL][Graph] Test WGU kernel mismatch (#14379) -commit 0ccb0b7d3dd614707f82ea8f99790e2d3b08496d +commit https://github.com/intel/llvm/commit/0ccb0b7d3dd614707f82ea8f99790e2d3b08496d [SYCL][ABI-Break] Improve Queue fill (#13788) -commit 005622d177c9a17dc9defefd507921daf7affc28 +commit https://github.com/intel/llvm/commit/005622d177c9a17dc9defefd507921daf7affc28 [SYCL][Doc] Update work-group-specific extension (#14271) -commit 93fef86cd4fb8e18c126365c404eea1ed0f1a7fa +commit https://github.com/intel/llvm/commit/93fef86cd4fb8e18c126365c404eea1ed0f1a7fa [SYCL][Graph] Permit empty & barrier nodes in WGU (#14236) -commit 02ac8a414c1fd9b209d139c100cd1bbeae3729d2 +commit https://github.com/intel/llvm/commit/02ac8a414c1fd9b209d139c100cd1bbeae3729d2 [SYCL][LIBCLC][NATIVECPU] Add checks for fp16 and fp64 in Native CPU libclc (#14242) -commit a25d27bc9fbb2925519e966b9e7043be04274b27 +commit https://github.com/intel/llvm/commit/a25d27bc9fbb2925519e966b9e7043be04274b27 [SYCL][NATIVECPU][LIBCLC] Implement missing builtins for half type (#13829) -commit 4151c799ef36f2912fab3f6b9e305240ef4ff327 +commit https://github.com/intel/llvm/commit/4151c799ef36f2912fab3f6b9e305240ef4ff327 [SYCL][Graph] Wait instead of flush dep events in 
update command (#14167) -commit 47a03418ac74f3a5492213afc192569eae1393ec +commit https://github.com/intel/llvm/commit/47a03418ac74f3a5492213afc192569eae1393ec [SYCL][LIBCLC][NATIVECPU] Add aarch64 target triple for Native CPU (#13911) -commit f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 +commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 [SYCL] Disable in-order queue barrier optimization while profiling (#14123) -commit e34b7fffedbe9ff73d41b172eb48c189170f99f9 +commit https://github.com/intel/llvm/commit/e34b7fffedbe9ff73d41b172eb48c189170f99f9 [Doc] Document Unified Runtime update process (#14097) -commit 2e1f14adb3bf6d9e9c55e4b0ced9e1ece2172a4a +commit https://github.com/intel/llvm/commit/2e1f14adb3bf6d9e9c55e4b0ced9e1ece2172a4a [SYCL] Fix UB and alignment issues in the SYCL default sorter (#13975) -commit 4222b4ccd6dc499248c8bf026bcdd0f207000b35 +commit https://github.com/intel/llvm/commit/4222b4ccd6dc499248c8bf026bcdd0f207000b35 [SYCL] Restrict `sycl::vec` and swizzle operations to types mentioned in the SPEC (#13947) -commit 5b6cc5eb7bb2106ff426815702d89569e166c4f9 +commit https://github.com/intel/llvm/commit/5b6cc5eb7bb2106ff426815702d89569e166c4f9 [SYCL][Matrix] Enable SPV_KHR_cooperative_matrix extension (#13923) -commit 03b994ead80bb381d59b1390f255119b8d211a1f +commit https://github.com/intel/llvm/commit/03b994ead80bb381d59b1390f255119b8d211a1f [SYCL] Add code location information to enqueue free functions (#13924) -commit 58382507f0c7bd8a5c21e3b7e1d3360f0835f26a +commit https://github.com/intel/llvm/commit/58382507f0c7bd8a5c21e3b7e1d3360f0835f26a [ESIMD]Add support for double data type to inv API (#13838) -commit 0ce40f46ef4e2f5e8eed75e28352a90c9b8ecbaf +commit https://github.com/intel/llvm/commit/0ce40f46ef4e2f5e8eed75e28352a90c9b8ecbaf [SYCL] [NATIVECPU] Implement generic atomic store for generic target (#13428) -commit ccca3b73769bfd8a27eff9956630fe86a2e4832d +commit 
https://github.com/intel/llvm/commit/ccca3b73769bfd8a27eff9956630fe86a2e4832d [SYCL] Optimize SG group_store via BlockWriteINTEL in simple cases (#13734) -commit 48a0ff5b4b5bc21dedab37380c4ac93676277f91 +commit https://github.com/intel/llvm/commit/48a0ff5b4b5bc21dedab37380c4ac93676277f91 [SYCL] Optimize SG group_load via BlockReadINTEL in simple cases (#13673) -commit 0a1381d286f7c32a256a6dab49917870769f1238 +commit https://github.com/intel/llvm/commit/0a1381d286f7c32a256a6dab49917870769f1238 [SYCL][Graph] Add wording about arbitrary C++ code in CGFs (#13699) -commit ece19f298b1029121da17a423b801bc2a9267a8d +commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a9267a8d [SYCL][Graph] Clarify graph in-order and out-of-order properties (#13681) -commit 67f3bf292ff58136adc6383a3c5a1b19779e4120 +commit https://github.com/intel/llvm/commit/67f3bf292ff58136adc6383a3c5a1b19779e4120 [SYCL][COMPAT] Removed sycl/sycl.hpp include (#13108) -commit 7d55eb8a8419dac64f065bbf84125ed1d78dc992 +commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) -commit 771ffa4e967f3058c500c87297c2d1a7be156a9b +commit https://github.com/intel/llvm/commit/771ffa4e967f3058c500c87297c2d1a7be156a9b [SYCL] Remove get_child_group() (#13482) -commit e1119d9d2753dc9165e10c2e8c11e222cc549ba9 +commit https://github.com/intel/llvm/commit/e1119d9d2753dc9165e10c2e8c11e222cc549ba9 [SYCL][ESIMD] Add more compile time checks to rdregion and wrregion API (#13158) -commit c5cf452d663b96479341daa182c7e305baf542aa +commit https://github.com/intel/llvm/commit/c5cf452d663b96479341daa182c7e305baf542aa [SYCL][libdevice] Add simple rand for ease of use in device (#13506) -commit a36e9f8969a5ad4346f84c925aa89e1a00128b7f +commit https://github.com/intel/llvm/commit/a36e9f8969a5ad4346f84c925aa89e1a00128b7f [SYCL] APIs cleanup (#13443) -commit ef6d2bb3caf36eaa1149369f8aee1578d6e31a6e +commit 
https://github.com/intel/llvm/commit/ef6d2bb3caf36eaa1149369f8aee1578d6e31a6e [SYCL][ESIMD] Add support for transposed prefetch for 1/2 byte elements (#13452) -commit 5a07640e1ce68584a60b1a0450526e928340d1e0 +commit https://github.com/intel/llvm/commit/5a07640e1ce68584a60b1a0450526e928340d1e0 [SYCL][ESIMD] Add native FMA function (#13366) -commit a4fdfdad53c3f1b2e423cdf5f5f0f977ce055593 +commit https://github.com/intel/llvm/commit/a4fdfdad53c3f1b2e423cdf5f5f0f977ce055593 [ESIMD][NFC][DOC] Fix misprints and punctuation in esimd-functions doc (#13239) -commit 0557c7bb247f6b90ce9e389b9bde341f98dca667 +commit https://github.com/intel/llvm/commit/0557c7bb247f6b90ce9e389b9bde341f98dca667 [NFC][SYCL] Use __SYCL2020_DEPRECATED macro for any/all builtins (#13237) -commit 3f841ff1d21bac2601beaf087bc3d09170af6d35 +commit https://github.com/intel/llvm/commit/3f841ff1d21bac2601beaf087bc3d09170af6d35 [SYCL] Deprecate legacy information descriptors (#13279) -commit 902dadc476617379a8d38206d6f2183b657acf62 +commit https://github.com/intel/llvm/commit/902dadc476617379a8d38206d6f2183b657acf62 [SYCL] Add alternative to deprecated barrier() function for sub-group (#13276) -commit fc9d62f6c93a47b5b980e4c2840f349c4b2db93a +commit https://github.com/intel/llvm/commit/fc9d62f6c93a47b5b980e4c2840f349c4b2db93a [SYCL][AMDGCN] Provide a more helpful --offload-arch error (#13078) -commit 0c0b58686a79c8d9a8ef547a96b5c1642480e591 +commit https://github.com/intel/llvm/commit/0c0b58686a79c8d9a8ef547a96b5c1642480e591 [XPTI][INFRA] Sample E2E data collection timing test for XPTI (#13045) -commit 24699750a7f816b7ad4ebe19342210693e20a9f3 +commit https://github.com/intel/llvm/commit/24699750a7f816b7ad4ebe19342210693e20a9f3 [ESIMD][NFC][DOC] Add 'restrictions' section to gather/scatter() doc (#13196) -commit 8867d446c360048b62064828693f4d50c945a55c +commit https://github.com/intel/llvm/commit/8867d446c360048b62064828693f4d50c945a55c [spir-v][clang] Allow spirv32/spirv64 as target triples for sycl 
offloading (#13083) -commit b8f394203ec4436ddd31f72193c4c1a52e3747df +commit https://github.com/intel/llvm/commit/b8f394203ec4436ddd31f72193c4c1a52e3747df [SYCL] Fix device libraries and SYCL headers with spirv64 target (#13288) -commit 8bc909e01ece4e177ae25168995be21f0d37abc6 +commit https://github.com/intel/llvm/commit/8bc909e01ece4e177ae25168995be21f0d37abc6 [SYCL][libdevice] Build for spirv64 on Linux (#13302) -commit 363fceff578dcfa5a488b89f71f259da80aad2d7 +commit https://github.com/intel/llvm/commit/363fceff578dcfa5a488b89f71f259da80aad2d7 [SYCL][ESIMD] Don't override target triple to genx64 (#13445) -commit 9bb2b343de3308994892961b0b48838ce7f2e91d +commit https://github.com/intel/llvm/commit/9bb2b343de3308994892961b0b48838ce7f2e91d [SYCL][ClangLinkerWrapper] Fix SYCL binary creation with spirv64 triple (#14686) -commit f8926a63ce5a1634cb0533f4ab8eab2b6898caac +commit https://github.com/intel/llvm/commit/f8926a63ce5a1634cb0533f4ab8eab2b6898caac [SYCL][Libdevice] Build for spirv64 on Windows (#13649) -commit d0744751abe535c1470ca8833d5dd3b3d1a72c6b +commit https://github.com/intel/llvm/commit/d0744751abe535c1470ca8833d5dd3b3d1a72c6b [SPIR-V][Headers] Enable programs that include system headers on Windows for SPIRV32 and SPIRV64 targets (#13548) -commit 38e663ecd37de513d8e31afdfdf245cf8c9d17f0 +commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) -commit 66865607bb90f7a7ca7602e5e18d8314659ffba5 +commit https://github.com/intel/llvm/commit/66865607bb90f7a7ca7602e5e18d8314659ffba5 [SYCL][NFC] Remove legacy SYCL_EXT_ONEAPI_MATRIX_VERSION usages (#13235) -commit 9612159998a1c05525f08f0a6775d875d86da518 +commit https://github.com/intel/llvm/commit/9612159998a1c05525f08f0a6775d875d86da518 [Driver][SYCL] Cleanup redundant dependency steps (#13217) -commit c821dc934dc7934b0209b5d3f88a280bbaa7145c +commit 
https://github.com/intel/llvm/commit/c821dc934dc7934b0209b5d3f88a280bbaa7145c [SYCL] Add support for multiple filtered outputs in sycl-post-link (#12727) merge with other optional kernel features AOT improvements -commit 3ea29b2a9028b485b76339e16754e3e74c9cc7a6 +commit https://github.com/intel/llvm/commit/3ea29b2a9028b485b76339e16754e3e74c9cc7a6 [SYCL] Update root_group extension to use `this_work_item` namespace (#13304) -commit fb66f1b83559366e541381251de4281bb554613d +commit https://github.com/intel/llvm/commit/fb66f1b83559366e541381251de4281bb554613d [SYCL] Replace __builtin_bit_cast with sycl::bit_cast in imf headers (#13313) is it a bugfix? -commit 65bdffb1c9d4c474316d3e330fc3c59338e004f6 +commit https://github.com/intel/llvm/commit/65bdffb1c9d4c474316d3e330fc3c59338e004f6 [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) new feature? -commit d932fcae4aa83d12a3eb30a3003d1718429a9df1 +commit https://github.com/intel/llvm/commit/d932fcae4aa83d12a3eb30a3003d1718429a9df1 [SYCL][COMPAT] Extended device_info properties. (#13050) -commit 05644a470303c2af3385b9533b8d23ebdea99eb7 +commit https://github.com/intel/llvm/commit/05644a470303c2af3385b9533b8d23ebdea99eb7 [OpenCL] Config dependent-load flag to exclude CWD from DLL search path (#13327) do we report security issues? -commit e9befa2d10f6c23a66ac780df7a1ddda55279230 +commit https://github.com/intel/llvm/commit/e9befa2d10f6c23a66ac780df7a1ddda55279230 [SYCL][DebugInfo] Switch to nonsemantic-shader-200 for non-FPGA HW on linux (#13107) do we need to mention it? 
-commit a0d8f01c82dda1ed5227945001a179f97774474f +commit https://github.com/intel/llvm/commit/a0d8f01c82dda1ed5227945001a179f97774474f [SYCL][ESIMD] Move rdtsc function out of experimental namespace (#13417) -commit 2a1002b9fac9c4b878c6625c3cfafa61dea07ea2 +commit https://github.com/intel/llvm/commit/2a1002b9fac9c4b878c6625c3cfafa61dea07ea2 [SYCL][JIT] Load SYCL JIT lazily (#13433) -commit 4f5a5f0fba71593888f1737e0b4dbaf49c85e04b +commit https://github.com/intel/llvm/commit/4f5a5f0fba71593888f1737e0b4dbaf49c85e04b [SYCL] Fix WA for ocl query of CL_DEVICE_PROFILE (#13584) -commit 893059138f61aabeb0e1063549d7f4dd533fdfd1 +commit https://github.com/intel/llvm/commit/893059138f61aabeb0e1063549d7f4dd533fdfd1 [SYCL][Matrix spec] Add 1x64x16 combination for Intel XMX (PVC only) (#13587) or rather a new feature? -commit e17632f32fcc160add43742ccdaa6cc80cc1b0c0 +commit https://github.com/intel/llvm/commit/e17632f32fcc160add43742ccdaa6cc80cc1b0c0 [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) -commit 67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 +commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 [Driver][SYCL][FPGA] Use bundled device libraries for FPGA targets (#13693) -commit 1665cc0dd57266d2677c625725d38973cce3e8d9 +commit https://github.com/intel/llvm/commit/1665cc0dd57266d2677c625725d38973cce3e8d9 [SYCL][Graph] Enable in-order cmd-list (#13088) -commit d13fdbe4ee02c39b1939bae7da61392e75ce2c78 +commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce2c78 [Bindless][Exp] Add texture fetch functionality (#12447) or a new feature? 
-commit fbd10436a5911b12b8d77ba50397a24e6905e7a3 +commit https://github.com/intel/llvm/commit/fbd10436a5911b12b8d77ba50397a24e6905e7a3 [Driver][SYCL]Adding 'aoc -vpfp-relaxed' with -fintelfpga and -fp-model=fast (#13651) -commit 8993f3fc55489023603ceafa631e8f19824979b3 +commit https://github.com/intel/llvm/commit/8993f3fc55489023603ceafa631e8f19824979b3 [SYCL][ESIMD] Use old intrinsic for named_barrier_signal for now (#13255) does it revert the patch below? -commit d4a9254d764a0ff0be8514a6854afda833a268ce +commit https://github.com/intel/llvm/commit/d4a9254d764a0ff0be8514a6854afda833a268ce [SYCL][ESIMD] Use intrinsic for named_barrier_signal (#12982) ??? -commit 51ffc04f0f317e0395c678e1fecd654df51db955 +commit https://github.com/intel/llvm/commit/51ffc04f0f317e0395c678e1fecd654df51db955 [SYCL][libclc] Add generic addrspace overloads of vload/vstore builtins (#13092) ???? -commit 75300ab1ceee835e07086925d990f74107a84a1d +commit https://github.com/intel/llvm/commit/75300ab1ceee835e07086925d990f74107a84a1d [SYCL][libclc] Add generic fp16 math builtins for generic SPIR-V target (#13361) ??? -commit 7271d613156f2268d538f20d92ecd52b1fbc488f +commit https://github.com/intel/llvm/commit/7271d613156f2268d538f20d92ecd52b1fbc488f [SYCL][Docs] Add deprecation notice to SPV_INTEL_global_variable_decorations (#13772) do we really need to mention SPIR-V specs? -commit 0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 +commit https://github.com/intel/llvm/commit/0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 [SYCL][Joint matrix] clarify the range of the prefetch templated arguments (#13796) -commit bdaf1e27310dc2218a95f05731a422a32ea5a658 +commit https://github.com/intel/llvm/commit/bdaf1e27310dc2218a95f05731a422a32ea5a658 [libclc] Separate out generic AS support macros (#13792) ???? 
-commit 24a6b3b2f2d2a160a737fb1162c78f4cce9a8f1d +commit https://github.com/intel/llvm/commit/24a6b3b2f2d2a160a737fb1162c78f4cce9a8f1d [SYCL] Generate imported symbol files in sycl-post-link (#14189) -commit 62ea97e34e9245fb50f5718861da06e5e4425c2e +commit https://github.com/intel/llvm/commit/62ea97e34e9245fb50f5718861da06e5e4425c2e [SYCL] Exclude SYCL_EXTERNAL functions from device image with the option -support-dynamic-linking (#14103) -commit d4f2fe54047a1b415af2402a497f20e918094580 +commit https://github.com/intel/llvm/commit/d4f2fe54047a1b415af2402a497f20e918094580 [SYCL][Bindless][Exp] Remove const from non-reference and non-pointer type parameters (#14238) -commit 9800153d373eed9bb5d23acf965541ab0a99b316 +commit https://github.com/intel/llvm/commit/9800153d373eed9bb5d23acf965541ab0a99b316 [MATRIX][DOC][E2E] Add note on sm version nvidia device issue. (#14178) -commit 2bac63f5ebd62b29c8fe916a89b8b42ae536d609 +commit https://github.com/intel/llvm/commit/2bac63f5ebd62b29c8fe916a89b8b42ae536d609 [ESIMD] Infer address space of pointer that are passed through invoke_simd to ESIMD API to generate better code on BE (#14628) -commit 14aabdd3d081fea4ab7f66edc42b4b53eb9c50fe +commit https://github.com/intel/llvm/commit/14aabdd3d081fea4ab7f66edc42b4b53eb9c50fe [SYCL] Throw exception when device does not support queries in sycl_ext_intel_device_info (#14788) -commit 2442ef047a4e9e9c135beed18a92029e1aad6cad +commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad [DeviceSanitizer] Disable handling no return calls (#14652) // bugfix? 
-commit e38dcdc8bb547f4b63c7b860c1cd9948c090ffc8 +commit https://github.com/intel/llvm/commit/e38dcdc8bb547f4b63c7b860c1cd9948c090ffc8 [SYCL] Add compile target to device image properties (#14757) Not user visible, need to merged with other optional kernel features AOT patches @@ -798,22 +798,22 @@ commit e38dcdc8bb547f4b63c7b860c1cd9948c090ffc8 - Fixed shutdown sequence issues when SYCL RT is used from an application or library that has its own shutdown sequence using global destructors. intel/llvm#14153 -commit c2a4054980fd4ce4c4d6cfa425cbc71b20d5f450 +commit https://github.com/intel/llvm/commit/c2a4054980fd4ce4c4d6cfa425cbc71b20d5f450 [SYCL] Fix barrier with wait list handling (#13863) is it just a bugfix for intel/llvm#13094? -commit 11046e7d8bf07afebd79f30a528c3cbe5493d8ed +commit https://github.com/intel/llvm/commit/11046e7d8bf07afebd79f30a528c3cbe5493d8ed [SYCL] Fix queue fields cleanup for barrier vs host task deps (#14268) looks like bugfix for intel/llvm#13094 -commit 1d24713c299aa16113f390c87d4444af5b83a586 +commit https://github.com/intel/llvm/commit/1d24713c299aa16113f390c87d4444af5b83a586 [SYCL] Fix ONEAPI_DEVICE_SELECTOR handling of discard filters. (#13927) What was happening before this patch? 
-commit 775dccb43494b1d38fb84de728446053b11bd05a +commit https://github.com/intel/llvm/commit/775dccb43494b1d38fb84de728446053b11bd05a [SYCL] Allow empty and unsupported case for component_devices (#13931) This is later modified in another commit, so those two should be squashed -commit 33325d4af0b66c33f7a42f0bf584645972a738a8 +commit https://github.com/intel/llvm/commit/33325d4af0b66c33f7a42f0bf584645972a738a8 [SYCL] Fix enqueue functions taking both kernel and properties (#14743) ### Documentation @@ -830,190 +830,190 @@ commit 33325d4af0b66c33f7a42f0bf584645972a738a8 ### SYCLcompat -commit 29230c80d117e29ba113ac8522b6fd2946ac56f9 +commit https://github.com/intel/llvm/commit/29230c80d117e29ba113ac8522b6fd2946ac56f9 [SYCL][COMPAT] Fix using address of a temporary queue_ptr in util.hpp (#14440) Was it user-visible? -commit 510965a0a098313cc19e8a68cc405098dc9e9501 +commit https://github.com/intel/llvm/commit/510965a0a098313cc19e8a68cc405098dc9e9501 [SYCL][COMPAT] fixed byte-dot products to properly call cuda intrinsics (#14463) -commit 3caa78ecf53644ead4f1d5fa8bc7b4a81a1f4961 +commit https://github.com/intel/llvm/commit/3caa78ecf53644ead4f1d5fa8bc7b4a81a1f4961 [SYCL][COMPAT] Fixes SYCLCOMPAT_PROFILING_ENABLED codepath (#14574) -commit 7b538cdc4ecb33e88682eb1b36be33b73ac68caf +commit https://github.com/intel/llvm/commit/7b538cdc4ecb33e88682eb1b36be33b73ac68caf [SYCL][COMPAT] fixed atomic_compare_exchange_strong not using addressSpace template parameter (#13821) -commit d66b0baed24483da96fa135082e7c544498ce2d9 +commit https://github.com/intel/llvm/commit/d66b0baed24483da96fa135082e7c544498ce2d9 [SYCL][COMPAT] Add inline in max and min functions (#13708) -commit e40283b1234e0846d1a19be537948e865a31f360 +commit https://github.com/intel/llvm/commit/e40283b1234e0846d1a19be537948e865a31f360 Task sequence revert (#14359) This reverts PR #12453 and #13080 not sure which section this should go into -commit 93a0ec4c465ceff1bed641422e23f13ca6b8a7cd +commit 
https://github.com/intel/llvm/commit/93a0ec4c465ceff1bed641422e23f13ca6b8a7cd [SYCL] all_props_are_keys_of fix (#14433) -commit a14c0917ad741a3a27b50040e4589b56262462bc +commit https://github.com/intel/llvm/commit/a14c0917ad741a3a27b50040e4589b56262462bc [SYCL][Bindless] Update spirv read/fetch from sampled image and sampled image array (#14493) -commit c1ee064428a2d4038021dc3284a4c2f3aa897cb8 +commit https://github.com/intel/llvm/commit/c1ee064428a2d4038021dc3284a4c2f3aa897cb8 [SYCL][Bindless] Fix OpaqueFD/Win32Handle's scope in piextImportExternalMemory/Semaphore (#14266) -commit 493e78be6020ef436634b21d93069467fa6c69e7 +commit https://github.com/intel/llvm/commit/493e78be6020ef436634b21d93069467fa6c69e7 [SYCL][Graph] Fix PI Kernel leak in graph update (#14029) -commit 9ec73a21782de1d11d08e97d63a27fa8b208c1e5 +commit https://github.com/intel/llvm/commit/9ec73a21782de1d11d08e97d63a27fa8b208c1e5 [SYCL] Add work_group_num_dim metadata (#13600) Fixes reqd_work_group_size for HIP -commit 14ee7e1cca79cac97ecc41ddc15d5d724011c89a +commit https://github.com/intel/llvm/commit/14ee7e1cca79cac97ecc41ddc15d5d724011c89a [SYCL][Bindless][Exp] Remove unneeded function argument causing memory leak in image create functions (#13364) -commit 4b993a7b32f7743980bce646765a1b427b0996b6 +commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b0996b6 Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) - revert commit seems to be a part of a previous release + revert commit seems to be a part of a previous release -commit ea7ba1b965302277fc23ef48dba83b10e6c734e9 +commit https://github.com/intel/llvm/commit/ea7ba1b965302277fc23ef48dba83b10e6c734e9 [ESIMD] Restore the lowering of lsc_load_stateless in sycl-post-link (#13104) -commit 2053be298d1bf2417ad0b2efaf0d9360650ed491 +commit https://github.com/intel/llvm/commit/2053be298d1bf2417ad0b2efaf0d9360650ed491 [SYCL][COMPAT] Reverted
nd_barrier atomic_ref to acq_rel (NVPTX) (#13641) do we ned to mention this? do we need to drop some other item? -commit 267a03cd1ba5eaa55db95800712f978b93842bc5 +commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b93842bc5 [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) ??? -commit 5794326b965071a69273a1f653405670b728e66b +commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728e66b [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) ??? -commit 014004cf0f7cc21195a4a0ed4f16a003ecb7be72 +commit https://github.com/intel/llvm/commit/014004cf0f7cc21195a4a0ed4f16a003ecb7be72 [SYCL] event() fail fast (#13419) What was the problem with this? -commit bc9e30eff79a091bb3db3fc1a005009049734798 +commit https://github.com/intel/llvm/commit/bc9e30eff79a091bb3db3fc1a005009049734798 [SYCL] Use 32-bit integers where it's appropriate for matrix instructions (#12867) do we even need to mention this? -commit 0fde69dbfa18e0c9b477a916477297a832e194a3 +commit https://github.com/intel/llvm/commit/0fde69dbfa18e0c9b477a916477297a832e194a3 [SYCL] Do not enable SPV_KHR_bit_instructions until downstream tools are ready (#13044) Perhaps it can be fully omitted, because it may have been "reverted" later -commit f170c63ed329c1fa5271d67e68144ec5d7808079 +commit https://github.com/intel/llvm/commit/f170c63ed329c1fa5271d67e68144ec5d7808079 [SYCL] Fix kernel shortcut path for inorder queue (#13333) - could be related to a commit made post-March release, i.e. it can probably be squashed with some other line + could be related to a commit made post-March release, i.e.
it can probably be squashed with some other line -commit 5332773b17efbf10e1b72cd633c1d7e2b4f75125 +commit https://github.com/intel/llvm/commit/5332773b17efbf10e1b72cd633c1d7e2b4f75125 [SYCL][ESIMD] atomic_update with data size less than 4 bytes should use LSC atomics (#13340) -commit 4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 +commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) -commit 13a80f87098f4e6db25b75b46736ebd967110953 +commit https://github.com/intel/llvm/commit/13a80f87098f4e6db25b75b46736ebd967110953 [DeviceSanitizer] Strip off pointer casts and inbounds GEPs (#13262) all device sanitizer PRs can probably be merged into a single line -commit 4723efc481cc18160cfa2f76d89378a84c43df64 +commit https://github.com/intel/llvm/commit/4723efc481cc18160cfa2f76d89378a84c43df64 [SYCL][DeviceSanitizer] Checking "sycl::free" related errors (#12882) -commit 247e5e0a68b25af8d0f76855d231b9e5045b9c9a +commit https://github.com/intel/llvm/commit/247e5e0a68b25af8d0f76855d231b9e5045b9c9a [SYCL][DeviceSanitizer] Checking out-of-bounds error on sycl::local_accessor (#13503) -commit 7b4fbac8f29faa533b55c80bf4adbc51f5afe833 +commit https://github.com/intel/llvm/commit/7b4fbac8f29faa533b55c80bf4adbc51f5afe833 [DeviceSanitizer] Support multiple error reports (-fsanitize-recover=address) (#13948) -commit 2a5f9137ca2a6c1004a059ed95d3bfd79cf3ad41 +commit https://github.com/intel/llvm/commit/2a5f9137ca2a6c1004a059ed95d3bfd79cf3ad41 [DeviceSanitizer] Support detecting misaligned access error (#14148) -commit a0cc14f9ff5aad889b31534a27e4a39d5b2c25c2 +commit https://github.com/intel/llvm/commit/a0cc14f9ff5aad889b31534a27e4a39d5b2c25c2 [DeviceSanitizer]Change ASan shadow scale from 3 to 4 (#13857) -commit 0939f39818225ce3e469e08f6a45711b449a8ad4 +commit https://github.com/intel/llvm/commit/0939f39818225ce3e469e08f6a45711b449a8ad4 [SYCL] Align assert ext name with libdevice 
implementation (#13312) can likely be ommitted -commit 5ab1f762821abf36412b2b8d0e529285553fa472 +commit https://github.com/intel/llvm/commit/5ab1f762821abf36412b2b8d0e529285553fa472 [SYCL][ESIMD] Fix simd_view template argument and add nested simd_view tests (#13231) -commit 3c7f99d891cdd7c929b38b18bf6877c3c8dba163 +commit https://github.com/intel/llvm/commit/3c7f99d891cdd7c929b38b18bf6877c3c8dba163 [SYCL][Graph] Fix potential issue with command buffer commands (#13224) -commit c4c456bd74945b0ea2faf9ca54b28bb02f36cd49 +commit https://github.com/intel/llvm/commit/c4c456bd74945b0ea2faf9ca54b28bb02f36cd49 [SYCL] fix for kernel_compiler (#13214) -commit bcf7d4df6acf33a75c195215afad78113d14ae2d +commit https://github.com/intel/llvm/commit/bcf7d4df6acf33a75c195215afad78113d14ae2d [SYCL] kernel_compiler opencl query fix (#13448) -commit d6340b67391cd8e9e4c7775a3c1ada8f2755bb06 +commit https://github.com/intel/llvm/commit/d6340b67391cd8e9e4c7775a3c1ada8f2755bb06 [SYCL][Graph] in-order queue barrier fix (#13193) -commit fffe9a10d1d65d97302fd0ec88ce015ab625033d +commit https://github.com/intel/llvm/commit/fffe9a10d1d65d97302fd0ec88ce015ab625033d [clang][FE][Cuda] Fix a sm90a cuda arch define check in TargetInfo (#12885) -commit 0e892be1316ccf019688e420eaa770ec4a4a30fa +commit https://github.com/intel/llvm/commit/0e892be1316ccf019688e420eaa770ec4a4a30fa [SYCL][COMPAT] Specify proper namespace for abs with sycl::complex (#13518) -commit 628ede6edf2448c531bea7f818dc6819d9e7393f +commit https://github.com/intel/llvm/commit/628ede6edf2448c531bea7f818dc6819d9e7393f [libclc] Fix UB in double->int conversions (#13546) -commit 13a06d8c6bb468165fbdd2a2fc24dc79d6110b4f +commit https://github.com/intel/llvm/commit/13a06d8c6bb468165fbdd2a2fc24dc79d6110b4f [ESIMD] Use 1-element mask for load_2d()/store_2d()/prefetch_2d() (#13613) -commit 646db9cdb1899bbfefbbcb77d6ea256c4e9789c0 +commit https://github.com/intel/llvm/commit/646db9cdb1899bbfefbbcb77d6ea256c4e9789c0 [SYCL][Matrix] Fix checked 
matrix instructions (#13287) If I understand correctly, that is a non-functional change. @MrSidims? -commit 6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a +commit https://github.com/intel/llvm/commit/6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a [SYCL] Fix Level-Zero's `sycl::make_device` interop (#13483) -commit 64cb0cf96de28bfd495e577b4dd46c26dbb6b197 +commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6b197 [SYCL][Graph] Fix minor issues in graph update code (#13660) -commit b4e0450207b5a85d5b985de0c0ff6fecdfebf0da +commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfebf0da [SYCL][Graph] Export missing graph node symbols (#13744) -commit 6934bcfb13415dc5bda85876b5cfc361678523f4 +commit https://github.com/intel/llvm/commit/6934bcfb13415dc5bda85876b5cfc361678523f4 [SYCL] Do not attach reqd_work_group_size info when multiple are detected (#13523) -commit f90554f8ec4a9e58a9e96865afed98deb9615ef4 +commit https://github.com/intel/llvm/commit/f90554f8ec4a9e58a9e96865afed98deb9615ef4 [SYCL][Docs] Fix variadic properties ctor (#13676) -commit 601f12103f2cb3bed8c61a9b25122144bf9a663c +commit https://github.com/intel/llvm/commit/601f12103f2cb3bed8c61a9b25122144bf9a663c [clang][FE] Remove duplicate preprocessor defines of HIP memory scope (#12871) -commit 563904b2aebb791adf0e1ad955a43e226c9a6caf +commit https://github.com/intel/llvm/commit/563904b2aebb791adf0e1ad955a43e226c9a6caf [SYCL] Add aspect names to sycl_used_aspects before cleaning up (#13486) part of optional kernel features AOT? -commit 8eff95ca51963dc6b4ec629da0dfaf134239cefc +commit https://github.com/intel/llvm/commit/8eff95ca51963dc6b4ec629da0dfaf134239cefc [SYCL] Fix FloatVecToBF16Vec build (#14161) Is it a user-visible fix? 
-commit 82f77d10dd092ea419115f61a7715655f055b7bb +commit https://github.com/intel/llvm/commit/82f77d10dd092ea419115f61a7715655f055b7bb [SYCL][Graph] Fix queue recording barrier to different graphs (#14212) -commit e22cb798f8363f8e2a95a7e6df9a294b34c52fc4 +commit https://github.com/intel/llvm/commit/e22cb798f8363f8e2a95a7e6df9a294b34c52fc4 Fix Basic/image/srgba-read.cpp failure under SYCL_PREFER_UR with ONEAPI_DEVICE_SELECTOR=opencl:cpu (#14233) -commit 1b5c5a8e96502b196c91251fa6513a6ede1257f5 +commit https://github.com/intel/llvm/commit/1b5c5a8e96502b196c91251fa6513a6ede1257f5 [SYCL] Fix SYCL_EXTERNAL device code when linking with a static lib (#14256) -commit d77a348776672316f59c59dc3b11ebf5aa79f936 +commit https://github.com/intel/llvm/commit/d77a348776672316f59c59dc3b11ebf5aa79f936 [SYCL][NVPTX] Emit 'grid_constant' annotations for by-val kernel params (#14332) -commit 5ad97902643da043233ec21ac203cca329df07b2 +commit https://github.com/intel/llvm/commit/5ad97902643da043233ec21ac203cca329df07b2 [SYCL][Graph] Fix profiling info when bypassing scheduler (#14678) -commit daaece06ce68544eaae078899c559f571297d8c0 +commit https://github.com/intel/llvm/commit/daaece06ce68544eaae078899c559f571297d8c0 [SYCL][Graph] Fix access modes not being respected (#13011) -commit c63b49ddfacf2f17135663a320e15e93be2971aa +commit https://github.com/intel/llvm/commit/c63b49ddfacf2f17135663a320e15e93be2971aa [Driver][SYCL] Address issue with improper bundler call with -fsycl-link (#13002) -commit 6e9a3dd987ce6f1c7384623713cea14f084cab9d +commit https://github.com/intel/llvm/commit/6e9a3dd987ce6f1c7384623713cea14f084cab9d [SYCL] Fix 'ignore-device-selectors' sycl-ls CLI option on windows (#13047) -commit b13a3c4c39a356c47cda983350f06000330a42f1 +commit https://github.com/intel/llvm/commit/b13a3c4c39a356c47cda983350f06000330a42f1 [libclc][hip] Fix half shuffles and reenable reduction test (#13016) -commit 0360e6af2a353210d508633a60ff02327094f7e7 +commit 
https://github.com/intel/llvm/commit/0360e6af2a353210d508633a60ff02327094f7e7 [SYCL] Follow up fixes for group_sort extension (#14591) -commit d39563ad1faa1d503c0396a137afd6664756b358 +commit https://github.com/intel/llvm/commit/d39563ad1faa1d503c0396a137afd6664756b358 [SYCL][Clang] Fix address space for virtual table support (#13629) ## API/ABI Breaking Changes @@ -1055,7 +1055,7 @@ of some classes to use so-called preview implementation. - Removed number of deprecated ESIMD APIs. intel/llvm#14415 - Removed deprecated overloads of math built-ins accepting raw pointers. intel/llvm#13238 Is it negated by the following commit? - - commit efed3bb04f3c43baf3373bf35d8924bbcf91f385 + - commit https://github.com/intel/llvm/commit/efed3bb04f3c43baf3373bf35d8924bbcf91f385 [SYCL] Allow raw pointers in SYCL math builtins (#13893) - Removed non-standard `sycl::id` -> `sycl::range` conversion operator. intel/llvm#13293 - Removed deprecated APIs from @@ -1092,17 +1092,17 @@ Breaking changes were also made to compiler flags: - Deprecated `-fsycl-disable-range-rounding` flag in favor of the new `-fsycl-range-rounding`. intel/llvm#12715 -commit 00b9b6d5db3de7257229f5c8b6aba4163a8f8977 +commit https://github.com/intel/llvm/commit/00b9b6d5db3de7257229f5c8b6aba4163a8f8977 [SYCL][ESIMD][ABI Break] Remove predec atomic op (#14480) Is it user visible? Is it an API break? -commit 9457144dd784d786c8f7e994bcb804f123cfb587 +commit https://github.com/intel/llvm/commit/9457144dd784d786c8f7e994bcb804f123cfb587 [ABI-Break][SYCL] Remove ESIMD emulator code from pi.cpp (#13234) // is it really user-visible? 
## Known Issues -commit 33c0829f3e3389006662845784980b930faf3b38 +commit https://github.com/intel/llvm/commit/33c0829f3e3389006662845784980b930faf3b38 Author: Igor Chorążewicz Date: Thu Jul 25 23:00:19 2024 -0700 @@ -1115,7 +1115,7 @@ Date: Thu Jul 25 23:00:19 2024 -0700 Co-authored-by: Krzysztof Swiecicki Co-authored-by: Steffen Larsen -commit 450683b6fa1d1be1b9391905f43073b7a9555aa1 +commit https://github.com/intel/llvm/commit/450683b6fa1d1be1b9391905f43073b7a9555aa1 Author: Yang Zhao Date: Thu Jul 25 00:02:46 2024 +0800 @@ -1136,7 +1136,7 @@ Date: Thu Jul 25 00:02:46 2024 +0800 Co-authored-by: Wenju He Co-authored-by: Kenneth Benzie (Benie) -commit bd97280007ee79bf118fdbade3d9cb14721b9014 +commit https://github.com/intel/llvm/commit/bd97280007ee79bf118fdbade3d9cb14721b9014 Author: aarongreig Date: Wed Jul 24 07:20:14 2024 +0100 @@ -1150,7 +1150,7 @@ Date: Wed Jul 24 07:20:14 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 5667218ed6be6dee8877efcc2fbcfc2ecd515cff +commit https://github.com/intel/llvm/commit/5667218ed6be6dee8877efcc2fbcfc2ecd515cff Author: Kenneth Benzie (Benie) Date: Tue Jul 16 18:05:26 2024 +0100 @@ -1163,7 +1163,7 @@ Date: Tue Jul 16 18:05:26 2024 +0100 * https://github.com/oneapi-src/unified-runtime/pull/1772 * https://github.com/oneapi-src/unified-runtime/pull/1862 -commit 44861fec406fff7a20bd4791c4288d71828912cc +commit https://github.com/intel/llvm/commit/44861fec406fff7a20bd4791c4288d71828912cc Author: Callum Fare Date: Thu Jul 11 15:15:12 2024 +0100 @@ -1175,13 +1175,13 @@ Date: Thu Jul 11 15:15:12 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit c30769b122d99eb4d05bcb78f15e593491fe31ae +commit https://github.com/intel/llvm/commit/c30769b122d99eb4d05bcb78f15e593491fe31ae Author: Neil R. 
Spruit Date: Wed Jul 10 21:58:04 2024 -0700 [UR][L0] Use Intel Level Zero Driver Version String extension (#14426) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1816 --------- @@ -1189,20 +1189,20 @@ Date: Wed Jul 10 21:58:04 2024 -0700 Signed-off-by: Neil R. Spruit Co-authored-by: Kenneth Benzie (Benie) -commit 8ddd7291219256f9bcb78328cc85322037736171 +commit https://github.com/intel/llvm/commit/8ddd7291219256f9bcb78328cc85322037736171 Author: Ross Brunton Date: Wed Jul 10 15:12:23 2024 +0100 [UR] Update to new urProgramLink interface (#13085) - Pre-commit PR for + Pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1458 --------- Co-authored-by: Kenneth Benzie (Benie) -commit 13ae57f97cfb45cbcee8db6155ac8b0f7b7fbb82 +commit https://github.com/intel/llvm/commit/13ae57f97cfb45cbcee8db6155ac8b0f7b7fbb82 Author: Kenneth Benzie (Benie) Date: Wed Jul 10 10:53:12 2024 +0100 @@ -1210,13 +1210,13 @@ Date: Wed Jul 10 10:53:12 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1822 -commit db4d83e3969a5f7b5313aa5fb8466dd2ebbf9283 +commit https://github.com/intel/llvm/commit/db4d83e3969a5f7b5313aa5fb8466dd2ebbf9283 Author: Neil R. Spruit Date: Tue Jul 9 06:56:01 2024 -0700 [UR][L0] Fix Queue get info and fix Queue release decrement (#14411) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1814 --------- @@ -1224,7 +1224,7 @@ Date: Tue Jul 9 06:56:01 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Kenneth Benzie (Benie) -commit 78ae397aab9b2040be945ee2f7f73d93404ffa06 +commit https://github.com/intel/llvm/commit/78ae397aab9b2040be945ee2f7f73d93404ffa06 Author: Artur Gainullin Date: Tue Jul 9 02:37:27 2024 -0700 @@ -1232,7 +1232,7 @@ Date: Tue Jul 9 02:37:27 2024 -0700 UR PR: https://github.com/oneapi-src/unified-runtime/pull/1806 -commit ac556f9273e479c033e7dc76248fdb6861377ce7 +commit https://github.com/intel/llvm/commit/ac556f9273e479c033e7dc76248fdb6861377ce7 Author: Fábio Date: Mon Jul 8 16:23:41 2024 +0100 @@ -1240,13 +1240,13 @@ Date: Mon Jul 8 16:23:41 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit eb03091539daa68a582ceab950379ca482e118d9 +commit https://github.com/intel/llvm/commit/eb03091539daa68a582ceab950379ca482e118d9 Author: Neil R. Spruit Date: Mon Jul 8 05:50:54 2024 -0700 [UR][L0] Fix Device Info return code to report unsupported enumeration (#14407) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1809 --------- @@ -1254,13 +1254,13 @@ Date: Mon Jul 8 05:50:54 2024 -0700 Signed-off-by: Neil R. Spruit Co-authored-by: Kenneth Benzie (Benie) -commit 577c349c5f3b1c893160de2470aff5ee3f87f0bc +commit https://github.com/intel/llvm/commit/577c349c5f3b1c893160de2470aff5ee3f87f0bc Author: Neil R. 
Spruit Date: Fri Jul 5 04:30:49 2024 -0700 [UR][L0] Fix immediate command list use in Command Queues (#14341) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1802 --------- @@ -1269,7 +1269,7 @@ Date: Fri Jul 5 04:30:49 2024 -0700 Co-authored-by: Kenneth Benzie (Benie) Co-authored-by: Aaron Greig -commit f2bd076eb55a2cc79de2e9d4748967ed3cb13c9b +commit https://github.com/intel/llvm/commit/f2bd076eb55a2cc79de2e9d4748967ed3cb13c9b Author: Wu Yingcong Date: Thu Jun 27 02:26:23 2024 -0700 @@ -1281,7 +1281,7 @@ Date: Thu Jun 27 02:26:23 2024 -0700 Co-authored-by: Callum Fare -commit c6428bee93a01009291ee704dca9db6262045aed +commit https://github.com/intel/llvm/commit/c6428bee93a01009291ee704dca9db6262045aed Author: Neil R. Spruit Date: Tue Jun 25 07:03:05 2024 -0700 @@ -1289,12 +1289,12 @@ Date: Tue Jun 25 07:03:05 2024 -0700 …rivers - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1778 Signed-off-by: Neil R. 
Spruit -commit 088a9475e7c5f39ecb2b74f79a479380c9dd64be +commit https://github.com/intel/llvm/commit/088a9475e7c5f39ecb2b74f79a479380c9dd64be Author: aarongreig Date: Fri Jun 21 13:52:08 2024 +0100 @@ -1306,7 +1306,7 @@ Date: Fri Jun 21 13:52:08 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 350b56fda217ffc4677c5a3443a7844e13ac209d +commit https://github.com/intel/llvm/commit/350b56fda217ffc4677c5a3443a7844e13ac209d Author: Hugh Delaney Date: Fri Jun 21 10:30:11 2024 +0100 @@ -1318,7 +1318,7 @@ Date: Fri Jun 21 10:30:11 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit ab77ba800e6b36d0217dea053d125435f0a0b2db +commit https://github.com/intel/llvm/commit/ab77ba800e6b36d0217dea053d125435f0a0b2db Author: Kenneth Benzie (Benie) Date: Tue Jun 18 17:51:02 2024 +0100 @@ -1326,7 +1326,7 @@ Date: Tue Jun 18 17:51:02 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1623 -commit 174f7510328f49d6f24c578b226acea085489082 +commit https://github.com/intel/llvm/commit/174f7510328f49d6f24c578b226acea085489082 Author: Steffen Larsen Date: Mon Jun 17 15:01:45 2024 +0200 @@ -1341,7 +1341,7 @@ Date: Mon Jun 17 15:01:45 2024 +0200 Co-authored-by: Kenneth Benzie (Benie) -commit 5e9e7a73ce11182af6ceafc1e91996b6c79f7180 +commit https://github.com/intel/llvm/commit/5e9e7a73ce11182af6ceafc1e91996b6c79f7180 Author: aarongreig Date: Mon Jun 17 10:30:32 2024 +0100 @@ -1349,13 +1349,13 @@ Date: Mon Jun 17 10:30:32 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 579484f0ae9e5e30b9c9bd468799e1688d5de890 +commit https://github.com/intel/llvm/commit/579484f0ae9e5e30b9c9bd468799e1688d5de890 Author: Neil R. Spruit Date: Fri Jun 14 05:45:42 2024 -0700 [UR][L0] Maintain Lock of Queue while syncing the Last Command Event (#14150) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1749 --------- @@ -1363,7 +1363,7 @@ Date: Fri Jun 14 05:45:42 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Kenneth Benzie (Benie) -commit ae79b95cc07ab68fcf706d47851b93e5b299dc87 +commit https://github.com/intel/llvm/commit/ae79b95cc07ab68fcf706d47851b93e5b299dc87 Author: Hugh Delaney Date: Wed Jun 12 16:46:31 2024 +0100 @@ -1375,7 +1375,7 @@ Date: Wed Jun 12 16:46:31 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 1a885ecacc468ab324c812ab47b4af7f3b086e52 +commit https://github.com/intel/llvm/commit/1a885ecacc468ab324c812ab47b4af7f3b086e52 Author: Artur Gainullin Date: Wed Jun 12 07:30:29 2024 -0700 @@ -1383,7 +1383,7 @@ Date: Wed Jun 12 07:30:29 2024 -0700 Co-authored-by: Kenneth Benzie (Benie) -commit 7c530e154021d103259c8437233e7ba13ce98146 +commit https://github.com/intel/llvm/commit/7c530e154021d103259c8437233e7ba13ce98146 Author: aarongreig Date: Wed Jun 12 13:15:51 2024 +0100 @@ -1395,7 +1395,7 @@ Date: Wed Jun 12 13:15:51 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit c168f213381645e695eaed3500a7ba7bcc655321 +commit https://github.com/intel/llvm/commit/c168f213381645e695eaed3500a7ba7bcc655321 Author: Andrey Alekseenko Date: Wed Jun 12 07:01:54 2024 +0200 @@ -1407,7 +1407,7 @@ Date: Wed Jun 12 07:01:54 2024 +0200 Co-authored-by: Kenneth Benzie (Benie) -commit c41a0562e9a4a9ace16373b819d15a38ec467c4e +commit https://github.com/intel/llvm/commit/c41a0562e9a4a9ace16373b819d15a38ec467c4e Author: Omar Ahmed Date: Mon Jun 10 15:37:57 2024 +0100 @@ -1420,13 +1420,13 @@ Date: Mon Jun 10 15:37:57 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 3935e06bc2e3794b7eac715c069e28c30aeaee9c +commit https://github.com/intel/llvm/commit/3935e06bc2e3794b7eac715c069e28c30aeaee9c Author: Ewan Crawford Date: Mon Jun 10 12:46:11 2024 +0100 [SYCL][Graph] Combined L0 Graph Update fixes (#14111) - Bumps the L0 adapter UR commit to include several merged fixes to the L0 + Bumps the L0 adapter UR commit https://github.com/intel/llvm/commit/to include several merged fixes to the L0 adapter for implementing the SYCL-Graph update feature: * 
[Use fence rather than event for sync in L0 command-buffer @@ -1436,7 +1436,7 @@ Date: Mon Jun 10 12:46:11 2024 +0100 * [Fix L0 Event leak without return sync point](https://github.com/oneapi-src/unified-runtime/pull/1706) -commit fcfe36b705fa715b4813de95565bbba9a5b88223 +commit https://github.com/intel/llvm/commit/fcfe36b705fa715b4813de95565bbba9a5b88223 Author: Kenneth Benzie (Benie) Date: Mon Jun 10 10:16:54 2024 +0100 @@ -1449,19 +1449,19 @@ Date: Mon Jun 10 10:16:54 2024 +0100 * https://github.com/oneapi-src/unified-runtime/pull/1634 * https://github.com/oneapi-src/unified-runtime/pull/1669 -commit 0cec12826baea60a15483081b0feece49013049f +commit https://github.com/intel/llvm/commit/0cec12826baea60a15483081b0feece49013049f Author: Kenneth Benzie (Benie) Date: Wed Jun 5 11:20:25 2024 +0100 [UR] Bump HIP tag to 399430da (#14037) -commit 2838f40382bedddbda0a5f20ebeeba86310044da +commit https://github.com/intel/llvm/commit/2838f40382bedddbda0a5f20ebeeba86310044da Author: Ewan Crawford Date: Wed Jun 5 09:20:03 2024 +0100 [SYCL][Graph][L0] Correctly report when device supports update (#13987) - Bump UR L0 commit to + Bump UR L0 commit https://github.com/intel/llvm/commit/to https://github.com/oneapi-src/unified-runtime/pull/1694 so that the SYCL device aspect for supporting update in graphs is correctly reported for L0 devices. Currently, support can be incorrectly reported. 
@@ -1470,13 +1470,13 @@ Date: Wed Jun 5 09:20:03 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 20991b1c2ee906148706aa1e7ae62c1084834799 +commit https://github.com/intel/llvm/commit/20991b1c2ee906148706aa1e7ae62c1084834799 Author: Kenneth Benzie (Benie) Date: Wed Jun 5 08:48:18 2024 +0100 [UR] Bump CUDA tag to 0e38fda0 (#14030) -commit 18c4fb2c57f3b937451becda4ca25468397128f5 +commit https://github.com/intel/llvm/commit/18c4fb2c57f3b937451becda4ca25468397128f5 Author: Pietro Ghiglio Date: Tue Jun 4 18:37:40 2024 +0200 @@ -1484,7 +1484,7 @@ Date: Tue Jun 4 18:37:40 2024 +0200 Testing for https://github.com/oneapi-src/unified-runtime/pull/1527 -commit 781b75abfd1dac36a2c68fbc13bd6f1bb845d35b +commit https://github.com/intel/llvm/commit/781b75abfd1dac36a2c68fbc13bd6f1bb845d35b Author: Wu Yingcong Date: Tue Jun 4 06:09:03 2024 -0700 @@ -1496,7 +1496,7 @@ Date: Tue Jun 4 06:09:03 2024 -0700 Co-authored-by: Kenneth Benzie (Benie) -commit f2a2de3b6e735ee4a54ecc212b648f370e47abbc +commit https://github.com/intel/llvm/commit/f2a2de3b6e735ee4a54ecc212b648f370e47abbc Author: Ewan Crawford Date: Thu May 30 14:28:35 2024 +0100 @@ -1509,7 +1509,7 @@ Date: Thu May 30 14:28:35 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 1fa2ac88a1fb3a5eba0315c03faa03c2d8e3c5f7 +commit https://github.com/intel/llvm/commit/1fa2ac88a1fb3a5eba0315c03faa03c2d8e3c5f7 Author: Kenneth Benzie (Benie) Date: Thu May 30 12:46:50 2024 +0100 @@ -1517,7 +1517,7 @@ Date: Thu May 30 12:46:50 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1604 -commit e147f3673c77e566a63a1d4d57d6f5da0153cbdb +commit https://github.com/intel/llvm/commit/e147f3673c77e566a63a1d4d57d6f5da0153cbdb Author: Konrad Kusiak Date: Thu May 30 12:46:39 2024 +0100 @@ -1525,18 +1525,18 @@ Date: Thu May 30 12:46:39 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1603 -commit 8086df575d7f622017521fcd2f8b2b90fdd49d39 +commit 
https://github.com/intel/llvm/commit/8086df575d7f622017521fcd2f8b2b90fdd49d39 Author: Neil R. Spruit Date: Thu May 30 02:57:41 2024 -0700 [UR][L0] Fix Multi Device Event Cache for shared Root Device (#13917) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1667 Signed-off-by: Neil R. Spruit -commit 16e0670ab6e2425a20e13aec2c7f5896fd4eabfc +commit https://github.com/intel/llvm/commit/16e0670ab6e2425a20e13aec2c7f5896fd4eabfc Author: Ross Brunton Date: Fri May 24 14:25:29 2024 +0100 @@ -1549,13 +1549,13 @@ Date: Fri May 24 14:25:29 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 7fa793bc7d17b9447ac0726bd01eb33680432d38 +commit https://github.com/intel/llvm/commit/7fa793bc7d17b9447ac0726bd01eb33680432d38 Author: Kenneth Benzie (Benie) Date: Fri May 24 13:33:08 2024 +0100 [UR] Bump L0 tag to e4287455 (#13910) -commit f05c1c82d07a81050db4931eef6b8d02d359a325 +commit https://github.com/intel/llvm/commit/f05c1c82d07a81050db4931eef6b8d02d359a325 Author: Hugh Delaney Date: Wed May 22 14:37:14 2024 +0100 @@ -1570,7 +1570,7 @@ Date: Wed May 22 14:37:14 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 5a09c6a15279484434df299d9164d94b96d3507a +commit https://github.com/intel/llvm/commit/5a09c6a15279484434df299d9164d94b96d3507a Author: Kenneth Benzie (Benie) Date: Wed May 22 10:42:06 2024 +0100 @@ -1578,7 +1578,7 @@ Date: Wed May 22 10:42:06 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1401 -commit ba19132218050c4791c5aa82316cc10e38986f75 +commit https://github.com/intel/llvm/commit/ba19132218050c4791c5aa82316cc10e38986f75 Author: Hugh Delaney Date: Thu May 16 15:40:30 2024 +0100 @@ -1590,13 +1590,13 @@ Date: Thu May 16 15:40:30 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 99d7097c10ae92805c6a100ffb544bdf0630c063 +commit https://github.com/intel/llvm/commit/99d7097c10ae92805c6a100ffb544bdf0630c063 Author: Neil R. 
Spruit Date: Thu May 16 07:06:42 2024 -0700 [UR][L0] ensure a valid kernel handle for the device when reading max wg (#13797) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1611 --------- @@ -1604,19 +1604,19 @@ Date: Thu May 16 07:06:42 2024 -0700 Signed-off-by: Neil R. Spruit Co-authored-by: Kenneth Benzie (Benie) -commit 34292bbc89f71233ef687652c33c52b55a38839e +commit https://github.com/intel/llvm/commit/34292bbc89f71233ef687652c33c52b55a38839e Author: Neil R. Spruit Date: Wed May 15 07:43:11 2024 -0700 [UR][L0] Fix timestamp event evict after delete (#13717) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1592 Signed-off-by: Neil R. Spruit Co-authored-by: Kenneth Benzie (Benie) -commit 9bf81044bfbe229b6846c96819a470e62065469a +commit https://github.com/intel/llvm/commit/9bf81044bfbe229b6846c96819a470e62065469a Author: Ewan Crawford Date: Wed May 15 15:05:06 2024 +0100 @@ -1628,18 +1628,18 @@ Date: Wed May 15 15:05:06 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit a5da94d1fb9a46f0a8334db500f26d30b62c1c02 +commit https://github.com/intel/llvm/commit/a5da94d1fb9a46f0a8334db500f26d30b62c1c02 Author: Neil R. Spruit Date: Fri May 10 02:20:00 2024 -0700 [UR][L0] Disable Usage of Driver In order Lists by default (#13715) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1591 Signed-off-by: Neil R. 
Spruit -commit 8736efe32c7280335607b3d50f85692e29038097 +commit https://github.com/intel/llvm/commit/8736efe32c7280335607b3d50f85692e29038097 Author: Fábio Date: Wed May 8 13:25:04 2024 +0100 @@ -1648,7 +1648,7 @@ Date: Wed May 8 13:25:04 2024 +0100 UR PR: https://github.com/oneapi-src/unified-runtime/pull/1499 -commit c6be822ba3fbf1dc7c2f89805493400704ad89b5 +commit https://github.com/intel/llvm/commit/c6be822ba3fbf1dc7c2f89805493400704ad89b5 Author: Neil R. Spruit Date: Tue May 7 02:47:56 2024 -0700 @@ -1656,7 +1656,7 @@ Date: Tue May 7 02:47:56 2024 -0700 Signed-off-by: Neil R. Spruit -commit dd183bf2a706571e29428a425d3a5f9bb6133a69 +commit https://github.com/intel/llvm/commit/dd183bf2a706571e29428a425d3a5f9bb6133a69 Author: aarongreig Date: Tue May 7 10:30:00 2024 +0100 @@ -1664,7 +1664,7 @@ Date: Tue May 7 10:30:00 2024 +0100 UR PR https://github.com/oneapi-src/unified-runtime/pull/1513 -commit 85037b20a9131400ce7cddac9c215adf563b6577 +commit https://github.com/intel/llvm/commit/85037b20a9131400ce7cddac9c215adf563b6577 Author: Kenneth Benzie (Benie) Date: Fri May 3 19:22:38 2024 +0100 @@ -1672,7 +1672,7 @@ Date: Fri May 3 19:22:38 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1549 -commit d1dddccded89ee1b34a120575726022ef8c97634 +commit https://github.com/intel/llvm/commit/d1dddccded89ee1b34a120575726022ef8c97634 Author: Piotr Balcer Date: Fri May 3 12:57:22 2024 +0200 @@ -1680,13 +1680,13 @@ Date: Fri May 3 12:57:22 2024 +0200 Co-authored-by: Kenneth Benzie (Benie) -commit 1a5595f8e43a12fc361c8868b04f265182259657 +commit https://github.com/intel/llvm/commit/1a5595f8e43a12fc361c8868b04f265182259657 Author: Neil R. 
Spruit Date: Thu May 2 11:31:48 2024 -0700 [UR][L0] Enable Batching out of order commands without signal events (#13462) - - pre-commit PR for + - pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1526 --------- @@ -1694,7 +1694,7 @@ Date: Thu May 2 11:31:48 2024 -0700 Signed-off-by: Neil R. Spruit Co-authored-by: Kenneth Benzie (Benie) -commit f34a65012c21192d6f90c10a893cffb35a250dff +commit https://github.com/intel/llvm/commit/f34a65012c21192d6f90c10a893cffb35a250dff Author: Konrad Kusiak Date: Thu May 2 07:22:06 2024 +0100 @@ -1706,7 +1706,7 @@ Date: Thu May 2 07:22:06 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 4c7baa7aa553ce5a6f68eeb74851ece279efbd3d +commit https://github.com/intel/llvm/commit/4c7baa7aa553ce5a6f68eeb74851ece279efbd3d Author: jinge90 Date: Tue Apr 30 21:20:23 2024 +0800 @@ -1719,7 +1719,7 @@ Date: Tue Apr 30 21:20:23 2024 +0800 Signed-off-by: jinge90 Co-authored-by: Kenneth Benzie (Benie) -commit 15c9c62bc171c849588fa58029f4c40dc142e80f +commit https://github.com/intel/llvm/commit/15c9c62bc171c849588fa58029f4c40dc142e80f Author: Omar Ahmed <30423288+omarahmed1111@users.noreply.github.com> Date: Tue Apr 30 11:04:50 2024 +0100 @@ -1736,26 +1736,26 @@ Date: Tue Apr 30 11:04:50 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 7cd48cbac6ddcb3e950748b76acd42812fab18bb +commit https://github.com/intel/llvm/commit/7cd48cbac6ddcb3e950748b76acd42812fab18bb Author: Ben Tracy Date: Mon Apr 29 08:37:48 2024 +0100 - [SYCL][Graph] Bump UR commit for in-order L0 optimization (#13565) + [SYCL][Graph] Bump UR commit https://github.com/intel/llvm/commit/for in-order L0 optimization (#13565) - - Bumps commit only and includes minimal pi2ur changes for new + - Bumps commit https://github.com/intel/llvm/commit/only and includes minimal pi2ur changes for new descriptor members - In-order path not currently used, enable profiling by default (match previous behaviour) UR PR: 
https://github.com/oneapi-src/unified-runtime/pull/1442 -commit 95420f09ea81539d8c18fbb7d7406ec82947aeb5 +commit https://github.com/intel/llvm/commit/95420f09ea81539d8c18fbb7d7406ec82947aeb5 Author: Winston Zhang Date: Fri Apr 26 06:56:11 2024 -0700 [UR][L0] Testing for counter-based-events implementation in URT draft (#12848) - commit tag: 4134bfce72d33e89eebcad11186bdf00310bba83 + commit https://github.com/intel/llvm/commit/tag: 4134bfce72d33e89eebcad11186bdf00310bba83 URT PR: https://github.com/oneapi-src/unified-runtime/pull/1370 --------- @@ -1763,13 +1763,13 @@ Date: Fri Apr 26 06:56:11 2024 -0700 Signed-off-by: Zhang, Winston Co-authored-by: Kenneth Benzie (Benie) -commit fc94a16a8a97464f96ea07bf77600d6337c00f76 +commit https://github.com/intel/llvm/commit/fc94a16a8a97464f96ea07bf77600d6337c00f76 Author: Neil R. Spruit Date: Fri Apr 26 03:16:19 2024 -0700 [UR][L0] reset command lists on error unknown (#13522) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1539 --------- @@ -1777,17 +1777,17 @@ Date: Fri Apr 26 03:16:19 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Kenneth Benzie (Benie) -commit 719207dbac44ebb1bcae96eca992276171172120 +commit https://github.com/intel/llvm/commit/719207dbac44ebb1bcae96eca992276171172120 Author: Ewan Crawford Date: Wed Apr 24 09:10:44 2024 +0100 - [SYCL][Graph] Bump UR commit to OpenCL kernel update (#12724) + [SYCL][Graph] Bump UR commit https://github.com/intel/llvm/commit/to OpenCL kernel update (#12724) - Test the UR commit that enables updating kernel commands in a + Test the UR commit https://github.com/intel/llvm/commit/that enables updating kernel commands in a command-buffer in the OpenCL adapter https://github.com/oneapi-src/unified-runtime/pull/1358 -commit 96b07cf9c3b8407194d0082b0b30170f4f232a39 +commit https://github.com/intel/llvm/commit/96b07cf9c3b8407194d0082b0b30170f4f232a39 Author: Kenneth Benzie (Benie) Date: Tue Apr 23 11:14:52 2024 +0100 @@ -1797,7 +1797,7 @@ Date: Tue Apr 23 11:14:52 2024 +0100 * https://github.com/oneapi-src/unified-runtime/pull/1183 * https://github.com/oneapi-src/unified-runtime/pull/1243 -commit 723b7b7b043783f04b6b0ec2195971a5e95f216b +commit https://github.com/intel/llvm/commit/723b7b7b043783f04b6b0ec2195971a5e95f216b Author: aarongreig Date: Fri Apr 19 22:15:55 2024 +0100 @@ -1810,7 +1810,7 @@ Date: Fri Apr 19 22:15:55 2024 +0100 Also remove now unnecessary XFAIL from Basic/kernel_max_wg_size.cpp -commit 8cd2eb0ac2efc65cd109e0bfce02aedd69ce4cf2 +commit https://github.com/intel/llvm/commit/8cd2eb0ac2efc65cd109e0bfce02aedd69ce4cf2 Author: Igor Chorążewicz Date: Tue Apr 16 11:17:23 2024 -0700 @@ -1822,7 +1822,7 @@ Date: Tue Apr 16 11:17:23 2024 -0700 Fix this by making CHECK-NOT only match output generated by UR_L0_DEBUG. 
-commit 9958a742ab498b89fb5c49ccbe94fe6f9a7a6bf6 +commit https://github.com/intel/llvm/commit/9958a742ab498b89fb5c49ccbe94fe6f9a7a6bf6 Author: Kenneth Benzie (Benie) Date: Tue Apr 16 18:00:03 2024 +0100 @@ -1830,7 +1830,7 @@ Date: Tue Apr 16 18:00:03 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1510 -commit c959b5313c6c74f206609b526272206a4b144315 +commit https://github.com/intel/llvm/commit/c959b5313c6c74f206609b526272206a4b144315 Author: Hugh Delaney Date: Tue Apr 16 07:56:06 2024 -0500 @@ -1838,7 +1838,7 @@ Date: Tue Apr 16 07:56:06 2024 -0500 https://github.com/oneapi-src/unified-runtime/pull/1437 -commit 684cd90e22fe67d4a524be92c69e026cca262f1c +commit https://github.com/intel/llvm/commit/684cd90e22fe67d4a524be92c69e026cca262f1c Author: aarongreig Date: Tue Apr 16 13:17:43 2024 +0100 @@ -1849,7 +1849,7 @@ Date: Tue Apr 16 13:17:43 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1494 https://github.com/oneapi-src/unified-runtime/pull/1507 -commit e6d9d4c6bfabae78c29aa3b376e568974860a219 +commit https://github.com/intel/llvm/commit/e6d9d4c6bfabae78c29aa3b376e568974860a219 Author: Kenneth Benzie (Benie) Date: Tue Apr 16 10:29:06 2024 +0100 @@ -1857,7 +1857,7 @@ Date: Tue Apr 16 10:29:06 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1342 -commit a884a54914f9e9cf052591d70eb5cac20a25a210 +commit https://github.com/intel/llvm/commit/a884a54914f9e9cf052591d70eb5cac20a25a210 Author: Neil R. Spruit Date: Mon Apr 15 01:58:14 2024 -0700 @@ -1867,7 +1867,7 @@ Date: Mon Apr 15 01:58:14 2024 -0700 - reenable the image interop test with fix to image interop in release builds - - precommit PR for + - precommit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1498 --------- @@ -1875,7 +1875,7 @@ Date: Mon Apr 15 01:58:14 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Aaron Greig -commit 71358f095be30b1cccd8c39a5ac2224fab9491b5 +commit https://github.com/intel/llvm/commit/71358f095be30b1cccd8c39a5ac2224fab9491b5 Author: Kenneth Benzie (Benie) Date: Mon Apr 15 09:49:02 2024 +0100 @@ -1883,7 +1883,7 @@ Date: Mon Apr 15 09:49:02 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1317 -commit d7c5a9c6b2c9edb52f14adca5c84c3c3e3419d7b +commit https://github.com/intel/llvm/commit/d7c5a9c6b2c9edb52f14adca5c84c3c3e3419d7b Author: Konrad Kusiak Date: Mon Apr 15 08:43:44 2024 +0100 @@ -1894,7 +1894,7 @@ Date: Mon Apr 15 08:43:44 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit 16da4ec202cc818b9e79b75ecd9b7e301e3bea53 +commit https://github.com/intel/llvm/commit/16da4ec202cc818b9e79b75ecd9b7e301e3bea53 Author: Konrad Kusiak Date: Fri Apr 12 15:33:18 2024 +0100 @@ -1902,13 +1902,13 @@ Date: Fri Apr 12 15:33:18 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1395 -commit 6ba50805672c72654c8288d33960f36c09cc89bb +commit https://github.com/intel/llvm/commit/6ba50805672c72654c8288d33960f36c09cc89bb Author: Neil R. Spruit Date: Fri Apr 12 02:07:48 2024 -0700 [UR][L0] Fix regular in order command list reuse given inorder queue (#13195) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1483 --------- @@ -1916,38 +1916,38 @@ Date: Fri Apr 12 02:07:48 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Aaron Greig -commit 1c89e51aa23fbd01eab1a2bba98ffc3598470e93 +commit https://github.com/intel/llvm/commit/1c89e51aa23fbd01eab1a2bba98ffc3598470e93 Author: Ewan Crawford Date: Fri Apr 12 10:05:29 2024 +0100 [UR] Bump HIP tag to 760eaa38 (#12758) - Bump UR commit to include a bugfix for HIP UR adapter dereferencing a + Bump UR commit https://github.com/intel/llvm/commit/to include a bugfix for HIP UR adapter dereferencing a nullptr https://github.com/oneapi-src/unified-runtime/pull/1357 -commit e404d9984d1587ca130d267c342d10747bc09a1f +commit https://github.com/intel/llvm/commit/e404d9984d1587ca130d267c342d10747bc09a1f [SYCL][NATIVECPU] Threadpool implementation for Native CPU (#13176) Native CPU backend improvement to be able to run work-groups in parallel? -commit 1d52f907d28edab7e23f69175a5b00d1bbe0acdc +commit https://github.com/intel/llvm/commit/1d52f907d28edab7e23f69175a5b00d1bbe0acdc Author: Fábio Date: Wed Apr 10 17:56:05 2024 +0100 [UR] Bump CUDA tag to 6e76c98a (#12285) -commit 7cf70ddd403d3262b51d0729cdc8a19e1bec7fab +commit https://github.com/intel/llvm/commit/7cf70ddd403d3262b51d0729cdc8a19e1bec7fab Author: Kenneth Benzie (Benie) Date: Wed Apr 10 17:43:57 2024 +0100 [UR] Bump HIP tag to 08b3e8fe (#13352) -commit a14d0b548e96014c643b00927be128193781769c +commit https://github.com/intel/llvm/commit/a14d0b548e96014c643b00927be128193781769c Author: Kenneth Benzie (Benie) Date: Wed Apr 10 16:06:52 2024 +0100 [UR] Bump Native CPU tag to e2b5b7fa (#13349) -commit 60a5c90b5dc4736ff818586072b7c7a270ac40c1 +commit https://github.com/intel/llvm/commit/60a5c90b5dc4736ff818586072b7c7a270ac40c1 Author: Georgi Mirazchiyski Date: Wed Apr 10 16:06:37 2024 +0100 @@ -1959,19 +1959,19 @@ Date: Wed Apr 10 16:06:37 2024 +0100 Co-authored-by: Aaron Greig -commit e3b112bae042f3293d13dd64dc825809a4348dff +commit https://github.com/intel/llvm/commit/e3b112bae042f3293d13dd64dc825809a4348dff Author: Fábio Date: Wed Apr 10 16:06:20 2024 +0100 [UR] Bump CUDA 
tag to cda0cd94 (#12287) -commit 090323ea1c1007c12e184f8c990d6a45238529a0 +commit https://github.com/intel/llvm/commit/090323ea1c1007c12e184f8c990d6a45238529a0 Author: Kenneth Benzie (Benie) Date: Wed Apr 10 12:24:30 2024 +0100 [UR] Bump CUDA tag to 05b58992 (#13344) -commit cb28e0941683b921583553d9c3c5f29add7e42c2 +commit https://github.com/intel/llvm/commit/cb28e0941683b921583553d9c3c5f29add7e42c2 Author: Kenneth Benzie (Benie) Date: Tue Apr 9 17:47:00 2024 +0100 @@ -1981,27 +1981,27 @@ Date: Tue Apr 9 17:47:00 2024 +0100 OpenCL adapter changes from https://github.com/oneapi-src/unified-runtime/pull/1496. -commit c74a14414b1ae8070421ee07b037bd8e9b1e704a +commit https://github.com/intel/llvm/commit/c74a14414b1ae8070421ee07b037bd8e9b1e704a Author: Neil R. Spruit Date: Mon Apr 8 01:54:49 2024 -0700 [UR][L0] Fix DeviceInfo global mem free to report unsupported given MemCount==0 (#13209) - pre-commit PR for + pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1486 Signed-off-by: Neil R. Spruit -commit d86a50045bbbe488869991be49cbfe3213809d72 +commit https://github.com/intel/llvm/commit/d86a50045bbbe488869991be49cbfe3213809d72 [UR][CL] Atomic order memory capability for Intel FPGA driver (#13041) Potentially user-visible fix. -commit 2e2010e2cc4acf1375cf88ce65d3a5cb8cbc9427 +commit https://github.com/intel/llvm/commit/2e2010e2cc4acf1375cf88ce65d3a5cb8cbc9427 [UR] Add DEVICE_NOT_AVAILABLE UR error code and PI translation for same. (#13206) Does it fix any actual issues in some negative cases where we previosly reported a wrong error if device is not available? 
-commit 3288a66d48d5aee7412ad12118794f28e6634550 +commit https://github.com/intel/llvm/commit/3288a66d48d5aee7412ad12118794f28e6634550 Author: aarongreig Date: Mon Apr 1 10:04:11 2024 +0100 @@ -2009,13 +2009,13 @@ Date: Mon Apr 1 10:04:11 2024 +0100 UR PR: https://github.com/oneapi-src/unified-runtime/pull/1467 -commit 93a1abb42f352eff587cd1a081e90089c232339b +commit https://github.com/intel/llvm/commit/93a1abb42f352eff587cd1a081e90089c232339b Author: Piotr Balcer Date: Wed Mar 27 12:11:36 2024 +0100 [UR][L0] fix a deadlock on a recursive event rwlock (#13112) -commit dd78c6e9c0dc6afc6fb5757fb88c4c5b0b0fe5b5 +commit https://github.com/intel/llvm/commit/dd78c6e9c0dc6afc6fb5757fb88c4c5b0b0fe5b5 Author: Raiyan Latif Date: Fri Mar 22 09:35:09 2024 -0700 @@ -2024,7 +2024,7 @@ Date: Fri Mar 22 09:35:09 2024 -0700 Signed-off-by: Raiyan Latif Co-authored-by: Kenneth Benzie (Benie) -commit 7c70e59db3ec813021beb970ebd21034586da53e +commit https://github.com/intel/llvm/commit/7c70e59db3ec813021beb970ebd21034586da53e Author: Ewan Crawford Date: Thu Mar 21 10:28:46 2024 +0000 @@ -2036,7 +2036,7 @@ Date: Thu Mar 21 10:28:46 2024 +0000 This requirement is also explicitly mentioned in the design doc. 
-commit 43f096308b03fa4c5a7f6845461a133d6cfaceae +commit https://github.com/intel/llvm/commit/43f096308b03fa4c5a7f6845461a133d6cfaceae Author: Hugh Delaney Date: Wed Mar 20 07:04:37 2024 +0000 @@ -2048,20 +2048,20 @@ Date: Wed Mar 20 07:04:37 2024 +0000 Co-authored-by: Kenneth Benzie (Benie) -commit 1f9bf7a731b16d6d0d017c35245991ca95d0aef7 +commit https://github.com/intel/llvm/commit/1f9bf7a731b16d6d0d017c35245991ca95d0aef7 Author: Artur Gainullin Date: Tue Mar 19 14:47:58 2024 -0700 [SYCL][Graph][UR] Update UR to support updating kernel commands in command buffers for L0 (#12897) -commit cf402b8473e9b3a4ee675a6154b80f0d54b198d1 +commit https://github.com/intel/llvm/commit/cf402b8473e9b3a4ee675a6154b80f0d54b198d1 [UR][L0] Support for urUsmP2PPeerAccessGetInfoExp to query p2p access… (#12983) Strictly speaking, this may have a visible effect for end users since some of queries won't always return `false` anymore. # Mar'24 release notes -Release notes for commit range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2).. [d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) +Release notes for commit https://github.com/intel/llvm/commit/range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2).. 
[d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) ## New Features ### SYCL Compiler @@ -2211,7 +2211,7 @@ The following changes ared only in effect if the `-fpreview-breaking-changes` fl # Nov'23 release notes -Release notes for commit range f4e0d3177338..f4ed132f243a +Release notes for commit https://github.com/intel/llvm/commit/range f4e0d3177338..f4ed132f243a ## New Features ### SYCL Compiler @@ -2382,7 +2382,7 @@ The following changes ared only in effect if the `-fpreview-breaking-changes` fl # Oct'23 release notes -Release notes for commit range [`cb91c232c661..f4e0d3177338`](https://github.com/intel/llvm/compare/cb91c232c661..f4e0d3177338) +Release notes for commit https://github.com/intel/llvm/commit/range [`cb91c232c661..f4e0d3177338`](https://github.com/intel/llvm/compare/cb91c232c661..f4e0d3177338) ## New features @@ -2708,7 +2708,7 @@ Release notes for commit range [`cb91c232c661..f4e0d3177338`](https://github.com # March'23 release notes -Release notes for commit range [`ca54ea30..cb91c232`](https://github.com/intel/llvm/compare/ca54ea30...cb91c232) +Release notes for commit https://github.com/intel/llvm/commit/range [`ca54ea30..cb91c232`](https://github.com/intel/llvm/compare/ca54ea30...cb91c232) ## New features @@ -2941,7 +2941,7 @@ in certain cases likely due to strict aliasing violations [86c08b3c] # December'22 release notes -Release notes for commit range [`0f579bae..6977f1ac`](https://github.com/intel/llvm/compare/0f579bae...6977f1ac) +Release notes for commit https://github.com/intel/llvm/commit/range [`0f579bae..6977f1ac`](https://github.com/intel/llvm/compare/0f579bae...6977f1ac) ## New features @@ -3301,7 +3301,7 @@ missing. 
[cd832bff] # September'22 release notes -Release notes for commit range [`4043dda3..0f579bae`](https://github.com/intel/llvm/compare/4043dda3...0f579bae) +Release notes for commit https://github.com/intel/llvm/commit/range [`4043dda3..0f579bae`](https://github.com/intel/llvm/compare/4043dda3...0f579bae) ## New features @@ -3591,7 +3591,7 @@ Release notes for commit range [`4043dda3..0f579bae`](https://github.com/intel/l # June'22 release notes -Release notes for commit range f34ba2c..4043dda +Release notes for commit https://github.com/intel/llvm/commit/range f34ba2c..4043dda ## New features ### SYCL Compiler @@ -3965,7 +3965,7 @@ Release notes for commit range f34ba2c..4043dda # December'21 release notes -Release notes for commit range 23ca0c2..27f59d8 +Release notes for commit https://github.com/intel/llvm/commit/range 23ca0c2..27f59d8 ## New features ### SYCL Compiler @@ -4390,7 +4390,7 @@ Release notes for commit range 23ca0c2..27f59d8 # September'21 release notes -Release notes for commit range 4fc5ebe..bd68232 +Release notes for commit https://github.com/intel/llvm/commit/range 4fc5ebe..bd68232 ## New features ### SYCL Compiler @@ -4634,7 +4634,7 @@ Release notes for commit range 4fc5ebe..bd68232 # July'21 release notes -Release notes for commit range 6a49170027fb..962909fe9e78 +Release notes for commit https://github.com/intel/llvm/commit/range 6a49170027fb..962909fe9e78 ## New features - Implemented SYCL 2020 specialization constants [07b27965] [ba3d657] @@ -4873,7 +4873,7 @@ Release notes for commit range 6a49170027fb..962909fe9e78 # May'21 release notes -Release notes for commit range 2ffafb95f887..6a49170027fb +Release notes for commit https://github.com/intel/llvm/commit/range 2ffafb95f887..6a49170027fb ## New features - [ESIMD] Allowed ESIMD and regular SYCL kernels to coexist in the same @@ -5078,7 +5078,7 @@ Release notes for commit range 2ffafb95f887..6a49170027fb # January'21 release notes -Release notes for commit range 5eebd1e4bfce..2ffafb95f887 
+Release notes for commit https://github.com/intel/llvm/commit/range 5eebd1e4bfce..2ffafb95f887 ## New features ### SYCL Compiler @@ -5185,7 +5185,7 @@ Release notes for commit range 5eebd1e4bfce..2ffafb95f887 # December'20 release notes -Release notes for commit range 5d7e0925..5eebd1e4bfce +Release notes for commit https://github.com/intel/llvm/commit/range 5d7e0925..5eebd1e4bfce ## New features ### SYCL Compiler @@ -5317,7 +5317,7 @@ Release notes for commit range 5d7e0925..5eebd1e4bfce # November'20 release notes -Release notes for commit range c9d50752..5d7e0925 +Release notes for commit https://github.com/intel/llvm/commit/range c9d50752..5d7e0925 ## New features - Implemented support for new loop attribute(intel::nofusion) for FPGA @@ -5513,7 +5513,7 @@ Release notes for commit range c9d50752..5d7e0925 # September'20 release notes -Release notes for commit range 5976ff0..1fc0e4f +Release notes for commit https://github.com/intel/llvm/commit/range 5976ff0..1fc0e4f ## New features @@ -5652,7 +5652,7 @@ Release notes for commit range 5976ff0..1fc0e4f # August'20 release notes -Release notes for the commit range 75b3dc2..5976ff0 +Release notes for the commit https://github.com/intel/llvm/commit/range 75b3dc2..5976ff0 ## New features - Implemented basic support for the [Explicit SIMD extension](doc/extensions/experimental/sycl_ext_intel_esimd/sycl_ext_intel_esimd.md) @@ -5833,7 +5833,7 @@ Release notes for the commit range 75b3dc2..5976ff0 # June'20 release notes -Release notes for the commit range ba404be..24726df +Release notes for the commit https://github.com/intel/llvm/commit/range ba404be..24726df ## New features - Added switch to assume that each amount of work-items in each ND-range @@ -5987,7 +5987,7 @@ Release notes for the commit range ba404be..24726df # May'20 release notes -Release notes for the commit range ba404be..67d3d9e +Release notes for the commit https://github.com/intel/llvm/commit/range ba404be..67d3d9e ## New features - Implemented 
[reduction extension](doc/extensions/deprecated/sycl_ext_oneapi_nd_range_reductions.md) @@ -6138,7 +6138,7 @@ Release notes for the commit range ba404be..67d3d9e # March'20 release notes -Release notes for the commit range e8f1f29..ba404be +Release notes for the commit https://github.com/intel/llvm/commit/range e8f1f29..ba404be ## New features - Initial CUDA backend support [7a9a425] @@ -6310,7 +6310,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # February'20 release notes -Release notes for commit e8f1f29 +Release notes for commit https://github.com/intel/llvm/commit/e8f1f29 ## New features - Added `__builtin_intel_fpga_mem` for the FPGA SYCL device. The built-in is @@ -6447,7 +6447,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # December'19 release notes -Release notes for commit 78d80a1cc628af76f09c53673ada906a3d2f0131 +Release notes for commit https://github.com/intel/llvm/commit/78d80a1cc628af76f09c53673ada906a3d2f0131 ## New features - New attributes for Intel FPGA devices : `num_simd_work_items`, `bank_bits`, @@ -6548,7 +6548,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # November'19 release notes -Release notes for commit e0a62df4e20eaf4bdff5c7dd46cbde566fbaee90 +Release notes for commit https://github.com/intel/llvm/commit/e0a62df4e20eaf4bdff5c7dd46cbde566fbaee90 ## New features @@ -6700,7 +6700,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # October'19 release notes -Release notes for commit 918b285d8dede6ab0561fccc622f71cb858849a6 +Release notes for commit https://github.com/intel/llvm/commit/918b285d8dede6ab0561fccc622f71cb858849a6 ## New features - `cl::sycl::queue::mem_advise` method was implemented [4828db5] @@ -6891,7 +6891,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # September'19 release notes -Release notes for commit 
d4efd2ae3a708fc995e61b7da9c7419dac900372 +Release notes for commit https://github.com/intel/llvm/commit/d4efd2ae3a708fc995e61b7da9c7419dac900372 ## New features - Added support for `reqd_work_group_size` attribute. [68578d7] @@ -6996,7 +6996,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # August'19 release notes -Release notes for commit c557eb740d55e828fcf74b28d2b686c928e45318. +Release notes for commit https://github.com/intel/llvm/commit/c557eb740d55e828fcf74b28d2b686c928e45318. ## New features - Support for `image accessor` has been landed. @@ -7081,7 +7081,7 @@ Release notes for commit c557eb740d55e828fcf74b28d2b686c928e45318. # July'19 release notes -Release notes for commit 64c0262c0f0b9e1b7b2e2dcef57542a3fe3bdb97. +Release notes for commit https://github.com/intel/llvm/commit/64c0262c0f0b9e1b7b2e2dcef57542a3fe3bdb97. ## New features - `cl::sycl::stream` class support has been added. From bfcd5fe3a1bb505fa5252db5e62a6518d06d2291 Mon Sep 17 00:00:00 2001 From: Dmitry Vodoypanov Date: Tue, 20 Aug 2024 08:46:52 -0700 Subject: [PATCH 03/30] Small fix --- sycl/ReleaseNotes.md | 34 +++++++++++++++++----------------- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 9f6eaa7d4179..406ade532356 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -1,6 +1,6 @@ # Release notes Jul'24 -Release notes for commit https://github.com/intel/llvm/commit/range +Release notes for commit range [d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) ... 
[ebb3b4a21b3b0e](https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436) @@ -43,7 +43,7 @@ commit a8609a5925a3fcb2bd85636702556d15ae5574f4 [New offload][llc] Pass -relocation-model=pic option to llc when building shared libraries (#13687) commit 16007fa8be4292159f0b19e5fb911b90e3f84aa4 [SYCL][Device libs][New offload] Add missing fallback SYCL device library files (#13869) -commit /5ddc6881a4f5c2ee5f0ccbbd873e57a62bceb30d +commit 5ddc6881a4f5c2ee5f0ccbbd873e57a62bceb30d [Driver][SYCL][NewOffloadModel] Improve arch association for device (#13898) commit 7439fb46f1469cf401d89bf203f91ea22bc7ee57 [Driver][SYCL][NewOffloadModel] Hook up options for the offload-wrapper (#14001) @@ -2061,7 +2061,7 @@ commit https://github.com/intel/llvm/commit/cf402b8473e9b3a4ee675a6154b80f0d54b1 # Mar'24 release notes -Release notes for commit https://github.com/intel/llvm/commit/range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2).. [d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) +Release notes for commit range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2).. 
[d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) ## New Features ### SYCL Compiler @@ -2211,7 +2211,7 @@ The following changes ared only in effect if the `-fpreview-breaking-changes` fl # Nov'23 release notes -Release notes for commit https://github.com/intel/llvm/commit/range f4e0d3177338..f4ed132f243a +Release notes for commit range f4e0d3177338..f4ed132f243a ## New Features ### SYCL Compiler @@ -2382,7 +2382,7 @@ The following changes ared only in effect if the `-fpreview-breaking-changes` fl # Oct'23 release notes -Release notes for commit https://github.com/intel/llvm/commit/range [`cb91c232c661..f4e0d3177338`](https://github.com/intel/llvm/compare/cb91c232c661..f4e0d3177338) +Release notes for commit range [`cb91c232c661..f4e0d3177338`](https://github.com/intel/llvm/compare/cb91c232c661..f4e0d3177338) ## New features @@ -2708,7 +2708,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range [`cb91c232c6 # March'23 release notes -Release notes for commit https://github.com/intel/llvm/commit/range [`ca54ea30..cb91c232`](https://github.com/intel/llvm/compare/ca54ea30...cb91c232) +Release notes for commit range [`ca54ea30..cb91c232`](https://github.com/intel/llvm/compare/ca54ea30...cb91c232) ## New features @@ -2941,7 +2941,7 @@ in certain cases likely due to strict aliasing violations [86c08b3c] # December'22 release notes -Release notes for commit https://github.com/intel/llvm/commit/range [`0f579bae..6977f1ac`](https://github.com/intel/llvm/compare/0f579bae...6977f1ac) +Release notes for commit range [`0f579bae..6977f1ac`](https://github.com/intel/llvm/compare/0f579bae...6977f1ac) ## New features @@ -3301,7 +3301,7 @@ missing. 
[cd832bff] # September'22 release notes -Release notes for commit https://github.com/intel/llvm/commit/range [`4043dda3..0f579bae`](https://github.com/intel/llvm/compare/4043dda3...0f579bae) +Release notes for commit range [`4043dda3..0f579bae`](https://github.com/intel/llvm/compare/4043dda3...0f579bae) ## New features @@ -3591,7 +3591,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range [`4043dda3.. # June'22 release notes -Release notes for commit https://github.com/intel/llvm/commit/range f34ba2c..4043dda +Release notes for commit range f34ba2c..4043dda ## New features ### SYCL Compiler @@ -3965,7 +3965,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range f34ba2c..404 # December'21 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 23ca0c2..27f59d8 +Release notes for commit range 23ca0c2..27f59d8 ## New features ### SYCL Compiler @@ -4390,7 +4390,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range 23ca0c2..27f # September'21 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 4fc5ebe..bd68232 +Release notes for commit range 4fc5ebe..bd68232 ## New features ### SYCL Compiler @@ -4634,7 +4634,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range 4fc5ebe..bd6 # July'21 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 6a49170027fb..962909fe9e78 +Release notes for commit range 6a49170027fb..962909fe9e78 ## New features - Implemented SYCL 2020 specialization constants [07b27965] [ba3d657] @@ -4873,7 +4873,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range 6a49170027fb # May'21 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 2ffafb95f887..6a49170027fb +Release notes for commit range 2ffafb95f887..6a49170027fb ## New features - [ESIMD] Allowed ESIMD and regular SYCL kernels to coexist in the same @@ -5078,7 +5078,7 @@ Release notes for 
commit https://github.com/intel/llvm/commit/range 2ffafb95f887 # January'21 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 5eebd1e4bfce..2ffafb95f887 +Release notes for commit range 5eebd1e4bfce..2ffafb95f887 ## New features ### SYCL Compiler @@ -5185,7 +5185,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range 5eebd1e4bfce # December'20 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 5d7e0925..5eebd1e4bfce +Release notes for commit range 5d7e0925..5eebd1e4bfce ## New features ### SYCL Compiler @@ -5317,7 +5317,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range 5d7e0925..5e # November'20 release notes -Release notes for commit https://github.com/intel/llvm/commit/range c9d50752..5d7e0925 +Release notes for commit range c9d50752..5d7e0925 ## New features - Implemented support for new loop attribute(intel::nofusion) for FPGA @@ -5513,7 +5513,7 @@ Release notes for commit https://github.com/intel/llvm/commit/range c9d50752..5d # September'20 release notes -Release notes for commit https://github.com/intel/llvm/commit/range 5976ff0..1fc0e4f +Release notes for commit range 5976ff0..1fc0e4f ## New features From e4f971aebcc0786c8bc393b978936cbd76a65e1f Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 4 Sep 2024 03:47:13 -0700 Subject: [PATCH 04/30] Further updates --- sycl/ReleaseNotes.md | 257 +++++++++++-------------------------------- 1 file changed, 67 insertions(+), 190 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 406ade532356..02d134c71534 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -28,6 +28,11 @@ Release notes for commit range type. This mode is only supported by Intel GPUs. intel/llvm#13912 - Introduced `__PTX_VERSION__` macro that corresponds to the PTX version used when compiling NVPTX. intel/llvm#14621 +- Added support for `::rand` and `::srand` in device code on Intel devices. 
intel/llvm#13506 +- Added support for `sm90a` CUDA target architecture. intel/llvm#14075 +- Added support for detecting misaligned data accesses via address sanitizer. intel/llvm#14148 +- Added support for emitting multiple error reports via address sanitizer + through `-fsanitize-recover=address`. intel/llvm#13948 commit 5b3e7c8f60f0c66ec92f2a19b43ec147d40bd5ed [SYCL][New offload driver][LLVM-SPIRV] Send all translator options to linker wrapper (#13394) @@ -186,6 +191,19 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 `-fno-system-debug`. intel/llvm#13256 - Improved wording of an error about implicit `this` capture in a kernel. intel/llvm#14100 - Improved `--save-temps` to work with `-fsycl-host-compiler`. intel/llvm#114751 +- Improved error message about missing AMDGPU architecture when several values + are passed into `-fsycl-targets`. intel/llvm#13078 +- Reduced list of commands invoked to generate dependencies using `-MD` flag + by one command. intel/llvm#13217 +- Enhanced diagnostic emitted if CUDA target triple passed to `-fsycl-targets` + is incorrect. intel/llvm#14673 +- Reduced size of shadow memory used by address sanitizer to avoid running out + of memory in multi-GPU environments. intel/llvm#13857 +- Enhanced address sanitizer to be able to detect out-of-bounds access to + local accessors. intel/llvm#13503 +- Enhanced address sanitizer to detect incorrect uses of USM deallocation + functions (like calling `sycl::free` on a pointer that was not allocated as + a USM pointer). intel/llvm#12882 ### SYCL Library @@ -213,8 +231,8 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 intel/llvm#14520 intel/llvm#14485 intel/llvm#14510 intel/llvm#14483 intel/llvm#14487 intel/llvm#14488 - Added support for `sycl::vec::convert` to/from `vec`. intel/llvm#14105 -- Deprecated `marray::operator++/--`, `accessor::get_multi_ptr` for - non-device accessors.
intel/llvm#13443 +- Deprecated `marray::operator++/--`. intel/llvm#13443 +- Deprecated `accessor::get_multi_ptr` for non-device accessors. intel/llvm#13443 - Moved ESIMD named barrier APIs out of `experimental` namespace. intel/llvm#13704 - Implemented latest revision of [`sycl_ext_oneapi_free_function_queries`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_oneapi_free_function_queries.asciidoc) extension. intel/llvm#13257 @@ -238,7 +256,15 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 `opencl` backend. intel/llvm#14119 - Moved bit shift and rotate ESIMD functions out of `experimental` namespace. intel/llvm#13545 +- Moved `rdtsc` ESIMD function out of `experimental` namespace. intel/llvm#13417 - Added check for template argument `N` of `media_block_load` ESIMD API. intel/llvm#13668 +- Enhanced deprecation message for `sub_group::barrier` to indicate which API + should be used instead. intel/llvm#13276 +- Added deprecation messages for `image_max_array_size` and `opencl_c_version` + device info queries. intel/llvm#13279 +- Updated [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) + extension implementation to throw synchronous exception with + `feature_not_supported` error code.
intel/llvm#14788 commit https://github.com/intel/llvm/commit/c5b174d8507cad1328b3121e650120e85f1da213 [SYCL] Implement latest version of sycl_ext_oneapi_free_function_queries (#13257) @@ -272,6 +298,8 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 - Updated [`sycl_ext_oneapi_device_architecture`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) to return new `unknown` enumerator if device architecture cannot be properly detected. intel/llvm#14077 +- Clarified [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) + extension to specify which exact exceptions are being thrown on errors. intel/llvm#14576 commit https://github.com/intel/llvm/commit/ffc0de03f900da2d0262ea8ec41ac3847a1edbcc [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) @@ -424,13 +452,11 @@ commit https://github.com/intel/llvm/commit/dce651bd69ea12c935c70990ed3290007a00 commit https://github.com/intel/llvm/commit/4e36825beabb4b4a7435470ac633768dcbd7b376 [SYCL] Record aspect names when computing device requirements (#13974) + likely not user-visible and needs to be merged with other optional kernel features AOT items commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c [SYCL][Graph] fix the address pointer in graph print (#13595) -commit https://github.com/intel/llvm/commit/f204869281570959af82fff638df6b34151718f4 - [SYCL] Add sm90a Cuda target architecture support (#14075) - commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) @@ -444,9 +470,6 @@ commit https://github.com/intel/llvm/commit/ea2111c1a022a1bd7a818ef9796d70d22f3b commit https://github.com/intel/llvm/commit/c2ebf84fd7ffcc8f40dd9eef2aed163437792cd5 
[SYCL] Make `vec` conversion operator to scalar non-template (#14668) -commit https://github.com/intel/llvm/commit/4240ef0d9db3577b057d27233c5393cc7f6b774e - [SYCL] Add check for valid SYCL triple for NVidia GPUs. (#14673) - commit https://github.com/intel/llvm/commit/3fdfbfed1ed0062b9f3848a100093b340183c6a3 [SYCL][NATIVECPU] Support reqd_work_group_size on Native CPU (#13175) @@ -471,9 +494,6 @@ commit https://github.com/intel/llvm/commit/f51e43b2f0616934116626dc48c83282a840 commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) -commit https://github.com/intel/llvm/commit/d42d90e52d71c16739e26de353f69930cbe1f860 - [SYCL] Change the ext_intel_device_info spec to throw a feature not supported error when a query is not supported (#14576) - commit https://github.com/intel/llvm/commit/3561c9bb854d35eeb9fc4da3550334faaf316a4f [SYCL] Add support of more Intel GPU arch versions to sycl_ext_oneapi_device_architecture (#14582) @@ -558,33 +578,12 @@ commit https://github.com/intel/llvm/commit/771ffa4e967f3058c500c87297c2d1a7be15 commit https://github.com/intel/llvm/commit/e1119d9d2753dc9165e10c2e8c11e222cc549ba9 [SYCL][ESIMD] Add more compile time checks to rdregion and wrregion API (#13158) -commit https://github.com/intel/llvm/commit/c5cf452d663b96479341daa182c7e305baf542aa - [SYCL][libdevice] Add simple rand for ease of use in device (#13506) - -commit https://github.com/intel/llvm/commit/a36e9f8969a5ad4346f84c925aa89e1a00128b7f - [SYCL] APIs cleanup (#13443) - commit https://github.com/intel/llvm/commit/ef6d2bb3caf36eaa1149369f8aee1578d6e31a6e [SYCL][ESIMD] Add support for transposed prefetch for 1/2 byte elements (#13452) commit https://github.com/intel/llvm/commit/5a07640e1ce68584a60b1a0450526e928340d1e0 [SYCL][ESIMD] Add native FMA function (#13366) -commit https://github.com/intel/llvm/commit/a4fdfdad53c3f1b2e423cdf5f5f0f977ce055593 - [ESIMD][NFC][DOC] Fix misprints 
and punctuation in esimd-functions doc (#13239) - -commit https://github.com/intel/llvm/commit/0557c7bb247f6b90ce9e389b9bde341f98dca667 - [NFC][SYCL] Use __SYCL2020_DEPRECATED macro for any/all builtins (#13237) - -commit https://github.com/intel/llvm/commit/3f841ff1d21bac2601beaf087bc3d09170af6d35 - [SYCL] Deprecate legacy information descriptors (#13279) - -commit https://github.com/intel/llvm/commit/902dadc476617379a8d38206d6f2183b657acf62 - [SYCL] Add alternative to deprecated barrier() function for sub-group (#13276) - -commit https://github.com/intel/llvm/commit/fc9d62f6c93a47b5b980e4c2840f349c4b2db93a - [SYCL][AMDGCN] Provide a more helpful --offload-arch error (#13078) - commit https://github.com/intel/llvm/commit/0c0b58686a79c8d9a8ef547a96b5c1642480e591 [XPTI][INFRA] Sample E2E data collection timing test for XPTI (#13045) @@ -609,12 +608,6 @@ commit https://github.com/intel/llvm/commit/d0744751abe535c1470ca8833d5dd3b3d1a7 commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) -commit https://github.com/intel/llvm/commit/66865607bb90f7a7ca7602e5e18d8314659ffba5 - [SYCL][NFC] Remove legacy SYCL_EXT_ONEAPI_MATRIX_VERSION usages (#13235) - -commit https://github.com/intel/llvm/commit/9612159998a1c05525f08f0a6775d875d86da518 - [Driver][SYCL] Cleanup redundant dependency steps (#13217) - commit https://github.com/intel/llvm/commit/c821dc934dc7934b0209b5d3f88a280bbaa7145c [SYCL] Add support for multiple filtered outputs in sycl-post-link (#12727) merge with other optional kernel features AOT improvements @@ -641,9 +634,6 @@ commit https://github.com/intel/llvm/commit/e9befa2d10f6c23a66ac780df7a1ddda5527 [SYCL][DebugInfo] Switch to nonsemantic-shader-200 for non-FPGA HW on linux (#13107) do we need to mention it? 
-commit https://github.com/intel/llvm/commit/a0d8f01c82dda1ed5227945001a179f97774474f - [SYCL][ESIMD] Move rdtsc function out of experimental namespace (#13417) - commit https://github.com/intel/llvm/commit/2a1002b9fac9c4b878c6625c3cfafa61dea07ea2 [SYCL][JIT] Load SYCL JIT lazily (#13433) @@ -661,6 +651,8 @@ commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af commit https://github.com/intel/llvm/commit/1665cc0dd57266d2677c625725d38973cce3e8d9 [SYCL][Graph] Enable in-order cmd-list (#13088) + perf optimization + new feature? commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce2c78 [Bindless][Exp] Add texture fetch functionality (#12447) @@ -709,9 +701,6 @@ commit https://github.com/intel/llvm/commit/9800153d373eed9bb5d23acf965541ab0a99 commit https://github.com/intel/llvm/commit/2bac63f5ebd62b29c8fe916a89b8b42ae536d609 [ESIMD] Infer address space of pointer that are passed through invoke_simd to ESIMD API to generate better code on BE (#14628) -commit https://github.com/intel/llvm/commit/14aabdd3d081fea4ab7f66edc42b4b53eb9c50fe - [SYCL] Throw exception when device does not support queries in sycl_ext_intel_device_info (#14788) - commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad [DeviceSanitizer] Disable handling no return calls (#14652) // bugfix? @@ -731,6 +720,11 @@ commit https://github.com/intel/llvm/commit/e38dcdc8bb547f4b63c7b860c1cd9948c090 - Fixed a bug with incorrect file extensions being emitted in AOT compilation when `--save-temps` is used. intel/llvm#14214 - Made `-fsycl-add-default-spec-consts-image` available with `clang-cl`. intel/llvm#13168 +- Fixed an issue where performing separate compilation and linking with + `-fsycl-link` would result in "number of output files and targets should match + in unbundling mode" error emitted by the compiler during link step. 
intel/llvm#13002 +- Fixed a bug in address sanitizer which may lead to crashes when an application + is launched on OpenCL CPU device. intel/llvm#13262 ### SYCL Library @@ -797,6 +791,20 @@ commit https://github.com/intel/llvm/commit/e38dcdc8bb547f4b63c7b860c1cd9948c090 - Added missing `value_type` and `vector_t` member type aliases to swizzles. intel/llvm#13040 - Fixed shutdown sequence issues when SYCL RT is used from an application or library that has its own shutdown sequence using global destructors. intel/llvm#14153 +- Fixed a bug where calling `event::get_backend()` on a default-constructed event + in an environment where `ONEAPI_DEVICE_SELECTOR` is set and malformed would + result in a crash. intel/llvm#13419 intel/llvm#13521 +- Fixed a bug in `sycl-ls` where the `--ignore-device-selectors` option did not + actually ignore the environment variable and still honored it. intel/llvm#13047 +- Fixed memory order capabilities returned by Native CPU backend. intel/llvm#13469 +- Fixed variadic constructor of `sycl::ext::oneapi::experimental::properties` + to match the extension specification, i.e. there must be exactly one argument + of each property type in the `properties` that is not default constructible + and at most one argument of each property type in the `properties` that is + default constructible. intel/llvm#13676 +- Fixed a bug where using `load_2d`, `store_2d` or `prefetch_2d` ESIMD + functions would lead to a build program failure error emitted by device + compiler. intel/llvm#13613 commit https://github.com/intel/llvm/commit/c2a4054980fd4ce4c4d6cfa425cbc71b20d5f450 [SYCL] Fix barrier with wait list handling (#13863) @@ -830,6 +838,8 @@ commit https://github.com/intel/llvm/commit/33325d4af0b66c33f7a42f0bf584645972a7 ### SYCLcompat +- Fixed compilation issue on Windows with `syclcompat::cabs`.
intel/llvm#13518 + commit https://github.com/intel/llvm/commit/29230c80d117e29ba113ac8522b6fd2946ac56f9 [SYCL][COMPAT] Fix using address of a temporary queue_ptr in util.hpp (#14440) Was it user-visible? @@ -888,14 +898,6 @@ commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728 [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) ??? -commit https://github.com/intel/llvm/commit/014004cf0f7cc21195a4a0ed4f16a003ecb7be72 - [SYCL] event() fail fast (#13419) - What was the problem with this? - -commit https://github.com/intel/llvm/commit/bc9e30eff79a091bb3db3fc1a005009049734798 - [SYCL] Use 32-bit integers where it's appropriate for matrix instructions (#12867) - do we even need to mention this? - commit https://github.com/intel/llvm/commit/0fde69dbfa18e0c9b477a916477297a832e194a3 [SYCL] Do not enable SPV_KHR_bit_instructions until downstream tools are ready (#13044) Perhaps it can be fully omitted, because it may have been "reverted" later @@ -909,19 +911,7 @@ commit https://github.com/intel/llvm/commit/5332773b17efbf10e1b72cd633c1d7e2b4f7 commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) -commit https://github.com/intel/llvm/commit/13a80f87098f4e6db25b75b46736ebd967110953 - [DeviceSanitizer] Strip off pointer casts and inbounds GEPs (#13262) - all device sanitizer PRs can probably be merged into a single line -commit https://github.com/intel/llvm/commit/4723efc481cc18160cfa2f76d89378a84c43df64 - [SYCL][DeviceSanitizer] Checking "sycl::free" related errors (#12882) -commit https://github.com/intel/llvm/commit/247e5e0a68b25af8d0f76855d231b9e5045b9c9a - [SYCL][DeviceSanitizer] Checking out-of-bounds error on sycl::local_accessor (#13503) -commit https://github.com/intel/llvm/commit/7b4fbac8f29faa533b55c80bf4adbc51f5afe833 - [DeviceSanitizer] Support multiple error reports 
(-fsanitize-recover=address) (#13948) -commit https://github.com/intel/llvm/commit/2a5f9137ca2a6c1004a059ed95d3bfd79cf3ad41 - [DeviceSanitizer] Support detecting misaligned access error (#14148) -commit https://github.com/intel/llvm/commit/a0cc14f9ff5aad889b31534a27e4a39d5b2c25c2 - [DeviceSanitizer]Change ASan shadow scale from 3 to 4 (#13857) + was the bug really user-visible? commit https://github.com/intel/llvm/commit/0939f39818225ce3e469e08f6a45711b449a8ad4 [SYCL] Align assert ext name with libdevice implementation (#13312) @@ -941,22 +931,9 @@ commit https://github.com/intel/llvm/commit/bcf7d4df6acf33a75c195215afad78113d14 commit https://github.com/intel/llvm/commit/d6340b67391cd8e9e4c7775a3c1ada8f2755bb06 [SYCL][Graph] in-order queue barrier fix (#13193) -commit https://github.com/intel/llvm/commit/fffe9a10d1d65d97302fd0ec88ce015ab625033d - [clang][FE][Cuda] Fix a sm90a cuda arch define check in TargetInfo (#12885) - -commit https://github.com/intel/llvm/commit/0e892be1316ccf019688e420eaa770ec4a4a30fa - [SYCL][COMPAT] Specify proper namespace for abs with sycl::complex (#13518) - commit https://github.com/intel/llvm/commit/628ede6edf2448c531bea7f818dc6819d9e7393f [libclc] Fix UB in double->int conversions (#13546) -commit https://github.com/intel/llvm/commit/13a06d8c6bb468165fbdd2a2fc24dc79d6110b4f - [ESIMD] Use 1-element mask for load_2d()/store_2d()/prefetch_2d() (#13613) - -commit https://github.com/intel/llvm/commit/646db9cdb1899bbfefbbcb77d6ea256c4e9789c0 - [SYCL][Matrix] Fix checked matrix instructions (#13287) - If I understand correctly, that is a non-functional change. @MrSidims? 
- commit https://github.com/intel/llvm/commit/6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a [SYCL] Fix Level-Zero's `sycl::make_device` interop (#13483) @@ -969,20 +946,10 @@ commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfeb commit https://github.com/intel/llvm/commit/6934bcfb13415dc5bda85876b5cfc361678523f4 [SYCL] Do not attach reqd_work_group_size info when multiple are detected (#13523) -commit https://github.com/intel/llvm/commit/f90554f8ec4a9e58a9e96865afed98deb9615ef4 - [SYCL][Docs] Fix variadic properties ctor (#13676) - -commit https://github.com/intel/llvm/commit/601f12103f2cb3bed8c61a9b25122144bf9a663c - [clang][FE] Remove duplicate preprocessor defines of HIP memory scope (#12871) - commit https://github.com/intel/llvm/commit/563904b2aebb791adf0e1ad955a43e226c9a6caf [SYCL] Add aspect names to sycl_used_aspects before cleaning up (#13486) part of optional kernel features AOT? -commit https://github.com/intel/llvm/commit/8eff95ca51963dc6b4ec629da0dfaf134239cefc - [SYCL] Fix FloatVecToBF16Vec build (#14161) - Is it a user-visible fix? 
- commit https://github.com/intel/llvm/commit/82f77d10dd092ea419115f61a7715655f055b7bb [SYCL][Graph] Fix queue recording barrier to different graphs (#14212) @@ -1001,12 +968,6 @@ commit https://github.com/intel/llvm/commit/5ad97902643da043233ec21ac203cca329df commit https://github.com/intel/llvm/commit/daaece06ce68544eaae078899c559f571297d8c0 [SYCL][Graph] Fix access modes not being respected (#13011) -commit https://github.com/intel/llvm/commit/c63b49ddfacf2f17135663a320e15e93be2971aa - [Driver][SYCL] Address issue with improper bundler call with -fsycl-link (#13002) - -commit https://github.com/intel/llvm/commit/6e9a3dd987ce6f1c7384623713cea14f084cab9d - [SYCL] Fix 'ignore-device-selectors' sycl-ls CLI option on windows (#13047) - commit https://github.com/intel/llvm/commit/b13a3c4c39a356c47cda983350f06000330a42f1 [libclc][hip] Fix half shuffles and reenable reduction test (#13016) @@ -1079,6 +1040,7 @@ of some classes to use so-called preview implementation. - Removed deprecated APIs related to `sycl_ext_oneapi_free_function_queries`. intel/llvm#13257 - Moved `slm_allocator` ESIMD APIs into `experimental` namespace. intel/llvm#13901 +- Removed deprecated `usm_system_allocator` aspect. intel/llvm#13279 Breaking changes were also made to compiler flags: @@ -1177,30 +1139,15 @@ Date: Thu Jul 11 15:15:12 2024 +0100 commit https://github.com/intel/llvm/commit/c30769b122d99eb4d05bcb78f15e593491fe31ae Author: Neil R. Spruit -Date: Wed Jul 10 21:58:04 2024 -0700 - [UR][L0] Use Intel Level Zero Driver Version String extension (#14426) - - pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1816 - - --------- - - Signed-off-by: Neil R. 
Spruit - Co-authored-by: Kenneth Benzie (Benie) + Sounds like improvement to stability of driver version query commit https://github.com/intel/llvm/commit/8ddd7291219256f9bcb78328cc85322037736171 Author: Ross Brunton -Date: Wed Jul 10 15:12:23 2024 +0100 - [UR] Update to new urProgramLink interface (#13085) - - Pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1458 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) + Seems to be an internal UR bugfix/improvement commit https://github.com/intel/llvm/commit/13ae57f97cfb45cbcee8db6155ac8b0f7b7fbb82 Author: Kenneth Benzie (Benie) @@ -1212,17 +1159,9 @@ Date: Wed Jul 10 10:53:12 2024 +0100 commit https://github.com/intel/llvm/commit/db4d83e3969a5f7b5313aa5fb8466dd2ebbf9283 Author: Neil R. Spruit -Date: Tue Jul 9 06:56:01 2024 -0700 - [UR][L0] Fix Queue get info and fix Queue release decrement (#14411) - - pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1814 - - --------- - - Signed-off-by: Neil R. Spruit - Co-authored-by: Kenneth Benzie (Benie) + Could be an actual bugfix commit https://github.com/intel/llvm/commit/78ae397aab9b2040be945ee2f7f73d93404ffa06 Author: Artur Gainullin @@ -1242,17 +1181,9 @@ Date: Mon Jul 8 16:23:41 2024 +0100 commit https://github.com/intel/llvm/commit/eb03091539daa68a582ceab950379ca482e118d9 Author: Neil R. Spruit -Date: Mon Jul 8 05:50:54 2024 -0700 - [UR][L0] Fix Device Info return code to report unsupported enumeration (#14407) - - pre-commit https://github.com/intel/llvm/commit/PR for https://github.com/oneapi-src/unified-runtime/pull/1809 - - --------- - - Signed-off-by: Neil R. Spruit - Co-authored-by: Kenneth Benzie (Benie) + ??? commit https://github.com/intel/llvm/commit/577c349c5f3b1c893160de2470aff5ee3f87f0bc Author: Neil R. 
Spruit @@ -1271,15 +1202,9 @@ Date: Fri Jul 5 04:30:49 2024 -0700 commit https://github.com/intel/llvm/commit/f2bd076eb55a2cc79de2e9d4748967ed3cb13c9b Author: Wu Yingcong -Date: Thu Jun 27 02:26:23 2024 -0700 - [UR] fix use-after-free problems (#13855) - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1637 - - --------- - - Co-authored-by: Callum Fare + Related to ASAN commit https://github.com/intel/llvm/commit/c6428bee93a01009291ee704dca9db6262045aed Author: Neil R. Spruit @@ -1395,31 +1320,6 @@ Date: Wed Jun 12 13:15:51 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit https://github.com/intel/llvm/commit/c168f213381645e695eaed3500a7ba7bcc655321 -Author: Andrey Alekseenko -Date: Wed Jun 12 07:01:54 2024 +0200 - - [UR] Fix size confusion for several device property queries (#12488) - - For testing https://github.com/oneapi-src/unified-runtime/pull/1282 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/c41a0562e9a4a9ace16373b819d15a38ec467c4e -Author: Omar Ahmed -Date: Mon Jun 10 15:37:57 2024 +0100 - - [UR] Remove redundant mem type (#13058) - - Testing PR for [UR - PR](https://github.com/oneapi-src/unified-runtime/pull/1409) - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/3935e06bc2e3794b7eac715c069e28c30aeaee9c Author: Ewan Crawford Date: Mon Jun 10 12:46:11 2024 +0100 @@ -1476,14 +1376,6 @@ Date: Wed Jun 5 08:48:18 2024 +0100 [UR] Bump CUDA tag to 0e38fda0 (#14030) -commit https://github.com/intel/llvm/commit/18c4fb2c57f3b937451becda4ca25468397128f5 -Author: Pietro Ghiglio -Date: Tue Jun 4 18:37:40 2024 +0200 - - [SYCL] [NATIVECPU] Report correct memory order capabilities for Native CPU (#13469) - - Testing for https://github.com/oneapi-src/unified-runtime/pull/1527 - commit https://github.com/intel/llvm/commit/781b75abfd1dac36a2c68fbc13bd6f1bb845d35b Author: Wu Yingcong Date: Tue Jun 4 06:09:03 2024 -0700 @@ -1524,6 +1416,12 @@ Date: 
Thu May 30 12:46:39 2024 +0100 [UR] Modify fill emulation to work for patterns which are not powers of 2 (#13779) https://github.com/oneapi-src/unified-runtime/pull/1603 + Follow-up fix: + commit https://github.com/intel/llvm/commit/f34a65012c21192d6f90c10a893cffb35a250dff + Author: Konrad Kusiak + https://github.com/oneapi-src/unified-runtime/pull/1412 + + This patch is needed for #13788 commit https://github.com/intel/llvm/commit/8086df575d7f622017521fcd2f8b2b90fdd49d39 Author: Neil R. Spruit @@ -1639,15 +1537,6 @@ Date: Fri May 10 02:20:00 2024 -0700 Signed-off-by: Neil R. Spruit -commit https://github.com/intel/llvm/commit/8736efe32c7280335607b3d50f85692e29038097 -Author: Fábio -Date: Wed May 8 13:25:04 2024 +0100 - - [UR][EXP][Command-Buffer] Remove duplicated code from headers (#13374) - - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1499 - - commit https://github.com/intel/llvm/commit/c6be822ba3fbf1dc7c2f89805493400704ad89b5 Author: Neil R. Spruit Date: Tue May 7 02:47:56 2024 -0700 @@ -1694,18 +1583,6 @@ Date: Thu May 2 11:31:48 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Kenneth Benzie (Benie) -commit https://github.com/intel/llvm/commit/f34a65012c21192d6f90c10a893cffb35a250dff -Author: Konrad Kusiak -Date: Thu May 2 07:22:06 2024 +0100 - - [UR] CI for: Emulate Fill with copy when patternSize is not a power of 2 (#12912) - - https://github.com/oneapi-src/unified-runtime/pull/1412 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/4c7baa7aa553ce5a6f68eeb74851ece279efbd3d Author: jinge90 Date: Tue Apr 30 21:20:23 2024 +0800 From 52a0a632397f589c48e6cc0a4c7bcbf4629aaebc Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 4 Sep 2024 08:42:43 -0700 Subject: [PATCH 05/30] Further updates --- sycl/ReleaseNotes.md | 172 +++++++++++-------------------------------- 1 file changed, 42 insertions(+), 130 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 02d134c71534..eab0b29f5731 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -19,7 +19,17 @@ Release notes for commit range experimental range rounding mode in which range rounding is performed across all dimensions. intel/llvm#12690 - Added support for the so-called [new offloading model](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/OffloadDesign.md). It can be enabled by `--offload-new-driver` command line option and should + It can be enabled by `--offload-new-driver` command line option and provides + a better infrastructure for us. In the future we expect to leverage that + infrastructure to improve link times by reducing the amount of I/O used by the + compiler and the number of external processes that it spawns. + The foundation for this work was laid in the previous release timeframe and + the following list of PRs only includes those done within this release + timeframe.
intel/llvm#14252 intel/llvm#13394 intel/llvm#13648 intel/llvm#13687 + intel/llvm#14001 intel/llvm#14006 intel/llvm#14101 intel/llvm#14151 + intel/llvm#14253 intel/llvm#14177 intel/llvm#13672 intel/llvm#13688 + intel/llvm#13579 intel/llvm#13869 intel/llvm#14102 intel/llvm#14541 + intel/llvm#14541 intel/llvm#13898 intel/llvm#14143 allow us to improve link time by reducing amount of external processes and temporary files used by the compiler. **Do we need to list PRs here?** There were many of them and some of them were merged in scope of a previous release. @@ -34,43 +44,6 @@ Release notes for commit range - Added support for emitting multiple error reports via address sanitizer through `-fsanitize-recover=address`. intel/llvm#13948 -commit 5b3e7c8f60f0c66ec92f2a19b43ec147d40bd5ed - [SYCL][New offload driver][LLVM-SPIRV] Send all translator options to linker wrapper (#13394) -commit cd24d808382598565f9ec3d9e8faf3e83bfa9aa4 - [Driver][SYCL] Pass full set of sycl-post-link options to linker wrapper (#13648) -commit ece73ad61b49eaf9ecb6e2060e5f20e09e26def6 - [NewOffloadModel][SYCL DeviceLib] Generate SYCL device library objects using new offload model (#13579) -commit 90c659e4a04e0c74a0eb99d4c72d79c8bc0f783e - [Driver][SYCL][NewOffloadModel] Hook up -fsycl-device-only behaviors (#13672) -commit 91b840c2c44cbc10149f2c77aa4094d6c73bd73f - [Driver][SYCL][NewOffloadModel] Hook up -fsycl-device-obj support (#13688) -commit a8609a5925a3fcb2bd85636702556d15ae5574f4 - [New offload][llc] Pass -relocation-model=pic option to llc when building shared libraries (#13687) -commit 16007fa8be4292159f0b19e5fb911b90e3f84aa4 - [SYCL][Device libs][New offload] Add missing fallback SYCL device library files (#13869) -commit 5ddc6881a4f5c2ee5f0ccbbd873e57a62bceb30d - [Driver][SYCL][NewOffloadModel] Improve arch association for device (#13898) -commit 7439fb46f1469cf401d89bf203f91ea22bc7ee57 - [Driver][SYCL][NewOffloadModel] Hook up options for the offload-wrapper (#14001) -commit 
3aee3dbc23d2981c049fdfa6b12b7f759062326d - [Driver][SYCL][NewOffload] Update option passing for packager and AOT (#14066) -commit c2cdfccf4aa90a6da35826c55815f680db3b3805 - [New offload driver][sycl-post-link] Move sycl-post-link target specific options generation to linker wrapper (#14101) -commit 934b46f2fff602bcbe7c99c13a1dd7ed6955ae4f - [Driver][SYCL][NewOffload] Fix duplication of device targets (#14143) -commit 6ecce4fdab98b31a897af23444ca4d636af89862 - [New offload driver][Device lib] Add SYCL device library files for all targets (#14102) -commit f9fd95ec6c2aa4b77b0503ae1a7a82d6747df105 - [Driver][SYCL][NewOffloadModel] Incorporate -device settings for GPU (#14151) -commit 9691782beff5456c297063223ff831d54a8cd624 - [SYCL][NFC][New offload model][llvm-spirv] Refactor llvm-spirv options generation for enabling correct use under new offload model (#14253) -commit 3e474e050206e234759115a6442cdd5fb084d3f6 - [New offload model] Cleanup the way sycl-post-link options are generated (#14177) -commit fe2b47f08bee73be8bd978c71f3946852e94d790 - [SYCL][AOT][New offload model] Add AOT support in clang-linker-wrapper for Intel CPUs/GPUs (#14252) -commit 1c13e6f5e6bae9df42b483852c60631609422043 - [SYCL][ClangLinkerWrapper] Unconditionally pass -properties to sycl-post-link (#14541) - ### SYCL Library - Added support for JIT-compilation for AMD and NVIDIA backends. intel/llvm#14280 @@ -121,17 +94,23 @@ commit 1c13e6f5e6bae9df42b483852c60631609422043 - Added `filter_device` and `list_devices` APIs. intel/llvm#14016 - Added `funnelshift_*` APIs. intel/llvm#13825 - Added `match_[any|all]_over_sub_group` APIs. intel/llvm#12973 -- Added API to manage kernel libraries loading/unloading. intel/llvm#13053 +- Added API to manage kernel libraries loading/unloading. intel/llvm#13053 intel/llvm#13932 - Added `cmul_add` API. intel/llvm#12969 - Added experimental APIs for maksed operations over sub-groups (`select`, `shift`, etc.). 
intel/llvm#12972 - -commit https://github.com/intel/llvm/commit/e0d020a74fee74a1fcda97b9a9854ad07bde4eae - [SYCL][COMPAT] Added utility helpers to simplify code translation (#12970) - ??? - -commit https://github.com/intel/llvm/commit/4ade7b71db910a694e1da4d73495fd1903da1622 - [SYCL][COMPAT] Added support for multiple math ops (#13005) - ??? +- Added various helper APIs: a mechanism to extract arguments from a kernel and + its kernel parameters; type casting helper for generic address -> queue + pointer; a wrapper to provide better support for logical groups; an enum to + list supported group types. intel/llvm#12970 +- Added wrappers for math functions `clamp`, `isnan`, `cbrt`, `min`, + `max`, `fmin_nan`, `fmax_nan`, `pow`, `relu`; the wrappers support a wider + variety of argument type combinations than the `sycl::` counterparts + of those functions. intel/llvm#13005 +- Added `SYCLCOMPAT_CHECK_ERROR` macro which is an error handling utility for + expressions that throw exceptions. +- Added `image1d_max`, `image2d_max` and `image3d_max` device info getters + and setters. intel/llvm#13973 +- Added `get_major_version` and `get_minor_version` free functions. intel/llvm#14011 +- Expanded list of properties available through `device_info` class. intel/llvm#13050 ### Documentation @@ -142,8 +121,6 @@ commit https://github.com/intel/llvm/commit/4ade7b71db910a694e1da4d73495fd1903da - Added specification for [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 - Added specification for [`SPV_INTEL_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/spirv-extensions/SPV_INTEL_bindless_images.asciidoc) extension.
intel/llvm#12927 - - commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de187d [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) bugfix? @@ -300,6 +277,9 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 detected. intel/llvm#14077 - Clarified [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) extension to specify which exact exceptions are being thrown on errors. intel/llvm#14576 +- Introduced versioning and release process + [documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/syclcompat/README.md#versioning) + for SYCLcompat. intel/llvm#14457 commit https://github.com/intel/llvm/commit/ffc0de03f900da2d0262ea8ec41ac3847a1edbcc [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) @@ -322,19 +302,12 @@ commit https://github.com/intel/llvm/commit/1e2e6baaf86009f0f9067b1146a8ca792343 ### SYCLcompat - Added non-`const` `image2d_max` and `image3d_max` getters. intel/llvm#14138 - -commit https://github.com/intel/llvm/commit/17d2e2d8a483e7a4c33cd542a3c1381b767452bc - [SYCL][COMPAT] Add version & release process (#14457) - -commit https://github.com/intel/llvm/commit/d8c0a9342a7e71420883c8a750f89679897b9ca1 - [SYCL][COMPAT] Memory Header cleanup (#13143) - is it really user-visible? - -commit https://github.com/intel/llvm/commit/89eeb02519cfee2f1d88ffac9f07dd131099b7dd - [SYCL][COMPAT] defs.hpp update with Windows macros. SYCLCOMPAT_CHECK_ERROR added. (#13027) - -commit https://github.com/intel/llvm/commit/0b05577790f2a81cb10a41262324ff9558614f09 - [SYCL[COMPAT][CUDA] Impl masked compat shuffles on cuda (#13363) +- Introduced versioning scheme for the library.
intel/llvm#14457 +- Enhanced masked shuffle functions `select_from_sub_group`, + `shift_sub_group_left`, `shift_sub_group_right` and `permute_sub_group_by_xor` + to support CUDA devices. intel/llvm#13363 +- Restricted `memory_order` argument of `atomic_ref` passed to + `experimental::nd_range_barrier` to match what is supported on the device. intel/llvm#12974 intel/llvm#13641 commit https://github.com/intel/llvm/commit/13c9d0ef964b17dd3e2c297b1ceb2ecb8ea2ffe9 [SYCL][Bindless][Doc][ABI-Break] Rename external semaphore destroy to release (#14535) @@ -369,10 +342,6 @@ commit https://github.com/intel/llvm/commit/bcca7a80adf50b04c0991ef48745353ac782 [SYCL][ESIMD] Move a few math operations to SPIR-V intrinsics and support new functions (#13383) that is a regression, not an improvement :) should be noted in known issues -commit https://github.com/intel/llvm/commit/1d2007ba7c661322584a60d84a40777e0e0d9567 - [SYCL][COMPAT] kernel_function and kernel_library constexpr constructors (#13932) - if those APIs were added in this release, we should squash two items into one - commit https://github.com/intel/llvm/commit/74602458d5583cf69ca575a9167def51dad15052 [SYCL][Bindless] Replace 'image_channel_order' field in 'image_descriptor' with number of channels (#13745) @@ -389,9 +358,6 @@ commit https://github.com/intel/llvm/commit/82aaf27f6f0cf97ba89b58f88a18b09e2309 commit https://github.com/intel/llvm/commit/b11a19b1896cc2f7ab43735aacf265182e22832c [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) -commit https://github.com/intel/llvm/commit/9d1cbc51854f19f89105d502db9156b11e4507f4 - [SYCL][COMPAT] nd_range barriers seq_cst by default in supported devices (#12974) - commit https://github.com/intel/llvm/commit/3756fd1b778ae4ab36bd3988bfdf9ba910b779fd [ESIMD] Enable FADD/FSUB for slm_atomic_update (#13535) ???
@@ -429,9 +395,6 @@ commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd commit https://github.com/intel/llvm/commit/da379ecfa649a520f49f8adfb97e73c72ff3fb06 [SYCL] Add support for multiple missing math ops (#13714) -commit https://github.com/intel/llvm/commit/0f6f57b43afa7ee442744b89e7a034673d58c8d8 - [SYCL][Doc] Fix typos and formating of SYCLCompat README (#13961) - commit https://github.com/intel/llvm/commit/0d1dd2d2b1e8655b96940edecef84447866e87bc [SYCL] Add a module flag for device compilations (#13880) @@ -441,15 +404,9 @@ commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40 commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa6911 [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) -commit https://github.com/intel/llvm/commit/b49303c7e13ca0a69454eaaaeb8c3d094916218d - [SYCL][COMPAT] Add Image Max dims to device_info. Updated Max ND Range Size (#13973) - commit https://github.com/intel/llvm/commit/db54535fb389331b167807a5d8f1ed16b5695474 [AMDGPU][SYCL] Make unsafe atomic fadd opt in (#13955) -commit https://github.com/intel/llvm/commit/dce651bd69ea12c935c70990ed3290007a00c6c5 - [SYCL][COMPAT] Migrate bug fixes & refactor of get_*version APIs (#14011) - commit https://github.com/intel/llvm/commit/4e36825beabb4b4a7435470ac633768dcbd7b376 [SYCL] Record aspect names when computing device requirements (#13974) likely not user-visible and needs to be merged with other optional kernel features AOT items @@ -566,9 +523,6 @@ commit https://github.com/intel/llvm/commit/0a1381d286f7c32a256a6dab49917870769f commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a9267a8d [SYCL][Graph] Clarify graph in-order and out-of-order properties (#13681) -commit https://github.com/intel/llvm/commit/67f3bf292ff58136adc6383a3c5a1b19779e4120 - [SYCL][COMPAT] Removed sycl/sycl.hpp include (#13108) - commit 
https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) @@ -623,9 +577,6 @@ commit https://github.com/intel/llvm/commit/65bdffb1c9d4c474316d3e330fc3c59338e0 [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) new feature? -commit https://github.com/intel/llvm/commit/d932fcae4aa83d12a3eb30a3003d1718429a9df1 - [SYCL][COMPAT] Extended device_info properties. (#13050) - commit https://github.com/intel/llvm/commit/05644a470303c2af3385b9533b8d23ebdea99eb7 [OpenCL] Config dependent-load flag to exclude CWD from DLL search path (#13327) do we report security issues? @@ -805,6 +756,9 @@ commit https://github.com/intel/llvm/commit/e38dcdc8bb547f4b63c7b860c1cd9948c090 - Fixed a bug where using `load_2d`, `store_2d` or `prefetch_2d` ESIMD functions would lead to a build program failure error emitted by device compiler. intel/llvm#13613 +- Fixed a bug where querying free device memory of integrated Intel GPUs would + return 0 instead of throwing an exception that the feature is not supported + for that device. intel/llvm#13209 commit https://github.com/intel/llvm/commit/c2a4054980fd4ce4c4d6cfa425cbc71b20d5f450 [SYCL] Fix barrier with wait list handling (#13863) @@ -839,22 +793,9 @@ commit https://github.com/intel/llvm/commit/33325d4af0b66c33f7a42f0bf584645972a7 ### SYCLcompat - Fixed compilation issue on Windows with `syclcompat::cabs`. intel/llvm#13518 - -commit https://github.com/intel/llvm/commit/29230c80d117e29ba113ac8522b6fd2946ac56f9 - [SYCL][COMPAT] Fix using address of a temporary queue_ptr in util.hpp (#14440) - Was it user-visible? 
- -commit https://github.com/intel/llvm/commit/510965a0a098313cc19e8a68cc405098dc9e9501 - [SYCL][COMPAT] fixed byte-dot products to properly call cuda intrinsics (#14463) - -commit https://github.com/intel/llvm/commit/3caa78ecf53644ead4f1d5fa8bc7b4a81a1f4961 - [SYCL][COMPAT] Fixes SYCLCOMPAT_PROFILING_ENABLED codepath (#14574) - -commit https://github.com/intel/llvm/commit/7b538cdc4ecb33e88682eb1b36be33b73ac68caf - [SYCL][COMPAT] fixed atomic_compare_exchange_strong not using addressSpace template parameter (#13821) - -commit https://github.com/intel/llvm/commit/d66b0baed24483da96fa135082e7c544498ce2d9 - [SYCL][COMPAT] Add inline in max and min functions (#13708) +- Fixed `atomic_compare_exchange_strong` not using address space template + parameter. intel/llvm#13821 +- Fixed compilation issues when `SYCL_COMPAT_PROFILING_ENABLED` is defined. intel/llvm#14574 commit https://github.com/intel/llvm/commit/e40283b1234e0846d1a19be537948e865a31f360 Task sequence revert (#14359) @@ -887,10 +828,6 @@ commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b09 commit https://github.com/intel/llvm/commit/ea7ba1b965302277fc23ef48dba83b10e6c734e9 [ESIMD] Restore the lowering of lsc_load_stateless in sycl-post-link (#13104) -commit https://github.com/intel/llvm/commit/2053be298d1bf2417ad0b2efaf0d9360650ed491 - [SYCL][COMPAT] Reverted nd_barrier atomic_ref to acq_rel (NVPTX) (#13641) - do we ned to mention this? do we need to drop some other item? - commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b93842bc5 [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) ??? @@ -1858,17 +1795,6 @@ Date: Tue Apr 9 17:47:00 2024 +0100 OpenCL adapter changes from https://github.com/oneapi-src/unified-runtime/pull/1496. -commit https://github.com/intel/llvm/commit/c74a14414b1ae8070421ee07b037bd8e9b1e704a -Author: Neil R. 
Spruit -Date: Mon Apr 8 01:54:49 2024 -0700 - - [UR][L0] Fix DeviceInfo global mem free to report unsupported given MemCount==0 (#13209) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1486 - - Signed-off-by: Neil R. Spruit - commit https://github.com/intel/llvm/commit/d86a50045bbbe488869991be49cbfe3213809d72 [UR][CL] Atomic order memory capability for Intel FPGA driver (#13041) Potentially user-visible fix. @@ -1878,14 +1804,6 @@ commit https://github.com/intel/llvm/commit/2e2010e2cc4acf1375cf88ce65d3a5cb8cbc Does it fix any actual issues in some negative cases where we previously reported a wrong error if device is not available? -commit https://github.com/intel/llvm/commit/3288a66d48d5aee7412ad12118794f28e6634550 -Author: aarongreig -Date: Mon Apr 1 10:04:11 2024 +0100 - - [UR] Pull in UR changes to add exec error status to events. (#13127) - - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1467 - commit https://github.com/intel/llvm/commit/93a1abb42f352eff587cd1a081e90089c232339b Author: Piotr Balcer Date: Wed Mar 27 12:11:36 2024 +0100 @@ -1915,15 +1833,9 @@ Date: Thu Mar 21 10:28:46 2024 +0000 commit https://github.com/intel/llvm/commit/43f096308b03fa4c5a7f6845461a133d6cfaceae Author: Hugh Delaney -Date: Wed Mar 20 07:04:37 2024 +0000 - [UR] CI for UR PR refactor-guess-local-worksize (#12663) - https://github.com/oneapi-src/unified-runtime/pull/1326 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) + Could be a bugfix?
commit https://github.com/intel/llvm/commit/1f9bf7a731b16d6d0d017c35245991ca95d0aef7 Author: Artur Gainullin From f043cdb4263d130752a17b5569619a2d0ba52756 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 5 Sep 2024 08:23:59 -0700 Subject: [PATCH 06/30] Further updates --- sycl/ReleaseNotes.md | 195 +++++++++++++++---------------------------- 1 file changed, 69 insertions(+), 126 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index eab0b29f5731..2ba9ad3373ae 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -51,7 +51,9 @@ Release notes for commit range - Implemented [`sycl_ext_oneapi_profiling_tag`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_profiling_tag.asciidoc) extension. intel/llvm#12838 - Implemented [`sycl_ext_oneapi_forward_progress`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_forward_progress.asciidoc) extension. intel/llvm#13389 - Implemented [`sycl_ext_oneapi_private_alloca`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc) extension. intel/llvm#12966 intel/llvm#13490 intel/llvm#13181 -- Implemented [`sycl_ext_oneapi_enqueue_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc) extension. intel/llvm#13512 +- Implemented + [`sycl_ext_oneapi_enqueue_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc) + extension. intel/llvm#13512 intel/llvm#13924 - Added support for `get_backend_info` API into various SYCL classes (`platform`, `context`, etc.). 
intel/llvm#12906 - Implemented [`sycl_ext_oneapi_group_load_store`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_group_load_store.asciidoc). Please note that the implementation is naive and does not expose any special @@ -73,6 +75,8 @@ Release notes for commit range - using `-fsycl-dead-args-optimization` (ON by default) can lead to failures - `info::kernel::num_args` won't return the right result for free function kernels +- Added experimental ESIMD function `fma` which guarantees that a fused + multiply-add operation is performed. intel/llvm#13366 ### SYCLcompat library @@ -181,6 +185,23 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 - Enhanced address sanitizer to detect incorrect uses of USM deallocation functions (like calling `sycl::free` on a pointer that was not allocated as a USM pointer). intel/llvm#12882 +- Enhanced `-fintelfpga` flag: when it is used together with `-fp-model=fast` + it also implies that `-vpfp-relaxed` will be passed to backend (device) + compiler. intel/llvm#13651 +- Implemented support for `memory_order::seq_cst` on CUDA backend, resolving + intel/llvm#11208. intel/llvm#12516 +- Fixed a bug where `shift_group_[right|left]`, `permute_by_xor` and + `select_from_group` algorithms would return invalid values if used with + `half` data type on AMD devices. intel/llvm#13016 +- Implementation of optional kernel features mechanism has been extended to also + support AOT compilation if so-called "special" targets are passed to + `-fsycl-targets` (see corresponding + [documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/UsersManual.md#generic-options)). + Please note that this functionality relies on the compiler knowing which + targets support which optional kernel features and that database is not yet + fully complete.
In particular, data for Lunar Lake and Battlemage Intel GPUs + is still missing. intel/llvm#14590 intel/llvm#14188 intel/llvm#12727 + intel/llvm#14757 intel/llvm#13486 intel/llvm#13974 ### SYCL Library @@ -228,7 +249,7 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 extension to return `unknown` enumerator on an unsupported HW. intel/llvm#14190 - Extended list of known Intel GPU architectures available through [`sycl_ext_oneapi_device_architecture`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_device_architecture.asciidoc) - extension. intel/llvm#13520 + extension. intel/llvm#13520 intel/llvm#14582 - Extended mechanism to clear in-memory cache in heavy tests to also work on `opencl` backend. intel/llvm#14119 - Moved bit shift and rotate ESIMD functions out of `experimental` namespace. @@ -242,6 +263,19 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 - Updated [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) extension implementation to throw synchronous exception with `feature_not_supported` error code. intel/llvm#14788 +- Reduced startup overhead on `libsycl.so` loading by outlining SYCL JIT + compiler (used for kernel fusion feature) into a standalone library which is + dynamically loaded on the first use. intel/llvm#13433 +- Deprecated `this_kernel::get_root_group` in favor of + `this_work_item::get_root_group`. intel/llvm#13304 +- Relaxed diagnostic about using virtual functions in SYCL kernels: now it is + only emitted if a call is performed using virtual call mechanism, but it is not + emitted for non-virtual calls of virtual functions. See also + KhronosGroup/SYCL-Docs#565. intel/llvm#114051 intel/llvm#14141 +- ESIMD API `inv` was extended to support `double` arguments.
intel/llvm#13838 +- Enhanced validation (via `static_assert` mechanism) of template arguments of + ESIMD `rdregion` and `wrregion` APIs. intel/llvm#13158 + commit https://github.com/intel/llvm/commit/c5b174d8507cad1328b3121e650120e85f1da213 [SYCL] Implement latest version of sycl_ext_oneapi_free_function_queries (#13257) @@ -260,7 +294,9 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support cubemap images. intel/llvm#12996 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support sampled image arrays. intel/llvm#14237 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support default-construction of `image_descriptor`. intel/llvm#13781 -- Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) to list restictions for `atomic_update` functions. intel/llvm#13202 +- Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) + to list restrictions for `atomic_update`, `gather` and `scatter` functions.
+ intel/llvm#13202 intel/llvm#13196 - Updated [`sycl_ext_oneapi_bfloat16_math_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bfloat16_math_functions.asciidoc) extension to support vectors of `bfloat16` to be passed to math functions. intel/llvm#14002 - Updated [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) @@ -280,6 +316,15 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 - Introduced versioning and release process [documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/syclcompat/README.md#versioning) for SYCLcompat. intel/llvm#14457 +- Extended our + [contribution guidelines](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/developer/ContributeToDPCPP.md#unified-runtime-updates) + to document the update process for the Unified Runtime component. +- Updated + [`sycl_ext_oneapi_work_group_static`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_work_group_static.asciidoc) + extension (it is not `work_group_specific` anymore).
intel/llvm#14271 + +commit https://github.com/intel/llvm/commit/0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 + [SYCL][Joint matrix] clarify the range of the prefetch templated arguments (#13796) commit https://github.com/intel/llvm/commit/ffc0de03f900da2d0262ea8ec41ac3847a1edbcc [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) @@ -373,6 +418,15 @@ commit https://github.com/intel/llvm/commit/89132855d4312536f5f40792194b6251d4cd commit https://github.com/intel/llvm/commit/8847c110c78684a86ec7e62d7255f1bb9c6efd4f [SYCL][NATIVECPU][libclc]Mark opencl_c_generic_address_space as unsupported on Native CPU (#13109) +commit https://github.com/intel/llvm/commit/a25d27bc9fbb2925519e966b9e7043be04274b27 + [SYCL][NATIVECPU][LIBCLC] Implement missing builtins for half type (#13829) +commit https://github.com/intel/llvm/commit/47a03418ac74f3a5492213afc192569eae1393ec + [SYCL][LIBCLC][NATIVECPU] Add aarch64 target triple for Native CPU (#13911) +commit https://github.com/intel/llvm/commit/0ce40f46ef4e2f5e8eed75e28352a90c9b8ecbaf + [SYCL] [NATIVECPU] Implement generic atomic store for generic target (#13428) +commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b93842bc5 + [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) + Waiting for feedback from Pietro on these five. 
commit https://github.com/intel/llvm/commit/07e3bcf9f3be46234deb471e25d94b5692353688 [SYCL][ESIMD] Use LSC for unsupported surface index block stores (#13150) @@ -386,18 +440,12 @@ commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d1 commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc52128da1 [SYCL] Add -fPIC for Intel math function host code (#13800) -commit https://github.com/intel/llvm/commit/434d5edfae78307969ade6764e5bafeb17ce5073 - [SYCL] Remove redundant detail::empty_properties_t (#13777) - commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd95a6 Implement missing overloads for generic AS in generic target (#13938) commit https://github.com/intel/llvm/commit/da379ecfa649a520f49f8adfb97e73c72ff3fb06 [SYCL] Add support for multiple missing math ops (#13714) -commit https://github.com/intel/llvm/commit/0d1dd2d2b1e8655b96940edecef84447866e87bc - [SYCL] Add a module flag for device compilations (#13880) - commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 [SYCL][CUDA] Enable support of msvc math functions for nvptx target. 
(#14007) @@ -407,10 +455,6 @@ commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa commit https://github.com/intel/llvm/commit/db54535fb389331b167807a5d8f1ed16b5695474 [AMDGPU][SYCL] Make unsafe atomic fadd opt in (#13955) -commit https://github.com/intel/llvm/commit/4e36825beabb4b4a7435470ac633768dcbd7b376 - [SYCL] Record aspect names when computing device requirements (#13974) - likely not user-visible and needs to be merged with other optional kernel features AOT items - commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c [SYCL][Graph] fix the address pointer in graph print (#13595) @@ -420,10 +464,6 @@ commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b commit https://github.com/intel/llvm/commit/e7defabdcc3d5b460cfc593822156836b874f092 [SYCL] Use `std::array` as storage for `sycl::vec` on device (#14130) -commit https://github.com/intel/llvm/commit/0e24ac5677d8d91aed2fcc72d52d9d6b40f5985a -commit https://github.com/intel/llvm/commit/ea2111c1a022a1bd7a818ef9796d70d22f3b92d0 - [SYCL] Re-implement diagnostics about virtual calls (#14141) - commit https://github.com/intel/llvm/commit/c2ebf84fd7ffcc8f40dd9eef2aed163437792cd5 [SYCL] Make `vec` conversion operator to scalar non-template (#14668) @@ -436,33 +476,28 @@ commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da8165992312241 commit https://github.com/intel/llvm/commit/9e4768ca9849e7188221c0e2894282730e3b1bde [SYCL][libclc] Add generic addrspace overloads of math builtins (#13015) +commit https://github.com/intel/llvm/commit/51ffc04f0f317e0395c678e1fecd654df51db955 + [SYCL][libclc] Add generic addrspace overloads of vload/vstore builtins (#13092) +commit https://github.com/intel/llvm/commit/75300ab1ceee835e07086925d990f74107a84a1d + [SYCL][libclc] Add generic fp16 math builtins for generic SPIR-V target (#13361) +commit https://github.com/intel/llvm/commit/bdaf1e27310dc2218a95f05731a422a32ea5a658 + [libclc] 
Separate out generic AS support macros (#13792) +commit https://github.com/intel/llvm/commit/628ede6edf2448c531bea7f818dc6819d9e7393f + [libclc] Fix UB in double->int conversions (#13546) + Waiting for feedback from frasercrmk on these commit https://github.com/intel/llvm/commit/183832b9cebd471586c0ed251876972939442327 [SYCL][PI] Add PI_ERROR_UNSUPPORTED_FEATURE error code (#13036) -commit https://github.com/intel/llvm/commit/c1e2957be8db95425f1c17df258a0830c83dcf47 - [CUDA][LIBCLC] Implement RC11 seq_cst for PTX6.0 (#12516) - -commit https://github.com/intel/llvm/commit/73be194fc27cd20968c264afdb71befc181d51ec - [SYCL] Add support for optional kernel features in AOT x86_64 compilation (#14590) -commit https://github.com/intel/llvm/commit/f51e43b2f0616934116626dc48c83282a84090ce - [SYCL] Add more aspect information for intel_gpu_* in device config file (#14188) - commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) -commit https://github.com/intel/llvm/commit/3561c9bb854d35eeb9fc4da3550334faaf316a4f - [SYCL] Add support of more Intel GPU arch versions to sycl_ext_oneapi_device_architecture (#14582) - commit https://github.com/intel/llvm/commit/e51002c81cdf32f383104907cca820e4ed3452ba [SYCL] Enable intel joint matrix on GNR. 
(#14436) commit https://github.com/intel/llvm/commit/21c2e1c2213171d12acb5e6c41a713db30a0d5d4 [SYCL] Make swizzle mutating operators const friends (#13012) -commit https://github.com/intel/llvm/commit/da02e023e60d89824aad440c4f7bb558e70501a4 - [SYCL] Workaround for seg fault in `vec::convert<>` for OpenCL CPU at O0 (#14498) - commit https://github.com/intel/llvm/commit/17ee3e24e2874690f7526dcda9d8bc4679fe7edc [SYCL][NATIVECPU] Add device library and initial subgroup support (#13979) @@ -472,29 +507,15 @@ commit https://github.com/intel/llvm/commit/0b9fc099f63feadb5e476c5862de3d8fa977 commit https://github.com/intel/llvm/commit/0ccb0b7d3dd614707f82ea8f99790e2d3b08496d [SYCL][ABI-Break] Improve Queue fill (#13788) -commit https://github.com/intel/llvm/commit/005622d177c9a17dc9defefd507921daf7affc28 - [SYCL][Doc] Update work-group-specific extension (#14271) - commit https://github.com/intel/llvm/commit/93fef86cd4fb8e18c126365c404eea1ed0f1a7fa [SYCL][Graph] Permit empty & barrier nodes in WGU (#14236) -commit https://github.com/intel/llvm/commit/02ac8a414c1fd9b209d139c100cd1bbeae3729d2 - [SYCL][LIBCLC][NATIVECPU] Add checks for fp16 and fp64 in Native CPU libclc (#14242) -commit https://github.com/intel/llvm/commit/a25d27bc9fbb2925519e966b9e7043be04274b27 - [SYCL][NATIVECPU][LIBCLC] Implement missing builtins for half type (#13829) - commit https://github.com/intel/llvm/commit/4151c799ef36f2912fab3f6b9e305240ef4ff327 [SYCL][Graph] Wait instead of flush dep events in update command (#14167) -commit https://github.com/intel/llvm/commit/47a03418ac74f3a5492213afc192569eae1393ec - [SYCL][LIBCLC][NATIVECPU] Add aarch64 target triple for Native CPU (#13911) - commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 [SYCL] Disable in-order queue barrier optimization while profiling (#14123) -commit https://github.com/intel/llvm/commit/e34b7fffedbe9ff73d41b172eb48c189170f99f9 - [Doc] Document Unified Runtime update process (#14097) - commit 
https://github.com/intel/llvm/commit/2e1f14adb3bf6d9e9c55e4b0ced9e1ece2172a4a [SYCL] Fix UB and alignment issues in the SYCL default sorter (#13975) @@ -504,15 +525,6 @@ commit https://github.com/intel/llvm/commit/4222b4ccd6dc499248c8bf026bcdd0f20700 commit https://github.com/intel/llvm/commit/5b6cc5eb7bb2106ff426815702d89569e166c4f9 [SYCL][Matrix] Enable SPV_KHR_cooperative_matrix extension (#13923) -commit https://github.com/intel/llvm/commit/03b994ead80bb381d59b1390f255119b8d211a1f - [SYCL] Add code location information to enqueue free functions (#13924) - -commit https://github.com/intel/llvm/commit/58382507f0c7bd8a5c21e3b7e1d3360f0835f26a - [ESIMD]Add support for double data type to inv API (#13838) - -commit https://github.com/intel/llvm/commit/0ce40f46ef4e2f5e8eed75e28352a90c9b8ecbaf - [SYCL] [NATIVECPU] Implement generic atomic store for generic target (#13428) - commit https://github.com/intel/llvm/commit/ccca3b73769bfd8a27eff9956630fe86a2e4832d [SYCL] Optimize SG group_store via BlockWriteINTEL in simple cases (#13734) commit https://github.com/intel/llvm/commit/48a0ff5b4b5bc21dedab37380c4ac93676277f91 @@ -526,49 +538,15 @@ commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a926 commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) -commit https://github.com/intel/llvm/commit/771ffa4e967f3058c500c87297c2d1a7be156a9b - [SYCL] Remove get_child_group() (#13482) - -commit https://github.com/intel/llvm/commit/e1119d9d2753dc9165e10c2e8c11e222cc549ba9 - [SYCL][ESIMD] Add more compile time checks to rdregion and wrregion API (#13158) - commit https://github.com/intel/llvm/commit/ef6d2bb3caf36eaa1149369f8aee1578d6e31a6e [SYCL][ESIMD] Add support for transposed prefetch for 1/2 byte elements (#13452) -commit https://github.com/intel/llvm/commit/5a07640e1ce68584a60b1a0450526e928340d1e0 - [SYCL][ESIMD] Add native FMA function (#13366) - 
commit https://github.com/intel/llvm/commit/0c0b58686a79c8d9a8ef547a96b5c1642480e591 [XPTI][INFRA] Sample E2E data collection timing test for XPTI (#13045) -commit https://github.com/intel/llvm/commit/24699750a7f816b7ad4ebe19342210693e20a9f3 - [ESIMD][NFC][DOC] Add 'restrictions' section to gather/scatter() doc (#13196) - -commit https://github.com/intel/llvm/commit/8867d446c360048b62064828693f4d50c945a55c - [spir-v][clang] Allow spirv32/spirv64 as target triples for sycl offloading (#13083) -commit https://github.com/intel/llvm/commit/b8f394203ec4436ddd31f72193c4c1a52e3747df - [SYCL] Fix device libraries and SYCL headers with spirv64 target (#13288) -commit https://github.com/intel/llvm/commit/8bc909e01ece4e177ae25168995be21f0d37abc6 - [SYCL][libdevice] Build for spirv64 on Linux (#13302) -commit https://github.com/intel/llvm/commit/363fceff578dcfa5a488b89f71f259da80aad2d7 - [SYCL][ESIMD] Don't override target triple to genx64 (#13445) -commit https://github.com/intel/llvm/commit/9bb2b343de3308994892961b0b48838ce7f2e91d - [SYCL][ClangLinkerWrapper] Fix SYCL binary creation with spirv64 triple (#14686) -commit https://github.com/intel/llvm/commit/f8926a63ce5a1634cb0533f4ab8eab2b6898caac - [SYCL][Libdevice] Build for spirv64 on Windows (#13649) -commit https://github.com/intel/llvm/commit/d0744751abe535c1470ca8833d5dd3b3d1a72c6b - [SPIR-V][Headers] Enable programs that include system headers on Windows for SPIRV32 and SPIRV64 targets (#13548) - commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) -commit https://github.com/intel/llvm/commit/c821dc934dc7934b0209b5d3f88a280bbaa7145c - [SYCL] Add support for multiple filtered outputs in sycl-post-link (#12727) - merge with other optional kernel features AOT improvements - -commit https://github.com/intel/llvm/commit/3ea29b2a9028b485b76339e16754e3e74c9cc7a6 - [SYCL] Update root_group extension to use 
`this_work_item` namespace (#13304) - commit https://github.com/intel/llvm/commit/fb66f1b83559366e541381251de4281bb554613d [SYCL] Replace __builtin_bit_cast with sycl::bit_cast in imf headers (#13313) is it a bugfix? @@ -585,9 +563,6 @@ commit https://github.com/intel/llvm/commit/e9befa2d10f6c23a66ac780df7a1ddda5527 [SYCL][DebugInfo] Switch to nonsemantic-shader-200 for non-FPGA HW on linux (#13107) do we need to mention it? -commit https://github.com/intel/llvm/commit/2a1002b9fac9c4b878c6625c3cfafa61dea07ea2 - [SYCL][JIT] Load SYCL JIT lazily (#13433) - commit https://github.com/intel/llvm/commit/4f5a5f0fba71593888f1737e0b4dbaf49c85e04b [SYCL] Fix WA for ocl query of CL_DEVICE_PROFILE (#13584) @@ -609,9 +584,6 @@ commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce [Bindless][Exp] Add texture fetch functionality (#12447) or a new feature? -commit https://github.com/intel/llvm/commit/fbd10436a5911b12b8d77ba50397a24e6905e7a3 - [Driver][SYCL]Adding 'aoc -vpfp-relaxed' with -fintelfpga and -fp-model=fast (#13651) - commit https://github.com/intel/llvm/commit/8993f3fc55489023603ceafa631e8f19824979b3 [SYCL][ESIMD] Use old intrinsic for named_barrier_signal for now (#13255) does it revert the patch below? @@ -619,25 +591,10 @@ commit https://github.com/intel/llvm/commit/d4a9254d764a0ff0be8514a6854afda833a2 [SYCL][ESIMD] Use intrinsic for named_barrier_signal (#12982) ??? -commit https://github.com/intel/llvm/commit/51ffc04f0f317e0395c678e1fecd654df51db955 - [SYCL][libclc] Add generic addrspace overloads of vload/vstore builtins (#13092) - ???? - -commit https://github.com/intel/llvm/commit/75300ab1ceee835e07086925d990f74107a84a1d - [SYCL][libclc] Add generic fp16 math builtins for generic SPIR-V target (#13361) - ??? - commit https://github.com/intel/llvm/commit/7271d613156f2268d538f20d92ecd52b1fbc488f [SYCL][Docs] Add deprecation notice to SPV_INTEL_global_variable_decorations (#13772) do we really need to mention SPIR-V specs? 
-commit https://github.com/intel/llvm/commit/0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 - [SYCL][Joint matrix] clarify the range of the prefetch templated arguments (#13796) - -commit https://github.com/intel/llvm/commit/bdaf1e27310dc2218a95f05731a422a32ea5a658 - [libclc] Separate out generic AS support macros (#13792) - ???? - commit https://github.com/intel/llvm/commit/24a6b3b2f2d2a160a737fb1162c78f4cce9a8f1d [SYCL] Generate imported symbol files in sycl-post-link (#14189) commit https://github.com/intel/llvm/commit/62ea97e34e9245fb50f5718861da06e5e4425c2e @@ -656,10 +613,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad [DeviceSanitizer] Disable handling no return calls (#14652) // bugfix? -commit https://github.com/intel/llvm/commit/e38dcdc8bb547f4b63c7b860c1cd9948c090ffc8 - [SYCL] Add compile target to device image properties (#14757) - Not user visible, need to merged with other optional kernel features AOT patches - ## Bug Fixes ### SYCL Compiler @@ -828,9 +781,6 @@ commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b09 commit https://github.com/intel/llvm/commit/ea7ba1b965302277fc23ef48dba83b10e6c734e9 [ESIMD] Restore the lowering of lsc_load_stateless in sycl-post-link (#13104) -commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b93842bc5 - [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) - ??? commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728e66b [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) ??? 
@@ -868,9 +818,6 @@ commit https://github.com/intel/llvm/commit/bcf7d4df6acf33a75c195215afad78113d14 commit https://github.com/intel/llvm/commit/d6340b67391cd8e9e4c7775a3c1ada8f2755bb06 [SYCL][Graph] in-order queue barrier fix (#13193) -commit https://github.com/intel/llvm/commit/628ede6edf2448c531bea7f818dc6819d9e7393f - [libclc] Fix UB in double->int conversions (#13546) - commit https://github.com/intel/llvm/commit/6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a [SYCL] Fix Level-Zero's `sycl::make_device` interop (#13483) @@ -883,10 +830,6 @@ commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfeb commit https://github.com/intel/llvm/commit/6934bcfb13415dc5bda85876b5cfc361678523f4 [SYCL] Do not attach reqd_work_group_size info when multiple are detected (#13523) -commit https://github.com/intel/llvm/commit/563904b2aebb791adf0e1ad955a43e226c9a6caf - [SYCL] Add aspect names to sycl_used_aspects before cleaning up (#13486) - part of optional kernel features AOT? - commit https://github.com/intel/llvm/commit/82f77d10dd092ea419115f61a7715655f055b7bb [SYCL][Graph] Fix queue recording barrier to different graphs (#14212) @@ -905,9 +848,6 @@ commit https://github.com/intel/llvm/commit/5ad97902643da043233ec21ac203cca329df commit https://github.com/intel/llvm/commit/daaece06ce68544eaae078899c559f571297d8c0 [SYCL][Graph] Fix access modes not being respected (#13011) -commit https://github.com/intel/llvm/commit/b13a3c4c39a356c47cda983350f06000330a42f1 - [libclc][hip] Fix half shuffles and reenable reduction test (#13016) - commit https://github.com/intel/llvm/commit/0360e6af2a353210d508633a60ff02327094f7e7 [SYCL] Follow up fixes for group_sort extension (#14591) @@ -978,6 +918,9 @@ of some classes to use so-called preview implementation. intel/llvm#13257 - Moved `slm_allocator` ESIMD APIs into `experimental` namespace. intel/llvm#13901 - Removed deprecated `usm_system_allocator` aspect. 
intel/llvm#13279 +- Removed `get_child_group` API from experimental + [`sycl_ext_oneapi_root_group`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_root_group.asciidoc) + extension. intel/llvm#13482 Breaking changes were also made to compiler flags: From 1343ce8c7d0013f615c6eaf494bff5e731ac89ab Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 9 Sep 2024 05:34:42 -0700 Subject: [PATCH 07/30] Further updates --- sycl/ReleaseNotes.md | 169 ++++++++++++++----------------------------- 1 file changed, 53 insertions(+), 116 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 2ba9ad3373ae..c240d2d96361 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -53,7 +53,7 @@ Release notes for commit range - Implemented [`sycl_ext_oneapi_private_alloca`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_private_alloca.asciidoc) extension. intel/llvm#12966 intel/llvm#13490 intel/llvm#13181 - Implemented [`sycl_ext_oneapi_enqueue_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_enqueue_functions.asciidoc) - extension. intel/llvm#13512 intel/llvm#13924 + extension. intel/llvm#13512 intel/llvm#13924 intel/llvm#14743 - Added support for `get_backend_info` API into various SYCL classes (`platform`, `context`, etc.). intel/llvm#12906 - Implemented [`sycl_ext_oneapi_group_load_store`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_group_load_store.asciidoc). 
Please note that the implementation is naive and does not expose any special @@ -154,6 +154,9 @@ commit https://github.com/intel/llvm/commit/d06724a7c304d393500b7edbb84f5c7e59f6 [SYCL][Graph] Specify API for explicit update using indices (#12486) commit https://github.com/intel/llvm/commit/2bc8b5bc8cbc44cf8ef1deb095c10450348904d8 [SYCL][Graph] Implementation of explicit update with indices (#12840) +commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfebf0da + [SYCL][Graph] Export missing graph node symbols (#13744) + bugfix for the above commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d98acc [ESIMD][NFC][DOC] Add load/store/prefetch_2d functions, L1/L2 hint combinations(#13218) @@ -165,8 +168,10 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 - Improved compilation flow around integration footer when no 3rd-party host compiler is used. New compilation flow creates fewer temporary files and therefore should result in a slightly faster compilation. intel/llvm#13607 intel/llvm#14402 -- Added support for `truncf`, `sinpif`, `rsqrtf` and `exp10f` functions in SYCL - kernels are part of [C-CXX-StandardLibrary](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/C-CXX-StandardLibrary.rst) extension. intel/llvm#14132 +- Added support for `truncf`, `sinpif`, `rsqrtf`, `exp10f`, `ceilf`, + `copysignf`, `cospif`, `fmaxf` and `fminf` functions in SYCL kernels as part of + [C-CXX-StandardLibrary](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/C-CXX-StandardLibrary.rst) + extension. intel/llvm#14132 intel/llvm#13714 - Added support for more IMF functions as part of [C-CXX-StandardLibrary](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/C-CXX-StandardLibrary.rst) extension.
intel/llvm#13786 - Added `-fsystem-debug` command line option to complement existing `-fno-system-debug`. intel/llvm#13256 @@ -228,7 +233,8 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 SYCL 1.2.1 exception sub-classes everywhere. intel/llvm#14484 intel/llvm#14545 intel/llvm#14520 intel/llvm#14485 intel/llvm#14510 intel/llvm#14483 intel/llvm#14487 intel/llvm#14488 -- Added support for `sycl::vec::convert` to/from `vec`. intel/llvm#14105 +- Added support for `sycl::vec::convert` to/from `vec`. + intel/llvm#14105 intel/llvm#14085 - Deprecated `marray::operator++/--`. intel/llvm#13443 - Deprecated `accessor::get_multi_ptr` for non-device accessors. intel/llvm#13443 - Moved ESIMD named barrier APIs out of `experimental` namespace. intel/llvm#13704 @@ -275,6 +281,12 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 - ESIMD API `inv` was extended to support `double` arguments. intel/llvm#13838 - Enhanced validation (via `static_assert` mechanism) of template arguments of ESIMD `rdregion` and `wrregion` APIs. intel/llvm#13158 +- Aligned mutating swizzle operators with the SYCL 2020 specification by making + them `friend`s instead of member functions. intel/llvm#13012 +- Aligned `vec` conversion operator to a scalar with the SYCL 2020 specification + by making it a non-template. intel/llvm#14668 +- Removed deprecation warnings from math built-ins that accept raw pointers to + align with the SYCL 2020 spec changes.
intel/llvm#13238 intel/llvm#13893 commit https://github.com/intel/llvm/commit/c5b174d8507cad1328b3121e650120e85f1da213 @@ -360,7 +372,6 @@ commit https://github.com/intel/llvm/commit/13c9d0ef964b17dd3e2c297b1ceb2ecb8ea2 commit https://github.com/intel/llvm/commit/fb561b9f336f8f9c286a1125631dedf1b5fb1e4b [SYCL][Bindless][Doc][ABI-Break] Add const qualifiers to copies (#14140) - commit https://github.com/intel/llvm/commit/0eeae2ac96ea179099dd5d57c241260ccfe65f73 [SYCL][Graph] Update design doc for copy optimization and add test (#13051) @@ -368,18 +379,10 @@ commit https://github.com/intel/llvm/commit/4acca904c0e07fd6b504f7938f539bc1a0e9 [CLC][AMDGPU] Refactor fence helper to process order semantic explicitly (#12872) ??? -commit https://github.com/intel/llvm/commit/13a7b3ad2f229099fe964016f591f17a66b0ea15 - [SYCL] [libdevice] Add vector overloads of ConvertBFloat16ToFINTEL and ConvertFToBFloat16INTEL (#14085) - commit https://github.com/intel/llvm/commit/0dcad16c36f27e6254e7b831faaad8c6e07f8cfb [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) bugfix?? -commit https://github.com/intel/llvm/commit/af65855fa6b6df0eded078bd3dbe3bf4a6a2b2e3 - [SYCL][ESIMD]Replace use of intrinsics with spirv functions (#13553) - do we even need to mention this? 
-commit https://github.com/intel/llvm/commit/990b1d1ba053d60a803ae5e750803ae6583119f9 - [ESIMD]Replace use of vc intrinsic with spirv extension for rdtsc API (#13536) commit https://github.com/intel/llvm/commit/1f1be9c642889b7c0fd045b073d411e544dc6007 [SYCL][ESIMD] Move fmax to SPIR-V intrinsic (#14020) this one is also problematic @@ -443,9 +446,6 @@ commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc5212 commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd95a6 Implement missing overloads for generic AS in generic target (#13938) -commit https://github.com/intel/llvm/commit/da379ecfa649a520f49f8adfb97e73c72ff3fb06 - [SYCL] Add support for multiple missing math ops (#13714) - commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 [SYCL][CUDA] Enable support of msvc math functions for nvptx target. (#14007) @@ -461,12 +461,6 @@ commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946d commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) -commit https://github.com/intel/llvm/commit/e7defabdcc3d5b460cfc593822156836b874f092 - [SYCL] Use `std::array` as storage for `sycl::vec` on device (#14130) - -commit https://github.com/intel/llvm/commit/c2ebf84fd7ffcc8f40dd9eef2aed163437792cd5 - [SYCL] Make `vec` conversion operator to scalar non-template (#14668) - commit https://github.com/intel/llvm/commit/3fdfbfed1ed0062b9f3848a100093b340183c6a3 [SYCL][NATIVECPU] Support reqd_work_group_size on Native CPU (#13175) @@ -474,30 +468,12 @@ commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da8165992312241 [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) related to above? 
-commit https://github.com/intel/llvm/commit/9e4768ca9849e7188221c0e2894282730e3b1bde - [SYCL][libclc] Add generic addrspace overloads of math builtins (#13015) -commit https://github.com/intel/llvm/commit/51ffc04f0f317e0395c678e1fecd654df51db955 - [SYCL][libclc] Add generic addrspace overloads of vload/vstore builtins (#13092) -commit https://github.com/intel/llvm/commit/75300ab1ceee835e07086925d990f74107a84a1d - [SYCL][libclc] Add generic fp16 math builtins for generic SPIR-V target (#13361) -commit https://github.com/intel/llvm/commit/bdaf1e27310dc2218a95f05731a422a32ea5a658 - [libclc] Separate out generic AS support macros (#13792) -commit https://github.com/intel/llvm/commit/628ede6edf2448c531bea7f818dc6819d9e7393f - [libclc] Fix UB in double->int conversions (#13546) - Waiting for feedback from frasercrmk on these - -commit https://github.com/intel/llvm/commit/183832b9cebd471586c0ed251876972939442327 - [SYCL][PI] Add PI_ERROR_UNSUPPORTED_FEATURE error code (#13036) - commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) commit https://github.com/intel/llvm/commit/e51002c81cdf32f383104907cca820e4ed3452ba [SYCL] Enable intel joint matrix on GNR. (#14436) -commit https://github.com/intel/llvm/commit/21c2e1c2213171d12acb5e6c41a713db30a0d5d4 - [SYCL] Make swizzle mutating operators const friends (#13012) - commit https://github.com/intel/llvm/commit/17ee3e24e2874690f7526dcda9d8bc4679fe7edc [SYCL][NATIVECPU] Add device library and initial subgroup support (#13979) @@ -584,17 +560,6 @@ commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce [Bindless][Exp] Add texture fetch functionality (#12447) or a new feature? -commit https://github.com/intel/llvm/commit/8993f3fc55489023603ceafa631e8f19824979b3 - [SYCL][ESIMD] Use old intrinsic for named_barrier_signal for now (#13255) - does it revert the patch below? 
-commit https://github.com/intel/llvm/commit/d4a9254d764a0ff0be8514a6854afda833a268ce - [SYCL][ESIMD] Use intrinsic for named_barrier_signal (#12982) - ??? - -commit https://github.com/intel/llvm/commit/7271d613156f2268d538f20d92ecd52b1fbc488f - [SYCL][Docs] Add deprecation notice to SPV_INTEL_global_variable_decorations (#13772) - do we really need to mention SPIR-V specs? - commit https://github.com/intel/llvm/commit/24a6b3b2f2d2a160a737fb1162c78f4cce9a8f1d [SYCL] Generate imported symbol files in sycl-post-link (#14189) commit https://github.com/intel/llvm/commit/62ea97e34e9245fb50f5718861da06e5e4425c2e @@ -613,9 +578,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad [DeviceSanitizer] Disable handling no return calls (#14652) // bugfix? -## Bug Fixes - -### SYCL Compiler - Fixed that using `-fsycl-link-targets` flag would inadvertently trigger some additional device code linking steps. intel/llvm#13004 @@ -629,6 +591,15 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad in unbundling mode" error emitted by the compiler during link step. intel/llvm#13002 - Fixed a bug in address sanitizer which may lead to crashes when an application is launched on OpenCL CPU device. intel/llvm#13262 +- Fixed a bug where calling certain built-in math functions that accepts + pointers (like `fract`, `frexp`, `modf`, etc.) and passing pointer in the + generic address space there would not compile for AMD devices. + intel/llvm#13015 intel/llvm#13092 intel/llvm#13361 intel/llvm#13792 + intel/llvm#13546 +- Fixed a bug where compiling a program that contains kernels with different + `reqd_work_group_size` attributes attached to them using + `-fsycl-device-code-split=none` would result in an exception being thrown at + runtime about mismatching work-group size. 
intel/llvm#13523 ### SYCL Library @@ -643,6 +614,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad throw an exception if device is not a component device. intel/llvm#13868 - Fixed an issue that querying for composite devices may result in some devices returned twice. intel/llvm#14442 +- Fixed a bug that querying for composite devices may result in an exception + being thrown instead of an empty vector being returned. intel/llvm#13931 - Fixed a bug in copy-constructor of `config_2d_mem_access` ESIMD class which would lead to compilation errors. intel/llvm#13632 - Fixed an issue that use of `atomic_ref` would not be detected as a use @@ -667,7 +640,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed strict alias violations in `sycl::vec::operator[]` implementation that could lead to spurious errors. intel/llvm#13596 - Fixed a bug where a barrier submitted into a command queue with host tasks - could ignore them. intel/llvm#13094 + could ignore them, as well as a few bugs related to synchronization of host + tasks with barriers. intel/llvm#13094 intel/llvm#13863 intel/llvm#13094 - Fixed a compilation issue occurring when `printf` is used on CUDA backend on Windows. intel/llvm#13784 TODO: was it really a compilation issue? @@ -712,24 +686,23 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug where querying free device memory of integrated Intel GPUs would return 0 instead of throwing an exception that the feature is not supported for that device. intel/llvm#13209 - -commit https://github.com/intel/llvm/commit/c2a4054980fd4ce4c4d6cfa425cbc71b20d5f450 - [SYCL] Fix barrier with wait list handling (#13863) - is it just a bugfix for intel/llvm#13094? 
-commit https://github.com/intel/llvm/commit/11046e7d8bf07afebd79f30a528c3cbe5493d8ed - [SYCL] Fix queue fields cleanup for barrier vs host task deps (#14268) - looks like bugfix for intel/llvm#13094 - -commit https://github.com/intel/llvm/commit/1d24713c299aa16113f390c87d4444af5b83a586 - [SYCL] Fix ONEAPI_DEVICE_SELECTOR handling of discard filters. (#13927) - What was happening before this patch? - -commit https://github.com/intel/llvm/commit/775dccb43494b1d38fb84de728446053b11bd05a - [SYCL] Allow empty and unsupported case for component_devices (#13931) - This is later modified in another commit, so those two should be squashed - -commit https://github.com/intel/llvm/commit/33325d4af0b66c33f7a42f0bf584645972a738a8 - [SYCL] Fix enqueue functions taking both kernel and properties (#14743) +- Fixed a heap buffer overwlow in `sycl_ext_oneapi_kernel_compiler_opencl` + extension implementation. intel/llvm#13214 intel/llvm#13448 +- Fixed intel/llvm#12473 about `sycl_ext_oneapi_graph` extension implementation + ignoring access mode of accessors and thus creating unnecessary edges in a + graph. intel/llvm#13011 +- Fixed a bug where a command submission time would not be alwasys recorded in + profiling info when using `sycl_ext_oneapi_graph` extension. intel/llvm#14678 +- Fixed a bug with graph recording where submitting a barrier using the same + queue for two different graphs would result in runtime error "Graph nodes + cannot depend on events from another graph". intel/llvm#14212 +- Fixed intel/llvm#13066 where submitting a barrier into an empty in-order queue + whilst recording a graph results in runtime error "No event has been recorded + for the specified graph node". intel/llvm#13193 +- Fixed a resource leak in graph update implementation. intel/llvm#14029 +- Fixed a bug with invalid handling of discard filters within + `ONEAPI_DEVICE_SELECTOR` env variable caused RT to mistakenly say that the + env variable value is ill-formed. 
intel/llvm#13927 ### Documentation @@ -764,9 +737,6 @@ commit https://github.com/intel/llvm/commit/a14c0917ad741a3a27b50040e4589b562624 commit https://github.com/intel/llvm/commit/c1ee064428a2d4038021dc3284a4c2f3aa897cb8 [SYCL][Bindless] Fix OpaqueFD/Win32Handle's scope in piextImportExternalMemory/Semaphore (#14266) -commit https://github.com/intel/llvm/commit/493e78be6020ef436634b21d93069467fa6c69e7 - [SYCL][Graph] Fix PI Kernel leak in graph update (#14029) - commit https://github.com/intel/llvm/commit/9ec73a21782de1d11d08e97d63a27fa8b208c1e5 [SYCL] Add work_group_num_dim metadata (#13600) Fixes reqd_work_group_size for HIP @@ -778,9 +748,6 @@ commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b09 Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) revert commit https://github.com/intel/llvm/commit/seems to be a part of a previous release -commit https://github.com/intel/llvm/commit/ea7ba1b965302277fc23ef48dba83b10e6c734e9 - [ESIMD] Restore the lowering of lsc_load_stateless in sycl-post-link (#13104) - commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728e66b [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) ??? 
@@ -804,37 +771,16 @@ commit https://github.com/intel/llvm/commit/0939f39818225ce3e469e08f6a45711b449a [SYCL] Align assert ext name with libdevice implementation (#13312) can likely be ommitted -commit https://github.com/intel/llvm/commit/5ab1f762821abf36412b2b8d0e529285553fa472 - [SYCL][ESIMD] Fix simd_view template argument and add nested simd_view tests (#13231) - commit https://github.com/intel/llvm/commit/3c7f99d891cdd7c929b38b18bf6877c3c8dba163 [SYCL][Graph] Fix potential issue with command buffer commands (#13224) - -commit https://github.com/intel/llvm/commit/c4c456bd74945b0ea2faf9ca54b28bb02f36cd49 - [SYCL] fix for kernel_compiler (#13214) -commit https://github.com/intel/llvm/commit/bcf7d4df6acf33a75c195215afad78113d14ae2d - [SYCL] kernel_compiler opencl query fix (#13448) - -commit https://github.com/intel/llvm/commit/d6340b67391cd8e9e4c7775a3c1ada8f2755bb06 - [SYCL][Graph] in-order queue barrier fix (#13193) + Was it a user-visible issue? Why no test? commit https://github.com/intel/llvm/commit/6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a [SYCL] Fix Level-Zero's `sycl::make_device` interop (#13483) commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6b197 [SYCL][Graph] Fix minor issues in graph update code (#13660) - -commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfebf0da - [SYCL][Graph] Export missing graph node symbols (#13744) - -commit https://github.com/intel/llvm/commit/6934bcfb13415dc5bda85876b5cfc361678523f4 - [SYCL] Do not attach reqd_work_group_size info when multiple are detected (#13523) - -commit https://github.com/intel/llvm/commit/82f77d10dd092ea419115f61a7715655f055b7bb - [SYCL][Graph] Fix queue recording barrier to different graphs (#14212) - -commit https://github.com/intel/llvm/commit/e22cb798f8363f8e2a95a7e6df9a294b34c52fc4 - Fix Basic/image/srgba-read.cpp failure under SYCL_PREFER_UR with ONEAPI_DEVICE_SELECTOR=opencl:cpu (#14233) + bugfix for #13011? 
commit https://github.com/intel/llvm/commit/1b5c5a8e96502b196c91251fa6513a6ede1257f5 [SYCL] Fix SYCL_EXTERNAL device code when linking with a static lib (#14256) @@ -842,18 +788,9 @@ commit https://github.com/intel/llvm/commit/1b5c5a8e96502b196c91251fa6513a6ede12 commit https://github.com/intel/llvm/commit/d77a348776672316f59c59dc3b11ebf5aa79f936 [SYCL][NVPTX] Emit 'grid_constant' annotations for by-val kernel params (#14332) -commit https://github.com/intel/llvm/commit/5ad97902643da043233ec21ac203cca329df07b2 - [SYCL][Graph] Fix profiling info when bypassing scheduler (#14678) - -commit https://github.com/intel/llvm/commit/daaece06ce68544eaae078899c559f571297d8c0 - [SYCL][Graph] Fix access modes not being respected (#13011) - commit https://github.com/intel/llvm/commit/0360e6af2a353210d508633a60ff02327094f7e7 [SYCL] Follow up fixes for group_sort extension (#14591) -commit https://github.com/intel/llvm/commit/d39563ad1faa1d503c0396a137afd6664756b358 - [SYCL][Clang] Fix address space for virtual table support (#13629) - ## API/ABI Breaking Changes This release is an *ABI* breaking release, meaning that any applications which @@ -891,10 +828,6 @@ of some classes to use so-called preview implementation. - `feature_not_supported`. intel/llvm#14423 - Removed `queue::mem_advice` overload accepting `pi_mem_advice`. intel/llvm#14618 - Removed number of deprecated ESIMD APIs. intel/llvm#14415 -- Removed deprecated overloads of math built-ins accepting raw pointers. intel/llvm#13238 - Is it negated by the following commit? - - commit https://github.com/intel/llvm/commit/efed3bb04f3c43baf3373bf35d8924bbcf91f385 - [SYCL] Allow raw pointers in SYCL math builtins (#13893) - Removed non-standard `sycl::id` -> `sycl::range` conversion operator. 
intel/llvm#13293 - Removed deprecated APIs from [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) @@ -902,7 +835,10 @@ of some classes to use so-called preview implementation. - Renamed SYCLcompat function `async_free` to `enqueue_free`. intel/llvm#14015 - Enforced restrictions on first argument of lambdas/functors passed to `parallel_for(range)` and `parallel_for(nd_range)`. intel/llvm#13198 -- Switched `sycl::vec` implemetation to use its preview version. intel/llvm#14317 intel/llvm#13182 +- Switched `sycl::vec` implemetation to use its preview version. New version + uses differnt storage type under the hood which should fix several strict + aliasing rules violations that we had in the implementation. + intel/llvm#14317 intel/llvm#13182 intel/llvm#14130 - Switched `sycl::exception` implementation to us its preview version. intel/llvm#14548 - Switched math built-ins implementation to use their preview version. intel/llvm#13152 - Switched `bfloat16` implementation to use its preview version. intel/llvm#13233 @@ -921,6 +857,7 @@ of some classes to use so-called preview implementation. - Removed `get_child_group` API from experimental [`sycl_ext_oneapi_root_group`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_root_group.asciidoc) extension. intel/llvm#13482 +- Simplified template arguments related to `simd_view` of many ESIMD APIs. 
intel/llvm#13231 Breaking changes were also made to compiler flags: From 90bcf221a4e8208a69c43e4aad0bf4d6b4f8a559 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 9 Sep 2024 05:37:32 -0700 Subject: [PATCH 08/30] Drop unnecesssary changes --- sycl/ReleaseNotes.md | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index c240d2d96361..a346fffba8be 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -5321,7 +5321,7 @@ Release notes for commit range 5976ff0..1fc0e4f # August'20 release notes -Release notes for the commit https://github.com/intel/llvm/commit/range 75b3dc2..5976ff0 +Release notes for the commit range 75b3dc2..5976ff0 ## New features - Implemented basic support for the [Explicit SIMD extension](doc/extensions/experimental/sycl_ext_intel_esimd/sycl_ext_intel_esimd.md) @@ -5502,7 +5502,7 @@ Release notes for the commit https://github.com/intel/llvm/commit/range 75b3dc2. # June'20 release notes -Release notes for the commit https://github.com/intel/llvm/commit/range ba404be..24726df +Release notes for the commit range ba404be..24726df ## New features - Added switch to assume that each amount of work-items in each ND-range @@ -5656,7 +5656,7 @@ Release notes for the commit https://github.com/intel/llvm/commit/range ba404be. # May'20 release notes -Release notes for the commit https://github.com/intel/llvm/commit/range ba404be..67d3d9e +Release notes for the commit range ba404be..67d3d9e ## New features - Implemented [reduction extension](doc/extensions/deprecated/sycl_ext_oneapi_nd_range_reductions.md) @@ -5807,7 +5807,7 @@ Release notes for the commit https://github.com/intel/llvm/commit/range ba404be. 
# March'20 release notes -Release notes for the commit https://github.com/intel/llvm/commit/range e8f1f29..ba404be +Release notes for the commit range e8f1f29..ba404be ## New features - Initial CUDA backend support [7a9a425] @@ -5979,7 +5979,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # February'20 release notes -Release notes for commit https://github.com/intel/llvm/commit/e8f1f29 +Release notes for commit e8f1f29 ## New features - Added `__builtin_intel_fpga_mem` for the FPGA SYCL device. The built-in is @@ -6116,7 +6116,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # December'19 release notes -Release notes for commit https://github.com/intel/llvm/commit/78d80a1cc628af76f09c53673ada906a3d2f0131 +Release notes for commit 78d80a1cc628af76f09c53673ada906a3d2f0131 ## New features - New attributes for Intel FPGA devices : `num_simd_work_items`, `bank_bits`, @@ -6217,7 +6217,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # November'19 release notes -Release notes for commit https://github.com/intel/llvm/commit/e0a62df4e20eaf4bdff5c7dd46cbde566fbaee90 +Release notes for commit e0a62df4e20eaf4bdff5c7dd46cbde566fbaee90 ## New features @@ -6369,7 +6369,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # October'19 release notes -Release notes for commit https://github.com/intel/llvm/commit/918b285d8dede6ab0561fccc622f71cb858849a6 +Release notes for commit 918b285d8dede6ab0561fccc622f71cb858849a6 ## New features - `cl::sycl::queue::mem_advise` method was implemented [4828db5] @@ -6560,7 +6560,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # September'19 release notes -Release notes for commit https://github.com/intel/llvm/commit/d4efd2ae3a708fc995e61b7da9c7419dac900372 +Release notes for commit d4efd2ae3a708fc995e61b7da9c7419dac900372 ## New features - Added support for 
`reqd_work_group_size` attribute. [68578d7] @@ -6665,7 +6665,7 @@ Please, see the runtime installation guide [here](https://github.com/intel/llvm/ # August'19 release notes -Release notes for commit https://github.com/intel/llvm/commit/c557eb740d55e828fcf74b28d2b686c928e45318. +Release notes for commit c557eb740d55e828fcf74b28d2b686c928e45318. ## New features - Support for `image accessor` has been landed. @@ -6750,7 +6750,7 @@ Release notes for commit https://github.com/intel/llvm/commit/c557eb740d55e828fc # July'19 release notes -Release notes for commit https://github.com/intel/llvm/commit/64c0262c0f0b9e1b7b2e2dcef57542a3fe3bdb97. +Release notes for commit 64c0262c0f0b9e1b7b2e2dcef57542a3fe3bdb97. ## New features - `cl::sycl::stream` class support has been added. From 52bc4a07b66cf9c488f80ace631d38d6e09aa9af Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 9 Sep 2024 06:36:56 -0700 Subject: [PATCH 09/30] Further changes --- sycl/ReleaseNotes.md | 95 +++++++++++++++----------------------------- 1 file changed, 32 insertions(+), 63 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index a346fffba8be..6afdf1c2f0b4 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -43,6 +43,10 @@ Release notes for commit range - Added support for detecting misaligned data accesses via address sanitizer. intell/llvm#14148 - Added support for emitting multiple error reports via address sanitizer through `-fsanitize-recover=address`. intel/llvm#13948 +- Added initial support for + [dynamic linking](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/SharedLibraries.md). + Current implementation lacks support for `kernel_bundle` API and AOT mode. + intel/llvm#14587 intel/llvm#14189 intel/llvm#14103 ### SYCL Library @@ -61,8 +65,6 @@ Release notes for commit range load/store could be done without this extension using simple `for` loop and group barriers. 
intel/llvm#13043 - Implemented [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 -- Added initial support for [dynamic linking](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/SharedLibraries.md). - Current implementation lacks support for `kernel_bundle` API and AOT mode. intel/llvm#14587 - Added initial support for [`sycl_ext_oneapi_free_function_kernels`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc) extension. intel/llvm#13207 intel/llvm#13885 Known limitations: - free function kernels are only supported if defined at file scope @@ -77,6 +79,9 @@ Release notes for commit range kernels - Added experimental ESIMD function `fma` which results in a guaranteed fused multiply-add operation performed. intel/llvm#13366 +- Implemented revision 2 of + [`sycl_ext_oneapi_group_sort`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_group_sort.asciidoc) + extension. intel/llvm14399 intel/llvm#14185 intel/llvm#13942 intel/llvm#13908 ### SYCLcompat library @@ -129,24 +134,12 @@ commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) bugfix? 
-commit https://github.com/intel/llvm/commit/3800814750da51d6da852ce404bde91e1dbe02b8 - [SYCL] Key/Value sorting with fixed-size private array input (#14399) - commit https://github.com/intel/llvm/commit/7b3f21527abb904cb5c63e9ea32c7f0d65636436 [SYCL] [ABI-Break] Partial implementation of sycl_ext_oneapi_cuda_cluster_group (#14113) -commit https://github.com/intel/llvm/commit/8e3b8ce77f41d85687ae3bceedf5d1dc6e0e3155 - [SYCL] Add sorting APIs for fixed-size private array input (#14185) - commit https://github.com/intel/llvm/commit/bd97f283c9f982b89a3347754edf184a38762a4a [Bindless][Exp] Windows & DX12 interop. Semaphore ops can take values. (#13860) -commit https://github.com/intel/llvm/commit/3910d0c1393247313c8987b3f68a8d540d940673 - [SYCL] Add support for key/value sorting APIs (#13942) - -commit https://github.com/intel/llvm/commit/5e269c88bcfafd82719d1266a5b8a2bb7b90045d - [SYCL] Initial changes for the second version of sycl_ext_oneapi_group_sort extension (#13908) - commit https://github.com/intel/llvm/commit/55b547e59a28c4c446a797bb8c51a83156609327 [SYCL][ESIMD] Introduce load2d/store2d/prefetch2d API that accepts compile time properties (#13046) @@ -287,6 +280,11 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 by making it a non-template. intel/llvm#14668 - Removed deprecation warnings from math built-ins that accept raw pointers to align with the SYCL 2020 spec changes. intel/llvm#13238 intel/llvm#13893 +- Added support for 1- and 2-byte data types to ESIMD prefetch APIs. + intel/llvm#13452 +- Enabled `ext_intel_matrix` support for Intel GNR devices. intel/llvm#14436 +- Added initial support for sub-groups on Native CPU backend. 
intel/llvm#13979 +- Added support for 1x64x16 `bfloat16` matrices on PVC> intel/llvm#13391 commit https://github.com/intel/llvm/commit/c5b174d8507cad1328b3121e650120e85f1da213 @@ -334,6 +332,10 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 - Updated [`sycl_ext_oneapi_work_group_static`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_work_group_static.asciidoc) extension (it is not `work_group_specific` anymore). intel/llvm#14271 +- Updated + [`sycl_ext_oneapi_matrix`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc) + extension to list 1x64x16 `bfloat16` matrix combination available on PVC. + intel/llvm#13587 commit https://github.com/intel/llvm/commit/0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 [SYCL][Joint matrix] clarify the range of the prefetch templated arguments (#13796) @@ -416,9 +418,6 @@ commit https://github.com/intel/llvm/commit/c65bed1073460fb8d6dbb319f5e7ff2c9c7c commit https://github.com/intel/llvm/commit/d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 [SYCL][Graph][DOC] add an edge case for record&replay mode (#12916) -commit https://github.com/intel/llvm/commit/89132855d4312536f5f40792194b6251d4cde819 - [SYCL][Joint Matrix] Add a new overload for joint_matrix_apply to be able to return result into a different matrix (#13151) - commit https://github.com/intel/llvm/commit/8847c110c78684a86ec7e62d7255f1bb9c6efd4f [SYCL][NATIVECPU][libclc]Mark opencl_c_generic_address_space as unsupported on Native CPU (#13109) commit https://github.com/intel/llvm/commit/a25d27bc9fbb2925519e966b9e7043be04274b27 @@ -434,11 +433,9 @@ commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b9384 commit https://github.com/intel/llvm/commit/07e3bcf9f3be46234deb471e25d94b5692353688 [SYCL][ESIMD] Use LSC for unsupported surface index block stores 
(#13150) -commit https://github.com/intel/llvm/commit/ed0619b4caa24af8e78053ecef2e5e808e0e2b08 - [SYCL][Joint Matrix] Support 1x64x16 bf16 combination (#13391) - commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) + Is it user-visible? commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc52128da1 [SYCL] Add -fPIC for Intel math function host code (#13800) @@ -471,12 +468,6 @@ commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da8165992312241 commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) -commit https://github.com/intel/llvm/commit/e51002c81cdf32f383104907cca820e4ed3452ba - [SYCL] Enable intel joint matrix on GNR. (#14436) - -commit https://github.com/intel/llvm/commit/17ee3e24e2874690f7526dcda9d8bc4679fe7edc - [SYCL][NATIVECPU] Add device library and initial subgroup support (#13979) - commit https://github.com/intel/llvm/commit/0b9fc099f63feadb5e476c5862de3d8fa977a655 [SYCL][Graph] Test WGU kernel mismatch (#14379) @@ -494,12 +485,7 @@ commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6f commit https://github.com/intel/llvm/commit/2e1f14adb3bf6d9e9c55e4b0ced9e1ece2172a4a [SYCL] Fix UB and alignment issues in the SYCL default sorter (#13975) - -commit https://github.com/intel/llvm/commit/4222b4ccd6dc499248c8bf026bcdd0f207000b35 - [SYCL] Restrict `sycl::vec` and swizzle operations to types mentioned in the SPEC (#13947) - -commit https://github.com/intel/llvm/commit/5b6cc5eb7bb2106ff426815702d89569e166c4f9 - [SYCL][Matrix] Enable SPV_KHR_cooperative_matrix extension (#13923) + May not be user-visible anymore since default sorter was removed commit https://github.com/intel/llvm/commit/ccca3b73769bfd8a27eff9956630fe86a2e4832d [SYCL] Optimize SG group_store via BlockWriteINTEL in simple 
cases (#13734) @@ -514,18 +500,12 @@ commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a926 commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) -commit https://github.com/intel/llvm/commit/ef6d2bb3caf36eaa1149369f8aee1578d6e31a6e - [SYCL][ESIMD] Add support for transposed prefetch for 1/2 byte elements (#13452) - commit https://github.com/intel/llvm/commit/0c0b58686a79c8d9a8ef547a96b5c1642480e591 [XPTI][INFRA] Sample E2E data collection timing test for XPTI (#13045) commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) - -commit https://github.com/intel/llvm/commit/fb66f1b83559366e541381251de4281bb554613d - [SYCL] Replace __builtin_bit_cast with sycl::bit_cast in imf headers (#13313) - is it a bugfix? + Is there any particular user-visible bug associated with this? commit https://github.com/intel/llvm/commit/65bdffb1c9d4c474316d3e330fc3c59338e004f6 [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) @@ -542,10 +522,6 @@ commit https://github.com/intel/llvm/commit/e9befa2d10f6c23a66ac780df7a1ddda5527 commit https://github.com/intel/llvm/commit/4f5a5f0fba71593888f1737e0b4dbaf49c85e04b [SYCL] Fix WA for ocl query of CL_DEVICE_PROFILE (#13584) -commit https://github.com/intel/llvm/commit/893059138f61aabeb0e1063549d7f4dd533fdfd1 - [SYCL][Matrix spec] Add 1x64x16 combination for Intel XMX (PVC only) (#13587) - or rather a new feature? 
- commit https://github.com/intel/llvm/commit/e17632f32fcc160add43742ccdaa6cc80cc1b0c0 [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 @@ -560,11 +536,6 @@ commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce [Bindless][Exp] Add texture fetch functionality (#12447) or a new feature? -commit https://github.com/intel/llvm/commit/24a6b3b2f2d2a160a737fb1162c78f4cce9a8f1d - [SYCL] Generate imported symbol files in sycl-post-link (#14189) -commit https://github.com/intel/llvm/commit/62ea97e34e9245fb50f5718861da06e5e4425c2e - [SYCL] Exclude SYCL_EXTERNAL functions from device image with the option -support-dynamic-linking (#14103) - commit https://github.com/intel/llvm/commit/d4f2fe54047a1b415af2402a497f20e918094580 [SYCL][Bindless][Exp] Remove const from non-reference and non-pointer type parameters (#14238) @@ -578,6 +549,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad [DeviceSanitizer] Disable handling no return calls (#14652) // bugfix? +## Bug Fixes + +### SYCL Compiler - Fixed that using `-fsycl-link-targets` flag would inadvertently trigger some additional device code linking steps. intel/llvm#13004 @@ -703,6 +677,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug with invalid handling of discard filters within `ONEAPI_DEVICE_SELECTOR` env variable caused RT to mistakenly say that the env variable value is ill-formed. intel/llvm#13927 +- Fixed incorrect behavior of ESIMD atomic operations on data types smaller than + 4 bytes on Gen12 Intel GPUs. intel/llvm#13340 ### Documentation @@ -760,9 +736,6 @@ commit https://github.com/intel/llvm/commit/f170c63ed329c1fa5271d67e68144ec5d780 [SYCL] Fix kernel shortcut path for inorder queue (#13333) could be related to a commit https://github.com/intel/llvm/commit/made post-March release, i.e. 
it can probably be squashed with some other line -commit https://github.com/intel/llvm/commit/5332773b17efbf10e1b72cd633c1d7e2b4f75125 - [SYCL][ESIMD] atomic_update with data size less than 4 bytes should use LSC atomics (#13340) - commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) was the bug really user-visible? @@ -835,10 +808,13 @@ of some classes to use so-called preview implementation. - Renamed SYCLcompat function `async_free` to `enqueue_free`. intel/llvm#14015 - Enforced restrictions on first argument of lambdas/functors passed to `parallel_for(range)` and `parallel_for(nd_range)`. intel/llvm#13198 -- Switched `sycl::vec` implemetation to use its preview version. New version - uses differnt storage type under the hood which should fix several strict - aliasing rules violations that we had in the implementation. - intel/llvm#14317 intel/llvm#13182 intel/llvm#14130 +- Switched `sycl::vec` implementation to use its preview version. New version: + - uses a different storage type under the hood, which should fix several strict + aliasing rule violations that we had in the implementation. + intel/llvm#14317 intel/llvm#13182 intel/llvm#14130 + - restricts math operations available to `vec` to those which + are available to `std::byte` (bitwise shifts are affected). + intel/llvm#13947 - Switched `sycl::exception` implementation to us its preview version. intel/llvm#14548 - Switched math built-ins implementation to use their preview version. intel/llvm#13152 - Switched `bfloat16` implementation to use its preview version. intel/llvm#13233 @@ -858,6 +834,7 @@ of some classes to use so-called preview implementation. [`sycl_ext_oneapi_root_group`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_root_group.asciidoc) extension.
intel/llvm#13482 - Simplified template arguments related to `simd_view` of many ESIMD APIs. intel/llvm#13231 +- Removed ESIMD `atomic_op::predec`. intel/llvm#14480 Breaking changes were also made to compiler flags: @@ -871,14 +848,6 @@ Breaking changes were also made to compiler flags: - Deprecated `-fsycl-disable-range-rounding` flag in favor of the new `-fsycl-range-rounding`. intel/llvm#12715 -commit https://github.com/intel/llvm/commit/00b9b6d5db3de7257229f5c8b6aba4163a8f8977 - [SYCL][ESIMD][ABI Break] Remove predec atomic op (#14480) - Is it user visible? Is it an API break? - -commit https://github.com/intel/llvm/commit/9457144dd784d786c8f7e994bcb804f123cfb587 - [ABI-Break][SYCL] Remove ESIMD emulator code from pi.cpp (#13234) - // is it really user-visible? - ## Known Issues commit https://github.com/intel/llvm/commit/33c0829f3e3389006662845784980b930faf3b38 From b933e1c209bfd5b7b6ed640d13d34b0eae160bc4 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Mon, 9 Sep 2024 10:15:39 -0700 Subject: [PATCH 10/30] Further changes --- sycl/ReleaseNotes.md | 147 +++++++++++++++++-------------------------- 1 file changed, 59 insertions(+), 88 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 6afdf1c2f0b4..2e5283ebc555 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -82,6 +82,7 @@ Release notes for commit range - Implemented revision 2 of [`sycl_ext_oneapi_group_sort`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_group_sort.asciidoc) extension. intel/llvm14399 intel/llvm#14185 intel/llvm#13942 intel/llvm#13908 + intel/llvm#14591 ### SYCLcompat library @@ -134,15 +135,9 @@ commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) bugfix? 
-commit https://github.com/intel/llvm/commit/7b3f21527abb904cb5c63e9ea32c7f0d65636436 - [SYCL] [ABI-Break] Partial implementation of sycl_ext_oneapi_cuda_cluster_group (#14113) - commit https://github.com/intel/llvm/commit/bd97f283c9f982b89a3347754edf184a38762a4a [Bindless][Exp] Windows & DX12 interop. Semaphore ops can take values. (#13860) -commit https://github.com/intel/llvm/commit/55b547e59a28c4c446a797bb8c51a83156609327 - [SYCL][ESIMD] Introduce load2d/store2d/prefetch2d API that accepts compile time properties (#13046) - commit https://github.com/intel/llvm/commit/d06724a7c304d393500b7edbb84f5c7e59f6b319 [SYCL][Graph] Specify API for explicit update using indices (#12486) commit https://github.com/intel/llvm/commit/2bc8b5bc8cbc44cf8ef1deb095c10450348904d8 @@ -151,9 +146,6 @@ commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfeb [SYCL][Graph] Export missing graph node symbols (#13744) bugfix for the above -commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d98acc - [ESIMD][NFC][DOC] Add load/store/prefetch_2d functions, L1/L2 hint combinations(#13218) - ## Improvements ### SYCL Compiler @@ -200,6 +192,11 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 fully complete. In particular, data for Lunar Lake and Battlemage Intel GPUs is still missing. intel/llvm#14590 intel/llvm#14188 intel/lvm#12727 intel/llvm#14757 intel/llvm#13486 intel/llvm#13974 +- Enhanced compiler to annotate SYCL kernel arguments passed by value with + `__grid_constant__` for CUDA backend. intel/llvm#14322 +- Added initial support for sub-groups on Native CPU backend. intel/llvm#13979 +- Added support for `reqd_work_group_size` attribute to Native CPU backend. + intel/llvm#13175 ### SYCL Library @@ -283,19 +280,22 @@ commit https://github.com/intel/llvm/commit/c8ae6c68943b9635cd9822f3c9ee7b5cc8d9 - Added support for 1- and 2-byte data types to ESIMD prefetch APIs. 
intel/llvm#13452 - Enabled `ext_intel_matrix` support for Intel GNR devices. intel/llvm#14436 -- Added initial support for sub-groups on Native CPU backend. intel/llvm#13979 - Added support for 1x64x16 `bfloat16` matrices on PVC> intel/llvm#13391 - - -commit https://github.com/intel/llvm/commit/c5b174d8507cad1328b3121e650120e85f1da213 - [SYCL] Implement latest version of sycl_ext_oneapi_free_function_queries (#13257) - -commit https://github.com/intel/llvm/commit/398aa20350aa38d76d9e95a8b76e3858c38faae5 - [SYCL] Support shuffle algorithms for non-uniform groups (#12705) - -commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436 - [SYCL] Remove plugin interface (#14145) - +- Added new overloads of `load_2d`, `store_2d` and `prefetch_2d` ESIMD APIs that + accept compile-time properties. intel/llvm#13046 +- Added support for `shift_group_left`, `shift_group_right`, + `permute_group_by_xor` and `select_from_group` algorithms for non-uniform + groups. intel/llvm#12705 +- Removed Plugin Interface. That is a collection of internal libraries which + implemented a unified interface to various lower-level runtimes like OpenCL, + Level Zero, etc. It is now completely replaced by + [Unified Runtime](https://github.com/oneapi-src/unified-runtime/) and this + removal should reduce the number and size of redistributable libraries. +- Enhanced ESIMD `slm_atomic_update` API to also support `fsub` and `fadd` + operations. intel/llvm#13535 +- Lifted some of the restrictions from ESIMD `block_store` API. intel/llvm#13150 +- Improved implementation of `group_store` and `group_load` APIs with intent for + it to have better performance in some cases.
intel/llvm#13734 intel/llvm#13673 ### Documentation @@ -307,6 +307,11 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 - Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) to list restictions for `atomic_update`, `gather` and `scatter` functions. intel/llvm#13202 intel/llvm#13196 +- Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) + to document new overloads of `load_2d`, `store_2d` and `prefetch_2d` APIs that + accept compile-time properties. intel/llvm#13218 +- Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) + to document `fence` API. intel/llvm#13135 - Updated [`sycl_ext_oneapi_bfloat16_math_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bfloat16_math_functions.asciidoc) extension to support vectors of `bfloat16` to be passed to math functions. intel/llvm#14002 - Updated [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) @@ -335,26 +340,19 @@ commit https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83 - Updated [`sycl_ext_oneapi_matrix`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc) extension to list 1x64x16 `bfloat16` matrix combination available on PVC.
- intel/llvm#13587 - -commit https://github.com/intel/llvm/commit/0678c5ce0fe3af6363bd4b374ffaedb800a5b1e1 - [SYCL][Joint matrix] clarify the range of the prefetch templated arguments (#13796) + Clarified the meaning of `joint_matrix_prefetch` template arguments. + intel/llvm#13587 intel/llvm#13796 +- Added revision 2 of + [`sycl_ext_intel_matrix`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_intel_matrix.asciidoc) + extension which introduces load, store and fill matrix operations with + out-of-bounds checks. intel/llvm#11172 commit https://github.com/intel/llvm/commit/ffc0de03f900da2d0262ea8ec41ac3847a1edbcc [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) -commit https://github.com/intel/llvm/commit/486b3dd1a2b2924e4445f1e36e5c341a09ba784f - [SYCL][Graph][Doc] Tidy of graph extension design doc (#13065) - commit https://github.com/intel/llvm/commit/09c93842ffe51602e118504e4e3229d41b2a4fb2 [SYCL][Graph] Clarify graph enable_profiling property in finalize() (#14067) -commit https://github.com/intel/llvm/commit/ecd3b903f4ddb6b32892f03c326151faa9fa63e8 - [SYCL][Joint Matrix Spec] Add new API for out of bounds fill/load/store (#11172) - -commit https://github.com/intel/llvm/commit/f1e66f5f0f59b958ff352a558dbad8b42df63175 - [ESIMD][NFC][DOC] Add fence to the ESIMD SPEC functions (#13135) - commit https://github.com/intel/llvm/commit/1e2e6baaf86009f0f9067b1146a8ca7923436e60 [SYCL][Bindless] Add image_mem_handle to image_mem_handle devices copies. (#12449) @@ -385,33 +383,18 @@ commit https://github.com/intel/llvm/commit/0dcad16c36f27e6254e7b831faaad8c6e07f [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) bugfix?? 
-commit https://github.com/intel/llvm/commit/1f1be9c642889b7c0fd045b073d411e544dc6007 - [SYCL][ESIMD] Move fmax to SPIR-V intrinsic (#14020) - this one is also problematic -commit https://github.com/intel/llvm/commit/bcca7a80adf50b04c0991ef48745353ac7829016 - [SYCL][ESIMD] Move a few math operations to SPIR-V intrinsics and support new functions (#13383) - that is a regression, not an improvement :) should be noted in known issues - commit https://github.com/intel/llvm/commit/74602458d5583cf69ca575a9167def51dad15052 [SYCL][Bindless] Replace 'image_channel_order' field in 'image_descriptor' with number of channels (#13745) commit https://github.com/intel/llvm/commit/83db85f1964338d9ce67bb536f8e6c5eebe8893b [SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision (#13753) -commit https://github.com/intel/llvm/commit/d2a5e8d095c0176957f5da2c5232d8966f8ff1bf - [SYCL][Matrix] Add generation of spirv.CooperativeMatrixKHR type (#13645) - internal improvement that can be ignored? - commit https://github.com/intel/llvm/commit/82aaf27f6f0cf97ba89b58f88a18b09e23097afc [SYCL][Driver] Refactor device config parsing to better match HIP and CUDA targets (#13617) commit https://github.com/intel/llvm/commit/b11a19b1896cc2f7ab43735aacf265182e22832c [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) -commit https://github.com/intel/llvm/commit/3756fd1b778ae4ab36bd3988bfdf9ba910b779fd - [ESIMD] Enable FADD/FSUB for slm_atomic_update (#13535) - ??? - commit https://github.com/intel/llvm/commit/c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 [SYCL][Graph] Update begin_recording and end_recording (#13480) @@ -430,9 +413,6 @@ commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b9384 [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) Waiting for feedback from Pietro on these five. 
-commit https://github.com/intel/llvm/commit/07e3bcf9f3be46234deb471e25d94b5692353688 - [SYCL][ESIMD] Use LSC for unsupported surface index block stores (#13150) - commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) Is it user-visible? @@ -449,21 +429,15 @@ commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40 commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa6911 [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) -commit https://github.com/intel/llvm/commit/db54535fb389331b167807a5d8f1ed16b5695474 - [AMDGPU][SYCL] Make unsafe atomic fadd opt in (#13955) - commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c [SYCL][Graph] fix the address pointer in graph print (#13595) commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) -commit https://github.com/intel/llvm/commit/3fdfbfed1ed0062b9f3848a100093b340183c6a3 - [SYCL][NATIVECPU] Support reqd_work_group_size on Native CPU (#13175) - commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da81659923122417688 [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) - related to above? + some kind of bugfix? 
commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) @@ -483,15 +457,6 @@ commit https://github.com/intel/llvm/commit/4151c799ef36f2912fab3f6b9e305240ef4f commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 [SYCL] Disable in-order queue barrier optimization while profiling (#14123) -commit https://github.com/intel/llvm/commit/2e1f14adb3bf6d9e9c55e4b0ced9e1ece2172a4a - [SYCL] Fix UB and alignment issues in the SYCL default sorter (#13975) - May not be user-visible anymore since default sorter was removed - -commit https://github.com/intel/llvm/commit/ccca3b73769bfd8a27eff9956630fe86a2e4832d - [SYCL] Optimize SG group_store via BlockWriteINTEL in simple cases (#13734) -commit https://github.com/intel/llvm/commit/48a0ff5b4b5bc21dedab37380c4ac93676277f91 - [SYCL] Optimize SG group_load via BlockReadINTEL in simple cases (#13673) - commit https://github.com/intel/llvm/commit/0a1381d286f7c32a256a6dab49917870769f1238 [SYCL][Graph] Add wording about arbitrary C++ code in CGFs (#13699) commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a9267a8d @@ -500,9 +465,6 @@ commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a926 commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) -commit https://github.com/intel/llvm/commit/0c0b58686a79c8d9a8ef547a96b5c1642480e591 - [XPTI][INFRA] Sample E2E data collection timing test for XPTI (#13045) - commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) Is there any particular user-visible bug associated with this? 
@@ -519,9 +481,6 @@ commit https://github.com/intel/llvm/commit/e9befa2d10f6c23a66ac780df7a1ddda5527 [SYCL][DebugInfo] Switch to nonsemantic-shader-200 for non-FPGA HW on linux (#13107) do we need to mention it? -commit https://github.com/intel/llvm/commit/4f5a5f0fba71593888f1737e0b4dbaf49c85e04b - [SYCL] Fix WA for ocl query of CL_DEVICE_PROFILE (#13584) - commit https://github.com/intel/llvm/commit/e17632f32fcc160add43742ccdaa6cc80cc1b0c0 [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 @@ -679,6 +638,17 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad env variable value is ill-formed. intel/llvm#13927 - Fixed incorrect behavior of ESIMD atomic operations on data types smaller than 4 bytes on Gen12 Intel GPUs. intel/llvm#13340 +- Fixed a bug where SYCL RT could link fallback implementation for `assert` + even though target device supports that functionality natively. + intel/llvm#13312 +- Fixed a bug in `sycl_ext_oneapi_kernel_compiler` extension implementation + where passing certain combinations of options into `build` API would lead to + compilation errors. intel/llvm#14433 +- Fixed a UB caused by improper dealing with alignment in + `sycl_ext_oneapi_group_sort` extension implementation. intel/llvm#13975 +- Fixed a bug where `device::ext_oneapi_cl_profile` could return some extra + symbols for Intel GPU devices, thus violating the format of returned value. 
+ intel/llvm#13584 ### Documentation @@ -704,9 +674,6 @@ commit https://github.com/intel/llvm/commit/e40283b1234e0846d1a19be537948e865a31 This reverts PR #12453 and #13080 not sure which section this should go into -commit https://github.com/intel/llvm/commit/93a0ec4c465ceff1bed641422e23f13ca6b8a7cd - [SYCL] all_props_are_keys_of fix (#14433) - commit https://github.com/intel/llvm/commit/a14c0917ad741a3a27b50040e4589b56262462bc [SYCL][Bindless] Update spirv read/fetch from sampled image and sampled image array (#14493) @@ -740,10 +707,6 @@ commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d5 [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) was the bug really user-visible? -commit https://github.com/intel/llvm/commit/0939f39818225ce3e469e08f6a45711b449a8ad4 - [SYCL] Align assert ext name with libdevice implementation (#13312) - can likely be ommitted - commit https://github.com/intel/llvm/commit/3c7f99d891cdd7c929b38b18bf6877c3c8dba163 [SYCL][Graph] Fix potential issue with command buffer commands (#13224) Was it a user-visible issue? Why no test? @@ -758,12 +721,6 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 commit https://github.com/intel/llvm/commit/1b5c5a8e96502b196c91251fa6513a6ede1257f5 [SYCL] Fix SYCL_EXTERNAL device code when linking with a static lib (#14256) -commit https://github.com/intel/llvm/commit/d77a348776672316f59c59dc3b11ebf5aa79f936 - [SYCL][NVPTX] Emit 'grid_constant' annotations for by-val kernel params (#14332) - -commit https://github.com/intel/llvm/commit/0360e6af2a353210d508633a60ff02327094f7e7 - [SYCL] Follow up fixes for group_sort extension (#14591) - ## API/ABI Breaking Changes This release is an *ABI* breaking release, meaning that any applications which @@ -835,6 +792,8 @@ of some classes to use so-called preview implementation. extension. intel/llvm#13482 - Simplified template arguments related to `simd_view` of many ESIMD APIs. 
intel/llvm#13231 - Removed ESIMD `atomic_op::predec`. intel/llvm#14480 +- Dropped interfaces from revision 1 of experimental + `sycl_ext_oneapi_group_sort` extension. intel/llvm#14531 Breaking changes were also made to compiler flags: @@ -850,6 +809,19 @@ Breaking changes were also made to compiler flags: ## Known Issues +- On Windows, the Unified Runtime's Level Zero leak check does not work correctly with + the default contexts. This is because on Windows the release + of the plugin DLLs races against the release of static global variables + (like the default context). +- Intel Graphics Compiler's Vector Compute backend does not support O0 code: such code + is often miscompiled, produces wrong answers, or crashes. This issue directly + affects ESIMD code at O0. As a temporary workaround, we optimize ESIMD code + even in O0 mode. + [00749b1e8](https://github.com/intel/llvm/commit/00749b1e8e3085acfdc63108f073a255842533e2) +- [new] Use of some SYCL math built-ins (like `abs` or `clz`) in a program where + `sycl/ext/intel/esimd.hpp` header is included causes compilation errors. + This will be fixed in the next release (intel/llvm#14793). + commit https://github.com/intel/llvm/commit/33c0829f3e3389006662845784980b930faf3b38 Author: Igor Chorążewicz Date: Thu Jul 25 23:00:19 2024 -0700 @@ -1697,7 +1669,6 @@ commit https://github.com/intel/llvm/commit/cf402b8473e9b3a4ee675a6154b80f0d54b1 Strictly speaking, this may have a visible effect for end users since some of queries won't always return `false` anymore. - # Mar'24 release notes Release notes for commit range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2)..
[d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) From 551b04c7d6066f9fb35e383e514eba954099cb10 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 10 Sep 2024 11:23:08 -0700 Subject: [PATCH 11/30] Further updates --- sycl/ReleaseNotes.md | 438 +++++++++++++++++++++---------------------- 1 file changed, 214 insertions(+), 224 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 2e5283ebc555..448d87158819 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -5,6 +5,156 @@ Release notes for commit range ... [ebb3b4a21b3b0e](https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436) +## TODO + +commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de187d + [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) + bugfix? + +commit https://github.com/intel/llvm/commit/0eeae2ac96ea179099dd5d57c241260ccfe65f73 + [SYCL][Graph] Update design doc for copy optimization and add test (#13051) + +commit https://github.com/intel/llvm/commit/4acca904c0e07fd6b504f7938f539bc1a0e94ce0 + [CLC][AMDGPU] Refactor fence helper to process order semantic explicitly (#12872) + ??? + +commit https://github.com/intel/llvm/commit/0dcad16c36f27e6254e7b831faaad8c6e07f8cfb + [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) + bugfix?? + +commit https://github.com/intel/llvm/commit/83db85f1964338d9ce67bb536f8e6c5eebe8893b + [SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision (#13753) + Do we claim support for bindless images on Intel GPUs? 
Because if not, then this should be omitted + +commit https://github.com/intel/llvm/commit/82aaf27f6f0cf97ba89b58f88a18b09e23097afc + [SYCL][Driver] Refactor device config parsing to better match HIP and CUDA targets (#13617) + +commit https://github.com/intel/llvm/commit/b11a19b1896cc2f7ab43735aacf265182e22832c + [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) + +commit https://github.com/intel/llvm/commit/c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 + [SYCL][Graph] Update begin_recording and end_recording (#13480) + +commit https://github.com/intel/llvm/commit/d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 + [SYCL][Graph][DOC] add an edge case for record&replay mode (#12916) + +commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 + [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) + Is it user-visible? + +commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc52128da1 + [SYCL] Add -fPIC for Intel math function host code (#13800) + +commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd95a6 + Implement missing overloads for generic AS in generic target (#13938) + +commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 + [SYCL][CUDA] Enable support of msvc math functions for nvptx target. 
(#14007) + +commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa6911 + [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) + +commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c + [SYCL][Graph] fix the address pointer in graph print (#13595) + +commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 + [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) + +commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da81659923122417688 + [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) + some kind of bugfix? + +commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c + [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) + +commit https://github.com/intel/llvm/commit/0b9fc099f63feadb5e476c5862de3d8fa977a655 + [SYCL][Graph] Test WGU kernel mismatch (#14379) + +commit https://github.com/intel/llvm/commit/93fef86cd4fb8e18c126365c404eea1ed0f1a7fa + [SYCL][Graph] Permit empty & barrier nodes in WGU (#14236) + +commit https://github.com/intel/llvm/commit/4151c799ef36f2912fab3f6b9e305240ef4ff327 + [SYCL][Graph] Wait instead of flush dep events in update command (#14167) + +commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 + [SYCL] Disable in-order queue barrier optimization while profiling (#14123) + +commit https://github.com/intel/llvm/commit/0a1381d286f7c32a256a6dab49917870769f1238 + [SYCL][Graph] Add wording about arbitrary C++ code in CGFs (#13699) +commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a9267a8d + [SYCL][Graph] Clarify graph in-order and out-of-order properties (#13681) + +commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 + [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) + +commit 
https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 + [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) + Is there any particular user-visible bug associated with this? + +commit https://github.com/intel/llvm/commit/65bdffb1c9d4c474316d3e330fc3c59338e004f6 + [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) + new feature? + +commit https://github.com/intel/llvm/commit/05644a470303c2af3385b9533b8d23ebdea99eb7 + [OpenCL] Config dependent-load flag to exclude CWD from DLL search path (#13327) + do we report security issues? + +commit https://github.com/intel/llvm/commit/e17632f32fcc160add43742ccdaa6cc80cc1b0c0 + [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) +commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 + [Driver][SYCL][FPGA] Use bundled device libraries for FPGA targets (#13693) + +commit https://github.com/intel/llvm/commit/1665cc0dd57266d2677c625725d38973cce3e8d9 + [SYCL][Graph] Enable in-order cmd-list (#13088) + perf optimization + new feature? + +commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce2c78 + [Bindless][Exp] Add texture fetch functionality (#12447) + or a new feature? + +commit https://github.com/intel/llvm/commit/d4f2fe54047a1b415af2402a497f20e918094580 + [SYCL][Bindless][Exp] Remove const from non-reference and non-pointer type parameters (#14238) + +commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad + [DeviceSanitizer] Disable handling no return calls (#14652) + // bugfix? 
+ +commit https://github.com/intel/llvm/commit/a14c0917ad741a3a27b50040e4589b56262462bc + [SYCL][Bindless] Update spirv read/fetch from sampled image and sampled image array (#14493) + +commit https://github.com/intel/llvm/commit/c1ee064428a2d4038021dc3284a4c2f3aa897cb8 + [SYCL][Bindless] Fix OpaqueFD/Win32Handle's scope in piextImportExternalMemory/Semaphore (#14266) + +commit https://github.com/intel/llvm/commit/14ee7e1cca79cac97ecc41ddc15d5d724011c89a + [SYCL][Bindless][Exp] Remove unneeded function argument causing memory leak in image create functions (#13364) + +commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b0996b6 + Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) + revert commit https://github.com/intel/llvm/commit/seems to be a part of a previous release + +commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728e66b + [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) + ??? + +commit https://github.com/intel/llvm/commit/f170c63ed329c1fa5271d67e68144ec5d7808079 + [SYCL] Fix kernel shortcut path for inorder queue (#13333) + could be related to a commit https://github.com/intel/llvm/commit/made post-March release, i.e. it can probably be squashed with some other line + +commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 + [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) + was the bug really user-visible? + +commit https://github.com/intel/llvm/commit/3c7f99d891cdd7c929b38b18bf6877c3c8dba163 + [SYCL][Graph] Fix potential issue with command buffer commands (#13224) + Was it a user-visible issue? Why no test? + +commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6b197 + [SYCL][Graph] Fix minor issues in graph update code (#13660) + bugfix for #13011? 
+ ++UR commit below + ## New Features ### SYCL Compiler @@ -131,21 +281,6 @@ Release notes for commit range - Added specification for [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 - Added specification for [`SPV_INTEL_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/spirv-extensions/SPV_INTEL_bindless_images.asciidoc) extension. intel/llvm#12927 -commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de187d - [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) - bugfix? - -commit https://github.com/intel/llvm/commit/bd97f283c9f982b89a3347754edf184a38762a4a - [Bindless][Exp] Windows & DX12 interop. Semaphore ops can take values. (#13860) - -commit https://github.com/intel/llvm/commit/d06724a7c304d393500b7edbb84f5c7e59f6b319 - [SYCL][Graph] Specify API for explicit update using indices (#12486) -commit https://github.com/intel/llvm/commit/2bc8b5bc8cbc44cf8ef1deb095c10450348904d8 - [SYCL][Graph] Implementation of explicit update with indices (#12840) -commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfebf0da - [SYCL][Graph] Export missing graph node symbols (#13744) - bugfix for the above - ## Improvements ### SYCL Compiler @@ -197,6 +332,13 @@ commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfeb - Added initial support for sub-groups on Native CPU backend. intel/llvm#13979 - Added support for `reqd_work_group_size` attribute to Native CPU backend. intel/llvm#13175 +- Introduced some extra address space inference for `invoke_simd` API so that + backends are able to generate better code. intel/llvm#14628 +- Improved math built-ins support on Native CPU backend: added support for bf16 + and pointers in generic address space. 
intel/llvm#13109 intel/llvm#13829 + intel/llvm#13911 intel/llvm#13428 intel/llvm#13478 +- Improved debugging experience on Linux (CPU & GPU) and Windows (CPU AOT only). + intel/llvm#13107 ### SYCL Library @@ -296,6 +438,13 @@ commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfeb - Lifted some of the restrictions from ESIMD `block_store` API. intel/llvm#13150 - Improved implementation of `group_store` and `group_load` APIs with intent for it to have better performance in some cases. intel/llvm#13734 intel/llvm#13673 +- Added support for graph update functionality. intel/llvm#12840 +- Added support for external semaphores that can take value to bindless images + extension. intel/llvm#13860 +- Added support for device-to-device copying of `image_device_handle`. + intel/llvm#12449 +- Improved performance of `queue::fill` on CUDA backend by making it use 2- and + 4-byte operations instead of only using 1-byte operations. intel/llvm#13788 ### Documentation @@ -339,22 +488,29 @@ commit https://github.com/intel/llvm/commit/b4e0450207b5a85d5b985de0c0ff6fecdfeb extension (it is not `work_group_specific` anymore). intel/llvm#14271 - Updated [`sycl_ext_oneapi_matrix`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_oneapi_matrix.asciidoc) - extension to list 1x64x16 `bfloat16` matrix combination available on PVC. - Clarified the meaning of `joint_matrix_prefetch` template arguments. - intel/llvm#13587 intel/llvm#13796 + extension: + - listed 1x64x16 `bfloat16` matrix combination as available on PVC. + intel/llvm#13587 + - clarified the meaning of `joint_matrix_prefetch` template arguments. + intel/llvm#13796 + - added a note about known issue with some CUDA devices. 
intel/llvm#14178 - Added revision 2 of [`sycl_ext_intel_matrix`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_matrix/sycl_ext_intel_matrix.asciidoc) extension which introduces load, store and fill matrix operations with out-of-bounds checks. intel/llvm#11172 - -commit https://github.com/intel/llvm/commit/ffc0de03f900da2d0262ea8ec41ac3847a1edbcc - [SYCL][Graph][Doc] Remove outdated limitation from spec (#13163) - -commit https://github.com/intel/llvm/commit/09c93842ffe51602e118504e4e3229d41b2a4fb2 - [SYCL][Graph] Clarify graph enable_profiling property in finalize() (#14067) - -commit https://github.com/intel/llvm/commit/1e2e6baaf86009f0f9067b1146a8ca7923436e60 - [SYCL][Bindless] Add image_mem_handle to image_mem_handle devices copies. (#12449) +- Updated + [`sycl_ext_oneapi_graph`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc) + extension: + - added new functionality to update arguments and ND-range sizes of kernel + nodes within a graph. intel/llvm#12486 + - clarified `enable_profiling` property in `command_graph::finalize`. + intel/llvm#14067 +- Updated + [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) + extension: + - added support for external semaphores that can take a value. intel/llvm#13860 + - added support for copying `image_mem_handle` between devices via + `ext_oneapi_copy`. intel/llvm#12449 ### SYCLcompat @@ -364,149 +520,8 @@ commit https://github.com/intel/llvm/commit/1e2e6baaf86009f0f9067b1146a8ca792343 `shift_sub_group_left`, `shift_sub_group_right` and `permute_sub_group_by_xor` to support CUDA devices. intel/llvm#13363 - Restricted `memory_order` argument of `atomic_ref` passed to - `experimental::nd_range_barrier` to match supported on a device.
intel/llvm#12974 intel/llvm#13641 - -commit https://github.com/intel/llvm/commit/13c9d0ef964b17dd3e2c297b1ceb2ecb8ea2ffe9 - [SYCL][Bindless][Doc][ABI-Break] Rename external semaphore destroy to release (#14535) - -commit https://github.com/intel/llvm/commit/fb561b9f336f8f9c286a1125631dedf1b5fb1e4b - [SYCL][Bindless][Doc][ABI-Break] Add const qualifiers to copies (#14140) - -commit https://github.com/intel/llvm/commit/0eeae2ac96ea179099dd5d57c241260ccfe65f73 - [SYCL][Graph] Update design doc for copy optimization and add test (#13051) - -commit https://github.com/intel/llvm/commit/4acca904c0e07fd6b504f7938f539bc1a0e94ce0 - [CLC][AMDGPU] Refactor fence helper to process order semantic explicitly (#12872) - ??? - -commit https://github.com/intel/llvm/commit/0dcad16c36f27e6254e7b831faaad8c6e07f8cfb - [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) - bugfix?? - -commit https://github.com/intel/llvm/commit/74602458d5583cf69ca575a9167def51dad15052 - [SYCL][Bindless] Replace 'image_channel_order' field in 'image_descriptor' with number of channels (#13745) - -commit https://github.com/intel/llvm/commit/83db85f1964338d9ce67bb536f8e6c5eebe8893b - [SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision (#13753) - -commit https://github.com/intel/llvm/commit/82aaf27f6f0cf97ba89b58f88a18b09e23097afc - [SYCL][Driver] Refactor device config parsing to better match HIP and CUDA targets (#13617) - -commit https://github.com/intel/llvm/commit/b11a19b1896cc2f7ab43735aacf265182e22832c - [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) - -commit https://github.com/intel/llvm/commit/c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 - [SYCL][Graph] Update begin_recording and end_recording (#13480) - -commit https://github.com/intel/llvm/commit/d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 - [SYCL][Graph][DOC] add an edge case for record&replay mode (#12916) - -commit 
https://github.com/intel/llvm/commit/8847c110c78684a86ec7e62d7255f1bb9c6efd4f - [SYCL][NATIVECPU][libclc]Mark opencl_c_generic_address_space as unsupported on Native CPU (#13109) -commit https://github.com/intel/llvm/commit/a25d27bc9fbb2925519e966b9e7043be04274b27 - [SYCL][NATIVECPU][LIBCLC] Implement missing builtins for half type (#13829) -commit https://github.com/intel/llvm/commit/47a03418ac74f3a5492213afc192569eae1393ec - [SYCL][LIBCLC][NATIVECPU] Add aarch64 target triple for Native CPU (#13911) -commit https://github.com/intel/llvm/commit/0ce40f46ef4e2f5e8eed75e28352a90c9b8ecbaf - [SYCL] [NATIVECPU] Implement generic atomic store for generic target (#13428) -commit https://github.com/intel/llvm/commit/267a03cd1ba5eaa55db95800712f978b93842bc5 - [SYCL] [NATIVECPU] Select right libclc file for native cpu (#13478) - Waiting for feedback from Pietro on these five. - -commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 - [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) - Is it user-visible? - -commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc52128da1 - [SYCL] Add -fPIC for Intel math function host code (#13800) - -commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd95a6 - Implement missing overloads for generic AS in generic target (#13938) - -commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 - [SYCL][CUDA] Enable support of msvc math functions for nvptx target. 
(#14007) - -commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa6911 - [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) - -commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c - [SYCL][Graph] fix the address pointer in graph print (#13595) - -commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 - [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) - -commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da81659923122417688 - [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) - some kind of bugfix? - -commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c - [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) - -commit https://github.com/intel/llvm/commit/0b9fc099f63feadb5e476c5862de3d8fa977a655 - [SYCL][Graph] Test WGU kernel mismatch (#14379) - -commit https://github.com/intel/llvm/commit/0ccb0b7d3dd614707f82ea8f99790e2d3b08496d - [SYCL][ABI-Break] Improve Queue fill (#13788) - -commit https://github.com/intel/llvm/commit/93fef86cd4fb8e18c126365c404eea1ed0f1a7fa - [SYCL][Graph] Permit empty & barrier nodes in WGU (#14236) - -commit https://github.com/intel/llvm/commit/4151c799ef36f2912fab3f6b9e305240ef4ff327 - [SYCL][Graph] Wait instead of flush dep events in update command (#14167) - -commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 - [SYCL] Disable in-order queue barrier optimization while profiling (#14123) - -commit https://github.com/intel/llvm/commit/0a1381d286f7c32a256a6dab49917870769f1238 - [SYCL][Graph] Add wording about arbitrary C++ code in CGFs (#13699) -commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a9267a8d - [SYCL][Graph] Clarify graph in-order and out-of-order properties (#13681) - -commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 - 
[SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) - -commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 - [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) - Is there any particular user-visible bug associated with this? - -commit https://github.com/intel/llvm/commit/65bdffb1c9d4c474316d3e330fc3c59338e004f6 - [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) - new feature? - -commit https://github.com/intel/llvm/commit/05644a470303c2af3385b9533b8d23ebdea99eb7 - [OpenCL] Config dependent-load flag to exclude CWD from DLL search path (#13327) - do we report security issues? - -commit https://github.com/intel/llvm/commit/e9befa2d10f6c23a66ac780df7a1ddda55279230 - [SYCL][DebugInfo] Switch to nonsemantic-shader-200 for non-FPGA HW on linux (#13107) - do we need to mention it? - -commit https://github.com/intel/llvm/commit/e17632f32fcc160add43742ccdaa6cc80cc1b0c0 - [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) -commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 - [Driver][SYCL][FPGA] Use bundled device libraries for FPGA targets (#13693) - -commit https://github.com/intel/llvm/commit/1665cc0dd57266d2677c625725d38973cce3e8d9 - [SYCL][Graph] Enable in-order cmd-list (#13088) - perf optimization - new feature? - -commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce2c78 - [Bindless][Exp] Add texture fetch functionality (#12447) - or a new feature? - -commit https://github.com/intel/llvm/commit/d4f2fe54047a1b415af2402a497f20e918094580 - [SYCL][Bindless][Exp] Remove const from non-reference and non-pointer type parameters (#14238) - -commit https://github.com/intel/llvm/commit/9800153d373eed9bb5d23acf965541ab0a99b316 - [MATRIX][DOC][E2E] Add note on sm version nvidia device issue. 
(#14178) - -commit https://github.com/intel/llvm/commit/2bac63f5ebd62b29c8fe916a89b8b42ae536d609 - [ESIMD] Infer address space of pointer that are passed through invoke_simd to ESIMD API to generate better code on BE (#14628) - -commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad - [DeviceSanitizer] Disable handling no return calls (#14652) - // bugfix? + `experimental::nd_range_barrier` to match supported on a device. + intel/llvm#12974 intel/llvm#13641 ## Bug Fixes @@ -533,6 +548,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `reqd_work_group_size` attributes attached to them using `-fsycl-device-code-split=none` would result in an exception being thrown at runtime about mismatching work-group size. intel/llvm#13523 +- Fixed a bug where compiling a kernel that is annotated with + `reqd_work_group_size` attribute that has less than 3 arguments for HIP + target caused compiler to crash. intel/llvm#13600 ### SYCL Library @@ -579,7 +597,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad on Windows. intel/llvm#13784 TODO: was it really a compilation issue? - Fixed an issue where compiler could emit SPIR-V instructions for reversing - bits in a variable which are not supported by device compilers. intel/llvm#13810 + bits in a variable which are not supported by device compilers. + intel/llvm#13810 intel/llvm#13044 - Fixed a bug where having a default-constructed `local_accessor` as an argument could lead to runtime errors reported about being unable to set kernel argument. The issue manifested itself on Windows and under `-O0` optimization @@ -649,6 +668,15 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug where `device::ext_oneapi_cl_profile` could return some extra symbols for Intel GPU devices, thus violating the format of returned value. 
intel/llvm#13584 +- Fixed a bug where use of `SYCL_EXTERNAL` functions defined within a static + library produced by `llvm-ar` would result in a runtime error from JIT + compilation about that `SYCL_EXTERNAL` function being unresolved. intel/llvm#14256 +- Fixed a bug where calling `sycl::make_device` using an interop object obtained + from a SYCL `device` object would result in a device that does not compare + equal to the original `device` object. See related issue + intel/llvm#6055. The bug is fixed for Level Zero backend, but it still may + exhibit itself on OpenCL backend when sub-devices are involved. + intel/llvm#13483 ### Documentation @@ -661,7 +689,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Corrected installation steps for CPU/FPGA low-level runtimes in [Get Started Guide](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/GetStartedGuide.md). intel/llvm#14204 - ### SYCLcompat - Fixed compilation issue on Windows with `syclcompat::cabs`. intel/llvm#13518 @@ -669,58 +696,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad parameter. intel/llvm#13821 - Fixed compilation issues when `SYCL_COMPAT_PROFILING_ENABLED` is defined.
intel/llvm#14574 -commit https://github.com/intel/llvm/commit/e40283b1234e0846d1a19be537948e865a31f360 - Task sequence revert (#14359) - This reverts PR #12453 and #13080 - not sure which section this should go into - -commit https://github.com/intel/llvm/commit/a14c0917ad741a3a27b50040e4589b56262462bc - [SYCL][Bindless] Update spirv read/fetch from sampled image and sampled image array (#14493) - -commit https://github.com/intel/llvm/commit/c1ee064428a2d4038021dc3284a4c2f3aa897cb8 - [SYCL][Bindless] Fix OpaqueFD/Win32Handle's scope in piextImportExternalMemory/Semaphore (#14266) - -commit https://github.com/intel/llvm/commit/9ec73a21782de1d11d08e97d63a27fa8b208c1e5 - [SYCL] Add work_group_num_dim metadata (#13600) - Fixes reqd_work_group_size for HIP - -commit https://github.com/intel/llvm/commit/14ee7e1cca79cac97ecc41ddc15d5d724011c89a - [SYCL][Bindless][Exp] Remove unneeded function argument causing memory leak in image create functions (#13364) - -commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b0996b6 - Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) - revert commit seems to be a part of a previous release - -commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728e66b - [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) - ??? - -commit https://github.com/intel/llvm/commit/0fde69dbfa18e0c9b477a916477297a832e194a3 - [SYCL] Do not enable SPV_KHR_bit_instructions until downstream tools are ready (#13044) - Perhaps it can be fully omitted, because it may have been "reverted" later - -commit https://github.com/intel/llvm/commit/f170c63ed329c1fa5271d67e68144ec5d7808079 - [SYCL] Fix kernel shortcut path for inorder queue (#13333) - could be related to a commit made post-March release, i.e.
it can probably be squashed with some other line - -commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 - [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) - was the bug really user-visible? - -commit https://github.com/intel/llvm/commit/3c7f99d891cdd7c929b38b18bf6877c3c8dba163 - [SYCL][Graph] Fix potential issue with command buffer commands (#13224) - Was it a user-visible issue? Why no test? - -commit https://github.com/intel/llvm/commit/6b2fb665e9aa0bb7f2e034a22a153b7006c19d8a - [SYCL] Fix Level-Zero's `sycl::make_device` interop (#13483) - -commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6b197 - [SYCL][Graph] Fix minor issues in graph update code (#13660) - bugfix for #13011? - -commit https://github.com/intel/llvm/commit/1b5c5a8e96502b196c91251fa6513a6ede1257f5 - [SYCL] Fix SYCL_EXTERNAL device code when linking with a static lib (#14256) - ## API/ABI Breaking Changes This release is an *ABI* breaking release, meaning that any applications which @@ -740,6 +715,9 @@ to be launched using newer versions of SYCL runtime library. in applications which were built with pre-C++11 ABI. intel/llvm#13183 intel/llvm#13549 intel/llvm#13560 intel/llvm#13212 intel/llvm#13213 intel/llvm#13447 +- Changed `ext_oneapi_copy` API from experimental + `sycl_ext_oneapi_bindless_images` extension to accept `const`-qualified + types for `Src` parameter. intel/llvm#14140 Several API breaking changes were made as well, mostly completely dropping support for previously deprecated APIs and in some cases switching implementations @@ -762,6 +740,12 @@ of some classes to use so-called preview implementation. - Removed deprecated APIs from [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation.
intel/llvm#14555 +- Renamed experimental `destroy_external_semaphore` API from + `sycl_ext_oneapi_bindless_images` extension into `release_external_semaphore`. + intel/llvm#14535 +- Replaced `image_channel_order` field of `image_descriptor` struct with + number of channels in experimental `sycl_ext_oneapi_bindless_images` + extension. intel/llvm#13745 - Renamed SYCLcompat function `async_free` to `enqueue_free`. intel/llvm#14015 - Enforced restrictions on first argument of lambdas/functors passed to `parallel_for(range)` and `parallel_for(nd_range)`. intel/llvm#13198 @@ -821,6 +805,12 @@ Breaking changes were also made to compiler flags: - [new] Use of some SYCL math built-ins (like `abs` or `clz`) in a program where `sycl/ext/intel/esimd.hpp` header is included causes compilation errors. This will be fixed in the next release (intel/llvm#14793) +- [new] When using `sycl_ext_oneapi_matrix` extension it is important for some + devices to use the sm version (Compute Capability) corresponding to the device + that will run the program, i.e. use `-fsycl-targets=nvidia_gpu_sm_xx` during + compilation. This particularly affects matrix operations using `half` data + type. 
For more information on this issue consult with + https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#wmma-restrictions commit https://github.com/intel/llvm/commit/33c0829f3e3389006662845784980b930faf3b38 Author: Igor Chorążewicz From 2f108abe89acd4adc9bfa3c37901d9964cb97fc0 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 12 Sep 2024 07:35:28 -0700 Subject: [PATCH 12/30] Further updates --- sycl/ReleaseNotes.md | 192 ++++++++++++++----------------------------- 1 file changed, 61 insertions(+), 131 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 448d87158819..e95d0e236b2b 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -11,147 +11,26 @@ commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) bugfix? -commit https://github.com/intel/llvm/commit/0eeae2ac96ea179099dd5d57c241260ccfe65f73 - [SYCL][Graph] Update design doc for copy optimization and add test (#13051) - -commit https://github.com/intel/llvm/commit/4acca904c0e07fd6b504f7938f539bc1a0e94ce0 - [CLC][AMDGPU] Refactor fence helper to process order semantic explicitly (#12872) - ??? - -commit https://github.com/intel/llvm/commit/0dcad16c36f27e6254e7b831faaad8c6e07f8cfb - [SYCL][Bindless] Update spirv fetch-sampled and fetch/write-array (#13946) - bugfix?? - -commit https://github.com/intel/llvm/commit/83db85f1964338d9ce67bb536f8e6c5eebe8893b - [SYCL][Bindless] Update and add support for SPV_INTEL_bindless_image extension new revision (#13753) - Do we claim support for bindless images on Intel GPUs? 
Because if not, then this should be omitted - -commit https://github.com/intel/llvm/commit/82aaf27f6f0cf97ba89b58f88a18b09e23097afc - [SYCL][Driver] Refactor device config parsing to better match HIP and CUDA targets (#13617) - -commit https://github.com/intel/llvm/commit/b11a19b1896cc2f7ab43735aacf265182e22832c - [Bindless][SYCL][Doc] Add HintT tparam to cubemap fetch and sample (#13742) - -commit https://github.com/intel/llvm/commit/c65bed1073460fb8d6dbb319f5e7ff2c9c7c9422 - [SYCL][Graph] Update begin_recording and end_recording (#13480) - -commit https://github.com/intel/llvm/commit/d6dfd0c77b2212f4e3e926d2e289bd3dc6e18b49 - [SYCL][Graph][DOC] add an edge case for record&replay mode (#12916) - -commit https://github.com/intel/llvm/commit/03233e57e5585813ec2c0dbc7a10ceb4a6d15a71 - [SYCL] Add missed intel math functions in sycl_ext_intel_math header (#13762) - Is it user-visible? - -commit https://github.com/intel/llvm/commit/6cb77fcfb37ffb445ab62ea1545422dc52128da1 - [SYCL] Add -fPIC for Intel math function host code (#13800) - -commit https://github.com/intel/llvm/commit/84bae21d3f63f04ca50bfffc5203909ba3fd95a6 - Implement missing overloads for generic AS in generic target (#13938) - commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 [SYCL][CUDA] Enable support of msvc math functions for nvptx target. 
(#14007) -commit https://github.com/intel/llvm/commit/9f1cee573782772f8d062f6490128c3ee6fa6911 - [SYCL][CUDA] Improve kernel launch error handling for out-of-registers (#12604) - -commit https://github.com/intel/llvm/commit/a35f862445b5666c63469cda2656b0a9946df25c - [SYCL][Graph] fix the address pointer in graph print (#13595) - -commit https://github.com/intel/llvm/commit/c1b17e00f9b5c51db1f8385435d7a591224b01e0 - [SYCL] Enable CET for wqlibsycl-devicelib-host.a (#14135) - -commit https://github.com/intel/llvm/commit/fe1859085b621ea901cd8da81659923122417688 - [SYCL][NVPTX] Emit reqd_work_group_size attributes as NVVM annotations (#14502) - some kind of bugfix? - commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) -commit https://github.com/intel/llvm/commit/0b9fc099f63feadb5e476c5862de3d8fa977a655 - [SYCL][Graph] Test WGU kernel mismatch (#14379) - -commit https://github.com/intel/llvm/commit/93fef86cd4fb8e18c126365c404eea1ed0f1a7fa - [SYCL][Graph] Permit empty & barrier nodes in WGU (#14236) - -commit https://github.com/intel/llvm/commit/4151c799ef36f2912fab3f6b9e305240ef4ff327 - [SYCL][Graph] Wait instead of flush dep events in update command (#14167) - -commit https://github.com/intel/llvm/commit/f2cd2a80e7277fc62d8802673ce6ab2fac6fcbd0 - [SYCL] Disable in-order queue barrier optimization while profiling (#14123) - -commit https://github.com/intel/llvm/commit/0a1381d286f7c32a256a6dab49917870769f1238 - [SYCL][Graph] Add wording about arbitrary C++ code in CGFs (#13699) -commit https://github.com/intel/llvm/commit/ece19f298b1029121da17a423b801bc2a9267a8d - [SYCL][Graph] Clarify graph in-order and out-of-order properties (#13681) - -commit https://github.com/intel/llvm/commit/7d55eb8a8419dac64f065bbf84125ed1d78dc992 - [SYCL][Docs] Behavioral changes to in-order queue events extension (#13624) - commit 
https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) Is there any particular user-visible bug associated with this? -commit https://github.com/intel/llvm/commit/65bdffb1c9d4c474316d3e330fc3c59338e004f6 - [SYCL][libclc][NATIVECPU] Implement generic atomic load for generic target (#13249) - new feature? - -commit https://github.com/intel/llvm/commit/05644a470303c2af3385b9533b8d23ebdea99eb7 - [OpenCL] Config dependent-load flag to exclude CWD from DLL search path (#13327) - do we report security issues? - -commit https://github.com/intel/llvm/commit/e17632f32fcc160add43742ccdaa6cc80cc1b0c0 - [Driver][SYCL] Use LLVM-IR based device libraries for device linking (#13604) -commit https://github.com/intel/llvm/commit/67d8ea1cdaef29afd75f7f085f0b6c6d73af81a3 - [Driver][SYCL][FPGA] Use bundled device libraries for FPGA targets (#13693) - -commit https://github.com/intel/llvm/commit/1665cc0dd57266d2677c625725d38973cce3e8d9 - [SYCL][Graph] Enable in-order cmd-list (#13088) - perf optimization - new feature? - -commit https://github.com/intel/llvm/commit/d13fdbe4ee02c39b1939bae7da61392e75ce2c78 - [Bindless][Exp] Add texture fetch functionality (#12447) - or a new feature? - -commit https://github.com/intel/llvm/commit/d4f2fe54047a1b415af2402a497f20e918094580 - [SYCL][Bindless][Exp] Remove const from non-reference and non-pointer type parameters (#14238) - -commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad - [DeviceSanitizer] Disable handling no return calls (#14652) - // bugfix? 
- -commit https://github.com/intel/llvm/commit/a14c0917ad741a3a27b50040e4589b56262462bc - [SYCL][Bindless] Update spirv read/fetch from sampled image and sampled image array (#14493) - -commit https://github.com/intel/llvm/commit/c1ee064428a2d4038021dc3284a4c2f3aa897cb8 - [SYCL][Bindless] Fix OpaqueFD/Win32Handle's scope in piextImportExternalMemory/Semaphore (#14266) - -commit https://github.com/intel/llvm/commit/14ee7e1cca79cac97ecc41ddc15d5d724011c89a - [SYCL][Bindless][Exp] Remove unneeded function argument causing memory leak in image create functions (#13364) - commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b0996b6 Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) revert commit seems to be a part of a previous release -commit https://github.com/intel/llvm/commit/5794326b965071a69273a1f653405670b728e66b - [SYCL][NATIVECPU][DRIVER] Select remangled libclc variant for Native CPU (#13765) - ??? - -commit https://github.com/intel/llvm/commit/f170c63ed329c1fa5271d67e68144ec5d7808079 - [SYCL] Fix kernel shortcut path for inorder queue (#13333) - could be related to a commit made post-March release, i.e. it can probably be squashed with some other line - commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) was the bug really user-visible? - -commit https://github.com/intel/llvm/commit/3c7f99d891cdd7c929b38b18bf6877c3c8dba163 - [SYCL][Graph] Fix potential issue with command buffer commands (#13224) - Was it a user-visible issue? Why no test? - -commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6b197 - [SYCL][Graph] Fix minor issues in graph update code (#13660) - bugfix for #13011?
+commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad + [DeviceSanitizer] Disable handling no return calls (#14652) + // bugfix? +UR commit below @@ -315,9 +194,6 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 compiler. intel/llvm#13651 - Implemented support for `memory_order::seq_cst` on CUDA backend, resolving intel/llvm#11208. intel/llvm#12516 -- Fixed a bug where `shift_group_[right|left]`, `permute_by_xor` and - `select_from_group` algorithms would return invalid values if used with - `half` data type on AMD devices. intel/llvm#13016 - Implementation of optional kernel features mechanism has been extended to also support AOT compilation if so-called "special" targets are passed to `-fsycl-targets` (see corresponding @@ -326,7 +202,7 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 targets support which optional kernel features and that database is not yet fully complete. In particular, data for Lunar Lake and Battlemage Intel GPUs is still missing. intel/llvm#14590 intel/llvm#14188 intel/llvm#12727 - intel/llvm#14757 intel/llvm#13486 intel/llvm#13974 + intel/llvm#14757 intel/llvm#13486 intel/llvm#13974 intel/llvm#13617 - Enhanced compiler to annotate SYCL kernel arguments passed by value with `__grid_constant__` for CUDA backend. intel/llvm#14322 - Added initial support for sub-groups on Native CPU backend. intel/llvm#13979 @@ -336,9 +212,20 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 backends are able to generate better code. intel/llvm#14628 - Improved math built-ins support on Native CPU backend: added support for bf16 and pointers in generic address space. intel/llvm#13109 intel/llvm#13829 - intel/llvm#13911 intel/llvm#13428 intel/llvm#13478 + intel/llvm#13911 intel/llvm#13428 intel/llvm#13478 intel/llvm#13249 + intel/llvm#13765 - Improved debugging experience on Linux (CPU & GPU) and Windows (CPU AOT only).
- intel/llvm#13107 + intel/llvm#13107 intel/llvm#13938 +- Optimized device code linking process by providing device libraries in LLVM + IR format instead of fat object files which allowed us to skip unbundling + step. Note: FPGA path still uses fat object files. + intel/llvm#13604 intel/llvm#13693 +- Added missing lowering of `reqd_work_group_size` attribute for CUDA devices + so that device compiler can now see the attribute and use it during + compilation. intel/llvm#14502 +- Strengthened security-related compilation flags used to build libraries and + tools which are part of the intel/llvm SYCL implementation. intel/llvm#13327 + intel/llvm#14135 intel/llvm#13800 ### SYCL Library @@ -346,7 +233,8 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation to support cubemap images. intel/llvm#12996 - Added ESIMD API for dynamic allocation of named barriers. intel/llvm#13826 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation to support sampled image arrays. intel/llvm#14237 -- Added implementation for whole graph update (`executable_command_graph::update`). intel/llvm#13220 +- Added implementation for whole graph update (`executable_command_graph::update`). + intel/llvm#13220 intel/llvm#14379 intel/llvm#14236 - Added a warning about use of the deprecated `` header. intel/llvm#13569 - Made `local_accessor::get_pointer` and `local_accessor::get_multi_ptr` throw `invalid` exception if they are called on host. 
intel/llvm#13747 @@ -445,6 +333,24 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 intel/llvm#12449 - Improved performance of `queue::fill` on CUDA backend by making it use 2- and 4-byte operations instead of only using 1-byte operations. intel/llvm#13788 +- Enhanced `sycl_ext_oneapi_graph` extension implementation on Level Zero + backend by taking advantage of copy engines available on some devices to be + able to execute kernels and memory operations in parallel. intel/llvm#13051 +- Expanded the list of atomic memory fence scopes supported by HIP + devices. intel/llvm#12872 +- Added SYCL wrappers for more Intel Math Functions. SYCL wrappers are in + `sycl::ext::intel::math::` namespace and provide nicer names that do not + start with `__`. intel/llvm#13762 +- Added initial support for bindless images on Intel GPUs (only DG2 and MTL). + The supported functionality is very limited and it only works if + certain environment variables are set. Therefore, it is not yet ready for + wide adoption, but can be used for early experimentation. intel/llvm#13946 + intel/llvm#13753 intel/llvm#14493 intel/llvm#14266 +- Introduced an optimization to graphs where linear graphs require less + synchronization. This optimization is backend-specific and is only + performed in a very limited number of cases. intel/llvm#13088 +- Implemented new `fetch_image` overload which accepts sampled image and + coordinates. intel/llvm#12447 ### Documentation @@ -505,12 +411,22 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 nodes within a graph. intel/llvm#12486 - clarified `enable_profiling` property in `command_graph::finalize` intel/llvm#14067 + - clarified how graph edges are recorded in a graph with in-order queues. + intel/llvm#12916 + - clarified how code within command-group functions is handled. + intel/llvm#13699 + - clarified interaction between queue properties and graphs.
intel/llvm#13681 + - added `enable_profiling` property. intel/llvm#13088 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension: - added support external semaphores that can take value. intel/llvm#13860 - added support for copying `image_mem_handle` between devices via `ext_oneapi_copy`. intel/llvm#12449 + - added `fetch_image` overload which accepts sampled image and coordinates. + intel/llvm#12447 + - added type hint template argument to `fetch_cubemap` and `sample_cubemap` + APIs. intel/llvm#13742 ### SYCLcompat @@ -551,6 +467,9 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 - Fixed a bug where compiling a kernel that is annotated with `reqd_work_group_size` attribute that has less than 3 arguments for HIP target caused compiler to crash. intel/llvm#13600 +- Fixed a bug with `shift_group_[right|left]`, `permute_by_xor` and + `select_from_group` algorithms would return invalid values if used with + `half` data type on AMD devices. intel/llvm#13016 ### SYCL Library @@ -677,6 +596,14 @@ commit https://github.com/intel/llvm/commit/64cb0cf96de28bfd495e577b4dd46c26dbb6 intel/llvm#6055. The bug is fixed for Level Zero backend, but it still may exibit itself on OpenCL backend when sub-devices are involved. intel/llvm#13483 +- Partially fixed profiling information provided by events from in-order queues + that have barries submitted to them. There are still issues on Level Zero + backend due to certain optimizations that it performs for barriers in + in-order queues. intel/llvm#14123 +- Fixed a memory leak in bindless images extension implementation. + intel/llvm#13364 +- Fixed performance regression when kernels without any dependencies are + submitted into in-order queue. intel/llvm#13333 ### Documentation @@ -778,6 +705,9 @@ of some classes to use so-called preview implementation. 
- Removed ESIMD `atomic_op::predec`. intel/llvm#14480 - Dropped interfaces from revision 1 of experimental `sycl_ext_oneapi_group_sort` extension. intel/llvm14531 +- Changed return type of `command_graph::begin_recording` and + `command_graph::end_recording` from `void` to `bool` in the experimental + `sycl_ext_oneapi_graph` extension. intel/llvm#13480 Breaking changes were also made to compiler flags: From 0f2ae1e6ad68a13389e93c3c5a8a4f86b5ea5ee9 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 12 Sep 2024 07:41:29 -0700 Subject: [PATCH 13/30] Fix typos --- sycl/ReleaseNotes.md | 57 ++++++++++++++++++++++---------------------- 1 file changed, 29 insertions(+), 28 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index e95d0e236b2b..ee8dc4acf5fb 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -49,12 +49,12 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad all dimensions. intel/llvm#12690 - Added support for the so-called [new offloading model](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/OffloadDesign.md). It can be enabled by `--offload-new-driver` command line option and provides - a better infrastucture for us. In the future we expect to leverage that + a better infrastructure for us. In the future we expect to leverage that infrastructure to improve link times by reducing amount of I/O used by the compiler and amount of external processes that it spawns. - Foundation for this work has been performed in previous release timeframe and + Foundation for this work has been performed in previous release time frame and the following list of PRs only includes those done within this release - timeframe. intel/llvm#14252 intel/llvm#13394 intel/llvm#13648 intel/llvm#13687 + time frame. 
intel/llvm#14252 intel/llvm#13394 intel/llvm#13648 intel/llvm#13687 intel/llvm#14001 intel/llvm#14006 intel/llvm#14101 intel/llvm#14151 intel/llvm#14253 intel/llvm#14177 intel/llvm#13672 intel/llvm#13688 intel/llvm#13579 intel/llvm#13869 intel/llvm#14102 intel/llvm#14541 @@ -135,7 +135,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added `match_[any|all]_over_sub_group` APIs. intel/llvm#12973 - Added API to manage kernel libraries loading/unloading. intel/llvm#13053 intel/llvm#13932 - Added `cmul_add` API. intel/llvm#12969 -- Added experimental APIs for maksed operations over sub-groups (`select`, `shift`, etc.). intel/llvm#12972 +- Added experimental APIs for masked operations over sub-groups (`select`, + `shift`, etc.). intel/llvm#12972 - Added various helper APIs: a mechanism to extract arguments from a kernel and its kernel parameters; type casting helper for generic address -> queue pointer; a wrapper to provide better support for logical groups; an enum to @@ -164,7 +165,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad ### SYCL Compiler -- Improved compilation flow around intergation footer when no 3rd-party host +- Improved compilation flow around integration footer when no 3rd-party host compiler is used. New compilation flow creates less temporary files and therefore should result in a slightly faster compilation. intel/llvm#13607 intel/llvm#14402 - Added support for `truncf`, `sinpif`, `rsqrtf`, `exp10f`, `ceilf`, @@ -180,16 +181,16 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad are passed into `-fsycl-targets`. intel/llvm#13078 - Reduced list of commands invoked to generate dependencies using `-MD` flag by one command. intel/llvm#13217 -- Enchanced diagnostic emitted if CUDA target triple passed to `-fsycl-targets` +- Enhanced diagnostic emitted if CUDA target triple passed to `-fsycl-targets` is incorrect. 
intel/llvm#14673 - Reduced size of shadow memory used by address sanitizer to avoid running out of memory in multi-GPU environments. intel/llvm#13857 -- Enchanced address sanitizer to be able to detect out-of-bounds access to +- Enhanced address sanitizer to be able to detect out-of-bounds access to local accessors. intel/llvm#13503 -- Enchanced address sanitizer to detect incorrect uses of USM deallocation +- Enhanced address sanitizer to detect incorrect uses of USM deallocation functions (like calling `sycl::free` on a pointer that was not allocated as a USM pointer). intel/llvm#12882 -- Enchanced `-fintelfpga` flag: when it is used together with `-fp-module=fast` +- Enhanced `-fintelfpga` flag: when it is used together with `-fp-module=fast` it also implies that `-vpfp-relaxed` will be passed to backend (device) compiler. intel/llvm#13651 - Implemented support for `memory_order::seq_cst` on CUDA backend, resolving @@ -199,7 +200,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `-fsycl-targets` (see corresponding [documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/UsersManual.md#generic-options)). Please note that this functionality relies on the compile knowing which - targets support which optional kernel features and that databaseis not yet + targets support which optional kernel features and that database is not yet fully complete. In particular, data for Lunar Lake and Battlemage Intel GPUs is still missing. intel/llvm#14590 intel/llvm#14188 intel/lvm#12727 intel/llvm#14757 intel/llvm#13486 intel/llvm#13974 intel/llvm#13617 @@ -282,7 +283,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad intel/llvm#13545 - moved `rdtsc` ESIMD function out of `experimental` namespace. intel/llvm#13417 - Added check for template argument `N` of `media_block_load` ESIMD API. 
intel/llvm#13668 -- Enchanced deprecation message for `sub_group::barrier` to indicate which API +- Enhanced deprecation message for `sub_group::barrier` to indicate which API should be used instead. intel/llvm#13276 - Added deprecation messages for `image_max_array_size` and `opencl_c_version` device info queries. intel/llvm#13279 @@ -299,7 +300,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad emitted for non-virtual calls of virtual functions. See also KhronosGroup/SYCL-Docs#565. intel/llvm#114051 intel/llvm#14141 - ESIMD API `inv` was extended to support `double` arguments. intel/llvm#13838 -- Enchanced validation (via `static_assert` mechanism) of template arguments of +- Enhanced validation (via `static_assert` mechanism) of template arguments of ESIMD `rdregion` and `wrregion` APIs. intel/llvm#13158 - Aligned mutating swizzle operators with the SYCL 2020 specification by making it a `friend` instead of member function. intel/llvm#13012 @@ -321,7 +322,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad Level Zero, etc. It is now completely replaced by [Unified Runtime](https://github.com/oneapi-src/unified-runtime/) and this removal should reduce amount and size of redistributable libraries. -- Enchanced ESIMD `slm_atomic_update` API to also support `fsub` and `fadd` +- Enhanced ESIMD `slm_atomic_update` API to also support `fsub` and `fadd` operations. intel/llvm#13535 - Lifted some of the restrictions from ESIMD `block_store` API. intel/llvm#13150 - Improved implementation of `group_store` and `group_load` APIs with intent for @@ -333,7 +334,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad intel/llvm#12449 - Improved performance of `queue::fill` on CUDA backend by making it use 2- and 4-byte operations instead of only using 1-byte operations. 
intel/llvm#13788 -- Enchanced `sycl_ext_oneapi_graph` extension implementation on Level Zero +- Enhanced `sycl_ext_oneapi_graph` extension implementation on Level Zero backend by taking advantage of copy engines available on some devices to be able to execute kernels and memory operations in parallel. intel/llvm#13051 - Expanded list of supported atomic memory fence scopes supported by HIP @@ -360,10 +361,10 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support sampled image arrays. intel/llvm#14237 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension to support support default-construction of `image_descriptor`. intel/llvm#13781 - Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) - to list restictions for `atomic_update`, `gather` and `scatter` functions. + to list restrictions for `atomic_update`, `gather` and `scatter` functions. intel/llvm#13202 intel/llvm#13196 - Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) - to documentnew overloads of `load_2d`, `store_2d` or `prefetch_2d` APIs that + to document new overloads of `load_2d`, `store_2d` or `prefetch_2d` APIs that accept compile-time properties. 
intel/llvm#13218 - Updated [ESIMD functions documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_esimd/sycl_ext_intel_esimd_functions.md) to document `fence` API. intel/llvm#13135 @@ -384,9 +385,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Clarified [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) extension to specify which exact exceptions are being thrown on errors. intel/llvm#14576 - Introduced versioning and release process - [documenation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/syclcompat/README.md#versioning) + [documentation](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/syclcompat/README.md#versioning) for SYCLcompat. intel/llvm#14457 -- Exended our +- Extended our [contribution guidelines](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/developer/ContributeToDPCPP.md#unified-runtime-updates) to document update process for Unified Runtime component. - Updated @@ -432,7 +433,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added non-`const` `image2d_max` and `image3d_max` getters. intel/llvm#14138 - Introduced versioning scheme for the library. intel/llvm#14457 -- Enchanced masked shuffle functions `select_from_sub_group`, +- Enhanced masked shuffle functions `select_from_sub_group`, `shift_sub_group_left`, `shift_sub_group_right` and `permute_sub_group_by_xor` to support CUDA devices. 
intel/llvm#13363 - Restricted `memory_order` argument of `atomic_ref` passed to @@ -557,12 +558,12 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug where querying free device memory of integrated Intel GPUs would return 0 instead of throwing an exception that the feature is not supported for that device. intel/llvm#13209 -- Fixed a heap buffer overwlow in `sycl_ext_oneapi_kernel_compiler_opencl` +- Fixed a heap buffer overflow in `sycl_ext_oneapi_kernel_compiler_opencl` extension implementation. intel/llvm#13214 intel/llvm#13448 - Fixed intel/llvm#12473 about `sycl_ext_oneapi_graph` extension implementation ignoring access mode of accessors and thus creating unnecessary edges in a graph. intel/llvm#13011 -- Fixed a bug where a command submission time would not be alwasys recorded in +- Fixed a bug where a command submission time would not be always recorded in profiling info when using `sycl_ext_oneapi_graph` extension. intel/llvm#14678 - Fixed a bug with graph recording where submitting a barrier using the same queue for two different graphs would result in runtime error "Graph nodes @@ -594,7 +595,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad from a SYCL `device` object would result in a device that is not equally comparable with the original `device` object. See related issue intel/llvm#6055. The bug is fixed for Level Zero backend, but it still may - exibit itself on OpenCL backend when sub-devices are involved. + exhibit itself on OpenCL backend when sub-devices are involved. intel/llvm#13483 - Partially fixed profiling information provided by events from in-order queues that have barries submitted to them. There are still issues on Level Zero @@ -646,9 +647,9 @@ to be launched using newer versions of SYCL runtime library. `sycl_ext_oneapi_bindless_images` extension to accept `const`-qualified types for `Src` parameter. 
intel/llvm#14140 -Several API breaking changes were made as well, mostly compltely dropping -support for previosly deprecated APIs and in some cases switching implmentations -of some classes to use so-called preview implementation. +Several API breaking changes were made as well, mostly completely dropping +support for previously deprecated APIs and in some cases switching +implementations of some classes to use so-called preview implementation. - Removed `sycl::abs` overload taking floating-point argument. intel/llvm#13286 - Removed `sycl::host_ptr` and `sycl::device_ptr`. intel/llvm#13240 @@ -676,8 +677,8 @@ of some classes to use so-called preview implementation. - Renamed SYCLcompat function `async_free` to `enqueue_free`. intel/llvm#14015 - Enforced restrictions on first argument of lambdas/functors passed to `parallel_for(range)` and `parallel_for(nd_range)`. intel/llvm#13198 -- Switched `sycl::vec` implemetation to use its preview version. New version: - - uses differnt storage type under the hood which should fix several strict +- Switched `sycl::vec` implementation to use its preview version. New version: + - uses different storage type under the hood which should fix several strict aliasing rules violations that we had in the implementation. intel/llvm#14317 intel/llvm#13182 intel/llvm#14130 - restricts math operations available to `vec` to those which @@ -687,7 +688,7 @@ of some classes to use so-called preview implementation. - Switched math built-ins implementation to use their preview version. intel/llvm#13152 - Switched `bfloat16` implementation to use its preview version. intel/llvm#13233 - Switched `sycl::nd_item` implementation to use its preview version. intel/llvm#13197 -- Enforced restriction that `buffer`'s elemennt type must be device copyable. intel/llvm#13200 +- Enforced restriction that `buffer`'s element type must be device copyable. intel/llvm#13200 - Restructured SYCL headers so that `` and `` are not included in there anymore. 
intel/llvm#11528 - Dropped support for `SYCL_DEVICE_FILTER` environment variable. intel/llvm#13192 From 1fe2c554292d3a9d90f952c02ff4b8e3dfaab706 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:09:49 -0700 Subject: [PATCH 14/30] Apply feedback from @frasercrmck --- sycl/ReleaseNotes.md | 51 ++++++++++++++++++++++---------------------- 1 file changed, 26 insertions(+), 25 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index ee8dc4acf5fb..100198ea8ab2 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -38,8 +38,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad ### SYCL Compiler -- Added `-fsycl-range-rounding` command line option which allows to - control range rounding feature. In comparison with previously available +- Added `-fsycl-range-rounding` command line option which allows control over + the range rounding feature. In comparison with the previously available `-fsycl-disable-range-rounding` command line option and `__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__` macro the new flag also allows to _force_ range rounding which will complete disable generation of @@ -50,10 +50,10 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added support for the so-called [new offloading model](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/OffloadDesign.md). It can be enabled by `--offload-new-driver` command line option and provides a better infrastructure for us. In the future we expect to leverage that - infrastructure to improve link times by reducing amount of I/O used by the - compiler and amount of external processes that it spawns. 
- Foundation for this work has been performed in previous release time frame and - the following list of PRs only includes those done within this release + infrastructure to improve link times by reducing the amount of I/O used by the + compiler and the amount of external processes that it spawns. + The foundation for this work has been performed in previous release time frame + and the following list of PRs only includes those done within this release time frame. intel/llvm#14252 intel/llvm#13394 intel/llvm#13648 intel/llvm#13687 intel/llvm#14001 intel/llvm#14006 intel/llvm#14101 intel/llvm#14151 intel/llvm#14253 intel/llvm#14177 intel/llvm#13672 intel/llvm#13688 @@ -62,7 +62,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad allow us to improve link time by reducing amount of external processes and temporary files used by the compiler. **Do we need to list PRs here?** There were many of them and some of them were merged in scope of a previous release. -- Added `-fsycl-fp64-conv-emu` command line option which allows to enable +- Added `-fsycl-fp64-conv-emu` command line option which allows the enabling of partial (only conversion operations are supported) emulation of `double` data type. This mode is only supported by Intel GPUs. intel/llvm#13912 - Introduced `__PTX_VERSION__` macro that corresponds to the PTX version used @@ -290,9 +290,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Updated [`sycl_ext_intel_device_info`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/supported/sycl_ext_intel_device_info.md) extension implementation to throw synchronious exception with `feature_not_supported` error code. intel/llvm#14788 -- Reduced startup overhead on `libsycl.so` loading by outlining SYCL JIT - compiler (used for kernel fusion feature) into a standalone library which is - dynamically loaded on the first use. 
intel/llvm#13433 +- Reduced startup overhead of `libsycl.so` by outlining SYCL JIT compiler (used + for kernel fusion feature) into a standalone library which is dynamically + loaded on the first use. intel/llvm#13433 - Deprecated `this_kernel::get_root_group` in favor of `this_work_item::get_root_group`. intel/llvm#13304 - Relaxed diagnostic about using virtual functions in SYCL kernels: now it is @@ -328,7 +328,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Improved implementation of `group_store` and `group_load` APIs with intent for it to have better performance in some cases. intel/llvm#13734 intel/llvm#13673 - Added support for graph update functionality. intel/llvm#12840 -- Added support for external semaphores that can take value to bindless images +- Added support for external semaphores that can take a value to bindless images extension. intel/llvm#13860 - Added support for device-to-device copying of `image_device_handle`. intel/llvm#12449 @@ -421,7 +421,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension: - - added support external semaphores that can take value. intel/llvm#13860 + - added support external semaphores that can take a value. intel/llvm#13860 - added support for copying `image_mem_handle` between devices via `ext_oneapi_copy`. intel/llvm#12449 - added `fetch_image` overload which accepts sampled image and coordinates. @@ -444,8 +444,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad ### SYCL Compiler -- Fixed that using `-fsycl-link-targets` flag would inadvertently trigger some - additional device code linking steps. 
intel/llvm#13004 +- Fixed a bug where using `-fsycl-link-targets` flag would inadvertently trigger + some additional device code linking steps. intel/llvm#13004 - Fixed a bug that when AOT-compiling for Intel GPUs the compiler would pass some PVC-specific flags even if target device is not a PVC. intel/llvm#13794 - Fixed a bug with incorrect file extensions being emitted in AOT compilation @@ -455,9 +455,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `-fsycl-link` would result in "number of output files and targets should match in unbundling mode" error emitted by the compiler during link step. intel/llvm#13002 - Fixed a bug in address sanitizer which may lead to crashes when an application - is launched on OpenCL CPU device. intel/llvm#13262 -- Fixed a bug where calling certain built-in math functions that accepts - pointers (like `fract`, `frexp`, `modf`, etc.) and passing pointer in the + is launched on the OpenCL CPU device. intel/llvm#13262 +- Fixed a bug where calling certain built-in math functions that accept + pointers (like `fract`, `frexp`, `modf`, etc.) and passing pointers in the generic address space there would not compile for AMD devices. intel/llvm#13015 intel/llvm#13092 intel/llvm#13361 intel/llvm#13792 intel/llvm#13546 @@ -475,8 +475,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad ### SYCL Library - Fixed a situation when querying - `sycl::ext::oneapi::experimental::info::device` could result in exception - being thrown instead of empty vector being returned. intel/llvm#13968 + `sycl::ext::oneapi::experimental::info::device` could result in an exception + being thrown instead of an empty vector being returned. intel/llvm#13968 - Fixed `esimd::atan` implementation under `-ffast-math` flag. intel/llvm#13186 - Fixed an issue that component devices were not considered to be a descendent from their composite devices when creating a queue. 
intel/llvm#13513 @@ -491,7 +491,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad would lead to compilation errors. intel/llvm#13632 - Fixed an issue that use of `atomic_ref` would not be detected as a use of `atomic64` aspect leading to errors due to speculative compilation. intel/llvm#14052 -- Fixed `ctanh` and `cexp` returning incorrect value in some edge cases. intel/llvm#14329 +- Fixed `ctanh` and `cexp` returning incorrect values in some edge cases. + intel/llvm#14329 - Fixed a bug where values passed to `-Xs` option through `build_options` property were not passed down to device compiler when using [`sycl_ext_oneapi_kernel_compiler`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) @@ -504,8 +505,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `-fno-sycl-unnamed-lambda` would lead to a compilation error about unnamed lambdas being unsupported. intel/lvm#14614 - Fixed an issue on CUDA & AMDGPU backends where `multi_ptr` relational - operators taking `std::nullptr_t` would produce different result comparing - to standard C++ helpers like `std::less`. intel/llvm#13201 + operators taking `std::nullptr_t` would produce different results to their + corresponding standard C++ helpers like `std::less`. intel/llvm#13201 - Fixed a compilation issue with `-fpreview-breaking-changes` flag when `windows.h` is included caued by conflict with `min`/`max` macro. intel/llvm#14260 - Fixed strict alias violations in `sycl::vec::operator[]` @@ -516,7 +517,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a compilation issue occurring when `printf` is used on CUDA backend on Windows. intel/llvm#13784 TODO: was it really a compilation issue? 
-- Fixed an issue where compiler could emit SPIR-V instructions for reversing +- Fixed an issue where the compiler could emit SPIR-V instructions for reversing bits in a variable which are not supported by device compilers. intel/llvm#13810 intel/llvm#13044 - Fixed a bug where having a default-constructed `local_accessor` as an argument @@ -569,8 +570,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad queue for two different graphs would result in runtime error "Graph nodes cannot depend on events from another graph". intel/llvm#14212 - Fixed intel/llvm#13066 where submitting a barrier into an empty in-order queue - whilst recording a graph results in runtime error "No event has been recorded - for the specified graph node". intel/llvm#13193 + whilst recording a graph would result in the runtime error "No event has + been recorded for the specified graph node". intel/llvm#13193 - Fixed a resource leak in graph update implementation. intel/llvm#14029 - Fixed a bug with invalid handling of discard filters within `ONEAPI_DEVICE_SELECTOR` env variable caused RT to mistakenly say that the From 4e126782cfa90333cccfa4519f488fa0db6c5a00 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:19:16 -0700 Subject: [PATCH 15/30] Address question about oneapi_group_load_store --- sycl/ReleaseNotes.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 100198ea8ab2..128bc908cd84 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -89,10 +89,11 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad extension. intel/llvm#13512 intel/llvm#13924 intel/llvm#14743 - Added support for `get_backend_info` API into various SYCL classes (`platform`, `context`, etc.).
intel/llvm#12906 - Implemented [`sycl_ext_oneapi_group_load_store`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_group_load_store.asciidoc). - Please note that the implementation is naive and does not expose any special - HW capabilities, it won't provide any performance benefit over how a group - load/store could be done without this extension using simple `for` loop and - group barriers. intel/llvm#13043 + Please note that the implementation exposes native block read/write HW + capabilities only if the operation can be directly mapped to a single block + operation. In other cases, it uses a naive implementation in form of a simple + 'for' loop and group barriers. intel/llvm#13043 intel/llvm#13734 + intel/llvm#13673 - Implemented [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 - Added initial support for [`sycl_ext_oneapi_free_function_kernels`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc) extension. intel/llvm#13207 intel/llvm#13885 Known limitations: @@ -325,8 +326,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Enhanced ESIMD `slm_atomic_update` API to also support `fsub` and `fadd` operations. intel/llvm#13535 - Lifted some of the restrictions from ESIMD `block_store` API. intel/llvm#13150 -- Improved implementation of `group_store` and `group_load` APIs with intent for - it to have better performance in some cases. intel/llvm#13734 intel/llvm#13673 - Added support for graph update functionality. intel/llvm#12840 - Added support for external semaphores that can take a value to bindless images extension. 
intel/llvm#13860 From 08997d4314baab9d10f3b3bdc68b394ae49a40d6 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:22:37 -0700 Subject: [PATCH 16/30] Apply feedback from @przemektmalon --- sycl/ReleaseNotes.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 128bc908cd84..20127fe6e1ac 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -327,8 +327,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad operations. intel/llvm#13535 - Lifted some of the restrictions from ESIMD `block_store` API. intel/llvm#13150 - Added support for graph update functionality. intel/llvm#12840 -- Added support for external semaphores that can take a value to bindless images - extension. intel/llvm#13860 +- Added support for external semaphore wait and signal operations that can take + a value to bindless images extension. intel/llvm#13860 - Added support for device-to-device copying of `image_device_handle`. intel/llvm#12449 - Improved performance of `queue::fill` on CUDA backend by making it use 2- and @@ -420,7 +420,11 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension: - - added support external semaphores that can take a value. intel/llvm#13860 + - added support for external semaphore wait and signal operations that can + take a value. intel/llvm#13860 + - added support for importing externally allocated buffers and semaphores + through Win32 NT handles, as well as DirectX 12 resources and fences. + intel/llvm#13860 - added support for copying `image_mem_handle` between devices via `ext_oneapi_copy`. intel/llvm#12449 - added `fetch_image` overload which accepts sampled image and coordinates. 
From 4dfea21f74ea84cd81abd20b140f02bb85bda622 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:25:11 -0700 Subject: [PATCH 17/30] Apply feedback from @mdtoguchi --- sycl/ReleaseNotes.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 20127fe6e1ac..e9b60cdda4c7 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -42,16 +42,16 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad the range rounding feature. In comparison with the previously available `-fsycl-disable-range-rounding` command line option and `__SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING__` macro the new flag also allows - to _force_ range rounding which will complete disable generation of + to _force_ range rounding which will completely disable the generation of non-rounded kernels, thus improving binary size. intel/llvm#12715 - Added `-fsycl-exp-range-rounding` command line option that enables experimental range rounding mode in which range rounding is performed across all dimensions. intel/llvm#12690 - Added support for the so-called [new offloading model](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/OffloadDesign.md). - It can be enabled by `--offload-new-driver` command line option and provides - a better infrastructure for us. In the future we expect to leverage that - infrastructure to improve link times by reducing the amount of I/O used by the - compiler and the amount of external processes that it spawns. + It can be enabled by the `--offload-new-driver` command line option and + provides an improved infrastructure for us. In the future we expect to + leverage that infrastructure to improve link times by reducing the amount of + I/O used by the compiler and the amount of external processes that it spawns. 
The foundation for this work has been performed in previous release time frame and the following list of PRs only includes those done within this release time frame. intel/llvm#14252 intel/llvm#13394 intel/llvm#13648 intel/llvm#13687 @@ -191,9 +191,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Enhanced address sanitizer to detect incorrect uses of USM deallocation functions (like calling `sycl::free` on a pointer that was not allocated as a USM pointer). intel/llvm#12882 -- Enhanced `-fintelfpga` flag: when it is used together with `-fp-module=fast` - it also implies that `-vpfp-relaxed` will be passed to backend (device) - compiler. intel/llvm#13651 +- Enhanced `-fintelfpga` flag. When used together with `-fp-module=fast` it + also implies that `-vpfp-relaxed` will be passed to backend (device) compiler. + intel/llvm#13651 - Implemented support for `memory_order::seq_cst` on CUDA backend, resolving intel/llvm#11208. intel/llvm#12516 - Implementation of optional kernel features mechanism has been extended to also From e5c50203759a4c6c0ed6aa14674e20634f6dd09a Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:26:26 -0700 Subject: [PATCH 18/30] Apply feedback from @sarnex --- sycl/ReleaseNotes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index e9b60cdda4c7..220c9b845534 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -282,7 +282,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `opencl` backend. intel/llvm#14119 - Moved bit shift and rotate ESIMD functions out of `experimental` namespace. intel/llvm#13545 -- moved `rdtsc` ESIMD function out of `experimental` namespace. intel/llvm#13417 +- Moved `rdtsc` ESIMD function out of `experimental` namespace. intel/llvm#13417 - Added check for template argument `N` of `media_block_load` ESIMD API. 
intel/llvm#13668 - Enhanced deprecation message for `sub_group::barrier` to indicate which API should be used instead. intel/llvm#13276 From cf3fb5e9e9be153e83c3e058e053012aa1b3a5f0 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:29:28 -0700 Subject: [PATCH 19/30] Apply feedback from @Alcpz --- sycl/ReleaseNotes.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 220c9b845534..e47f3def7762 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -123,7 +123,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added APIs for performing arithmetic operations on 33-bit extended values. intel/llvm#13006 - Added APIs for performing bitwise operations on 33-bit extended values. intel/llvm#13727 - Added `device_count` and `get_device_id` utility APIs. intel/llvm#14013 -- Added experimental `launch` API overloads that accept sub-group size. intel/llvm#13767 - Added `wait` and `wait_and_throw` free functions. intel/llvm#13029 - Added vectorized comparison `extend_vcompare[2|4]` APIs. intel/llvm#14079 - Added vectorized math `extend_v*2` APIs. intel/llvm#13953 @@ -134,7 +133,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added `filter_device` and `list_devices` APIs. intel/llvm#14016 - Added `funnelshift_*` APIs. intel/llvm#13825 - Added `match_[any|all]_over_sub_group` APIs. intel/llvm#12973 -- Added API to manage kernel libraries loading/unloading. intel/llvm#13053 intel/llvm#13932 +- Added APIs to manage kernel libraries loading/unloading. intel/llvm#13053 + intel/llvm#13932 - Added `cmul_add` API. intel/llvm#12969 - Added experimental APIs for masked operations over sub-groups (`select`, `shift`, etc.). 
intel/llvm#12972 @@ -146,9 +146,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `max`, `fmin_nan`, `fmax_nan`, `pow`, `relu`; wrappers are needed to support variety of combinations of argument types compared to `sycl::` counterparts of those functions. intel/llvm#13005 -- Added `SYCLCOMPT_CHECK_ERROR` macro which is an error handling utility for +- Added `SYCLCOMPAT_CHECK_ERROR` macro which is an error handling utility for expressions that throw exceptions. -- Added `image1d_max`, `image2d_max` and `image3d_max` device info getters +- Added `image1d_max`, `image2d_max` and `image3d_max` `device_info` getters and setters. intel/llvm#13973 - Added `get_major_version` and `get_minor_version` free functions. intel/llvm#14011 - Expanded list of properties available through `device_info` class. intel/llvm#13050 From 8b9264c7b212610675e3e16bdaf1d6af049c8cc1 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 02:50:01 -0700 Subject: [PATCH 20/30] Apply missed comments --- sycl/ReleaseNotes.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index e47f3def7762..c7f52eb2f76c 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -242,8 +242,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad `invalid` exception if they are called on host. intel/llvm#13747 - Extended detection of nested `queue` operations to support shortcut methods. intel/llvm#13659 - Added overloads of various ESIMD APIs (`atomic_update`, `block_[load|store]` - and some other) which allow to omit some template arguments, thus simplifying - the interface. intel/llvm#14043 intel/llvm#14065 intel/llvm#14000 + and some other) which allow the omission of some template arguments, thus + simplifying the interface.
intel/llvm#14043 intel/llvm#14065 intel/llvm#14000 intel/llvm#14024 intel/llvm#13978 intel/llvm#13964 intel/llvm#13977 intel/llvm#13956 intel/llvm#13941 intel/llvm#13920 - Updated [`sycl_ext_oneapi_bfloat16_math_functions`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bfloat16_math_functions.asciidoc) @@ -570,7 +570,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug where a command submission time would not be always recorded in profiling info when using `sycl_ext_oneapi_graph` extension. intel/llvm#14678 - Fixed a bug with graph recording where submitting a barrier using the same - queue for two different graphs would result in runtime error "Graph nodes + queue for two different graphs would result in the runtime error "Graph nodes cannot depend on events from another graph". intel/llvm#14212 - Fixed intel/llvm#13066 where submitting a barrier into an empty in-order queue whilst recording a graph would result in the in runtime error "No event has From 88aa3cb76f11ccda3b254a6f2345e18e288a76fe Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Fri, 13 Sep 2024 03:26:58 -0700 Subject: [PATCH 21/30] Process some of UR commits --- sycl/ReleaseNotes.md | 419 ++----------------------------------------- 1 file changed, 12 insertions(+), 407 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index c7f52eb2f76c..a1d4caa6da9b 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -235,8 +235,10 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation to support cubemap images. intel/llvm#12996 - Added ESIMD API for dynamic allocation of named barriers. 
intel/llvm#13826 - Updated [`sycl_ext_oneapi_bindless_images`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_bindless_images.asciidoc) extension implementation to support sampled image arrays. intel/llvm#14237 -- Added implementation for whole graph update (`executable_command_graph::update`). - intel/llvm#13220 intel/llvm#14379 intel/llvm#14236 +- Added implementation for whole graph update + (`executable_command_graph::update`). + intel/llvm#13220 intel/llvm#14379 intel/llvm#14236 intel/llvm#14111 + intel/llvm#13987 - Added a warning about use of the deprecated `` header. intel/llvm#13569 - Made `local_accessor::get_pointer` and `local_accessor::get_multi_ptr` throw `invalid` exception if they are called on host. intel/llvm#13747 @@ -333,6 +335,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad intel/llvm#12449 - Improved performance of `queue::fill` on CUDA backend by making it use 2- and 4-byte operations instead of only using 1-byte operations. intel/llvm#13788 + intel/llvm#13779 - Enhanced `sycl_ext_oneapi_graph` extension implementation on Level Zero backend by taking advantage of copy engines available on some devices to be able to execute kernels and memory operations in parallel. intel/llvm#13051 @@ -351,6 +354,8 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad performed in very limited amount of cases. intel/llvm#13088 - Implemented new `fetch_image` overload which accepts sampled image and coordinates. intel/llvm#12447 +- Extended address sanitizer support to cover Intel GPU devices besides CPU + devices. intel/llvm#13450 ### Documentation @@ -609,6 +614,11 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad intel/llvm#13364 - Fixed performance regression when kernels without any dependencies are submitted into in-order queue. 
intel/llvm#13333 +- Fixed a bug where profiling info timestamps could be zeros on Level Zero + backend. intel/llvm#14360 +- Fixed a bug where using multiple queues with `immediate_command_list` and + `no_immediate_command_list` properties could result in a crash. + intel/llvm#14341 ### Documentation @@ -747,79 +757,6 @@ Breaking changes were also made to compiler flags: type. For more information on this issue consult with https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#wmma-restrictions -commit https://github.com/intel/llvm/commit/33c0829f3e3389006662845784980b930faf3b38 -Author: Igor Chorążewicz -Date: Thu Jul 25 23:00:19 2024 -0700 - - [UR] Bump UR version and enable dynamic linking with UMF (#13343) - - Testing PR for: https://github.com/oneapi-src/unified-runtime/pull/1430 - - --------- - - Co-authored-by: Krzysztof Swiecicki - Co-authored-by: Steffen Larsen - -commit https://github.com/intel/llvm/commit/450683b6fa1d1be1b9391905f43073b7a9555aa1 -Author: Yang Zhao -Date: Thu Jul 25 00:02:46 2024 +0800 - - [SYCL][DeviceSanitizer] Support GPU DG2 Device (#13450) - - UR: https://github.com/oneapi-src/unified-runtime/pull/1521 - - - Add MemToShadow_DG2 - - Enable lit tests for GPU, decrease the global workgoup size in some - tests due to the limit of GPU memory - - Although, the "_DG2" suffix might be misleading: DG2 present all 48bits - virtual address devices, and PVC present all 58bits virtual address - devices. 
- - --------- - - Co-authored-by: Wenju He - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/bd97280007ee79bf118fdbade3d9cb14721b9014 -Author: aarongreig -Date: Wed Jul 24 07:20:14 2024 +0100 - - [UR] Bump main tag to 9b209642 (#14553) - - * https://github.com/oneapi-src/unified-runtime/pull/1791 - * https://github.com/oneapi-src/unified-runtime/pull/1856 - * https://github.com/oneapi-src/unified-runtime/pull/1861 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/5667218ed6be6dee8877efcc2fbcfc2ecd515cff -Author: Kenneth Benzie (Benie) -Date: Tue Jul 16 18:05:26 2024 +0100 - - [UR] Bump main tag to 7e38af77 (#14552) - - * https://github.com/oneapi-src/unified-runtime/pull/1826 - * https://github.com/oneapi-src/unified-runtime/pull/1852 - * https://github.com/oneapi-src/unified-runtime/pull/1849 - * https://github.com/oneapi-src/unified-runtime/pull/1828 - * https://github.com/oneapi-src/unified-runtime/pull/1772 - * https://github.com/oneapi-src/unified-runtime/pull/1862 - -commit https://github.com/intel/llvm/commit/44861fec406fff7a20bd4791c4288d71828912cc -Author: Callum Fare -Date: Thu Jul 11 15:15:12 2024 +0100 - - [UR] Bump UR and implement changes to bindless image handle types (#14516) - - https://github.com/oneapi-src/unified-runtime/pull/1829 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/c30769b122d99eb4d05bcb78f15e593491fe31ae Author: Neil R. 
Spruit [UR][L0] Use Intel Level Zero Driver Version String extension (#14426) @@ -832,57 +769,18 @@ Author: Ross Brunton https://github.com/oneapi-src/unified-runtime/pull/1458 Seems to be an internal UR bugfix/improvement -commit https://github.com/intel/llvm/commit/13ae57f97cfb45cbcee8db6155ac8b0f7b7fbb82 -Author: Kenneth Benzie (Benie) -Date: Wed Jul 10 10:53:12 2024 +0100 - - [UR] Bump main tag to 9d3bce6a (#14499) - - https://github.com/oneapi-src/unified-runtime/pull/1822 - commit https://github.com/intel/llvm/commit/db4d83e3969a5f7b5313aa5fb8466dd2ebbf9283 Author: Neil R. Spruit [UR][L0] Fix Queue get info and fix Queue release decrement (#14411) https://github.com/oneapi-src/unified-runtime/pull/1814 Could be an actual bugfix -commit https://github.com/intel/llvm/commit/78ae397aab9b2040be945ee2f7f73d93404ffa06 -Author: Artur Gainullin -Date: Tue Jul 9 02:37:27 2024 -0700 - - [UR] Uplift UR tag to the fix in the L0 adapter regarding event timestamps (#14360) - - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1806 - -commit https://github.com/intel/llvm/commit/ac556f9273e479c033e7dc76248fdb6861377ce7 -Author: Fábio -Date: Mon Jul 8 16:23:41 2024 +0100 - - [UR] Update main tag for L0 CommandBuffer refactor (#14240) - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/eb03091539daa68a582ceab950379ca482e118d9 Author: Neil R. Spruit [UR][L0] Fix Device Info return code to report unsupported enumeration (#14407) https://github.com/oneapi-src/unified-runtime/pull/1809 ??? -commit https://github.com/intel/llvm/commit/577c349c5f3b1c893160de2470aff5ee3f87f0bc -Author: Neil R. Spruit -Date: Fri Jul 5 04:30:49 2024 -0700 - - [UR][L0] Fix immediate command list use in Command Queues (#14341) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1802 - - --------- - - Signed-off-by: Neil R. 
Spruit - Co-authored-by: Kenneth Benzie (Benie) - Co-authored-by: Aaron Greig - commit https://github.com/intel/llvm/commit/f2bd076eb55a2cc79de2e9d4748967ed3cb13c9b Author: Wu Yingcong [UR] fix use-after-free problems (#13855) @@ -902,61 +800,6 @@ Date: Tue Jun 25 07:03:05 2024 -0700 Signed-off-by: Neil R. Spruit -commit https://github.com/intel/llvm/commit/088a9475e7c5f39ecb2b74f79a479380c9dd64be -Author: aarongreig -Date: Fri Jun 21 13:52:08 2024 +0100 - - [UR] Pull in changes from UR PR #805 (#12270) - - https://github.com/oneapi-src/unified-runtime/pull/805 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/350b56fda217ffc4677c5a3443a7844e13ac209d -Author: Hugh Delaney -Date: Fri Jun 21 10:30:11 2024 +0100 - - [UR] Update main tag to 975313cb (#14225) - - https://github.com/oneapi-src/unified-runtime/pull/1774 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/ab77ba800e6b36d0217dea053d125435f0a0b2db -Author: Kenneth Benzie (Benie) -Date: Tue Jun 18 17:51:02 2024 +0100 - - [UR] Bump L0 tag to 2a31795d (#14213) - - https://github.com/oneapi-src/unified-runtime/pull/1623 - -commit https://github.com/intel/llvm/commit/174f7510328f49d6f24c578b226acea085489082 -Author: Steffen Larsen -Date: Mon Jun 17 15:01:45 2024 +0200 - - [UR] Bump main tag to 33eb5ea8 (#13950) - - Pull in changes from - https://github.com/oneapi-src/unified-runtime/pull/1678. - - --------- - - Signed-off-by: Larsen, Steffen - Co-authored-by: Kenneth Benzie (Benie) - - -commit https://github.com/intel/llvm/commit/5e9e7a73ce11182af6ceafc1e91996b6c79f7180 -Author: aarongreig -Date: Mon Jun 17 10:30:32 2024 +0100 - - [UR] Pull in change to make urPlatformCreateWithNativeHandle take an adapter. (#14012) - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/579484f0ae9e5e30b9c9bd468799e1688d5de890 Author: Neil R. 
Spruit Date: Fri Jun 14 05:45:42 2024 -0700 @@ -983,174 +826,6 @@ Date: Wed Jun 12 16:46:31 2024 +0100 Co-authored-by: Kenneth Benzie (Benie) -commit https://github.com/intel/llvm/commit/1a885ecacc468ab324c812ab47b4af7f3b086e52 -Author: Artur Gainullin -Date: Wed Jun 12 07:30:29 2024 -0700 - - [UR] Update UR tag to include L0 loader related changes (#14109) - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/7c530e154021d103259c8437233e7ba13ce98146 -Author: aarongreig -Date: Wed Jun 12 13:15:51 2024 +0100 - - [UR] Bump main tag to 78d02039 (#12269) - - https://github.com/oneapi-src/unified-runtime/pull/1128 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/3935e06bc2e3794b7eac715c069e28c30aeaee9c -Author: Ewan Crawford -Date: Mon Jun 10 12:46:11 2024 +0100 - - [SYCL][Graph] Combined L0 Graph Update fixes (#14111) - - Bumps the L0 adapter UR commit https://github.com/intel/llvm/commit/to include several merged fixes to the L0 - adapter for implementing the SYCL-Graph update feature: - - * [Use fence rather than event for sync in L0 command-buffer - update](https://github.com/oneapi-src/unified-runtime/pull/1629) - * [Fix lifetime of pointer used in L0 - update](https://github.com/oneapi-src/unified-runtime/pull/1721) - * [Fix L0 Event leak without return sync - point](https://github.com/oneapi-src/unified-runtime/pull/1706) - -commit https://github.com/intel/llvm/commit/fcfe36b705fa715b4813de95565bbba9a5b88223 -Author: Kenneth Benzie (Benie) -Date: Mon Jun 10 10:16:54 2024 +0100 - - [UR] Bump main tag to f06bc02a (#14047) - - Includes the following: - - * https://github.com/oneapi-src/unified-runtime/pull/1653 - * https://github.com/oneapi-src/unified-runtime/pull/1568 - * https://github.com/oneapi-src/unified-runtime/pull/1634 - * https://github.com/oneapi-src/unified-runtime/pull/1669 - -commit https://github.com/intel/llvm/commit/0cec12826baea60a15483081b0feece49013049f 
-Author: Kenneth Benzie (Benie) -Date: Wed Jun 5 11:20:25 2024 +0100 - - [UR] Bump HIP tag to 399430da (#14037) - -commit https://github.com/intel/llvm/commit/2838f40382bedddbda0a5f20ebeeba86310044da -Author: Ewan Crawford -Date: Wed Jun 5 09:20:03 2024 +0100 - - [SYCL][Graph][L0] Correctly report when device supports update (#13987) - - Bump UR L0 commit https://github.com/intel/llvm/commit/to - https://github.com/oneapi-src/unified-runtime/pull/1694 so that the SYCL - device aspect for supporting update in graphs is correctly reported for - L0 devices. Currently, support can be incorrectly reported. - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/20991b1c2ee906148706aa1e7ae62c1084834799 -Author: Kenneth Benzie (Benie) -Date: Wed Jun 5 08:48:18 2024 +0100 - - [UR] Bump CUDA tag to 0e38fda0 (#14030) - -commit https://github.com/intel/llvm/commit/781b75abfd1dac36a2c68fbc13bd6f1bb845d35b -Author: Wu Yingcong -Date: Tue Jun 4 06:09:03 2024 -0700 - - [UR] Test for unified runtime PR (#12902) - - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1385 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/f2a2de3b6e735ee4a54ecc212b648f370e47abbc -Author: Ewan Crawford -Date: Thu May 30 14:28:35 2024 +0100 - - [SYCL][Graph] Add debug logging for L0 Graph kernel update (#13892) - - Bumps UR to Level Zero adapter change from - https://github.com/oneapi-src/unified-runtime/pull/1654 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/1fa2ac88a1fb3a5eba0315c03faa03c2d8e3c5f7 -Author: Kenneth Benzie (Benie) -Date: Thu May 30 12:46:50 2024 +0100 - - [UR][HIP] Implement kernel set spec constant query (#13809) - - https://github.com/oneapi-src/unified-runtime/pull/1604 - -commit https://github.com/intel/llvm/commit/e147f3673c77e566a63a1d4d57d6f5da0153cbdb -Author: Konrad Kusiak -Date: Thu May 30 12:46:39 2024 +0100 - 
- [UR] Modify fill emulation to work for patterns which are not powers of 2 (#13779) - - https://github.com/oneapi-src/unified-runtime/pull/1603 - Follow-up fix: - commit https://github.com/intel/llvm/commit/f34a65012c21192d6f90c10a893cffb35a250dff - Author: Konrad Kusiak - https://github.com/oneapi-src/unified-runtime/pull/1412 - - This patch is needed for #13788 - -commit https://github.com/intel/llvm/commit/8086df575d7f622017521fcd2f8b2b90fdd49d39 -Author: Neil R. Spruit -Date: Thu May 30 02:57:41 2024 -0700 - - [UR][L0] Fix Multi Device Event Cache for shared Root Device (#13917) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1667 - - Signed-off-by: Neil R. Spruit - -commit https://github.com/intel/llvm/commit/16e0670ab6e2425a20e13aec2c7f5896fd4eabfc -Author: Ross Brunton -Date: Fri May 24 14:25:29 2024 +0100 - - [UR][OpenCL] Bump UR OpenCL adapter for invalid kernel args (#13658) - - For UR merge request - https://github.com/oneapi-src/unified-runtime/pull/1501 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/7fa793bc7d17b9447ac0726bd01eb33680432d38 -Author: Kenneth Benzie (Benie) -Date: Fri May 24 13:33:08 2024 +0100 - - [UR] Bump L0 tag to e4287455 (#13910) - -commit https://github.com/intel/llvm/commit/f05c1c82d07a81050db4931eef6b8d02d359a325 -Author: Hugh Delaney -Date: Wed May 22 14:37:14 2024 +0100 - - [UR] CUDA multi device ctx (#13616) - - https://github.com/oneapi-src/unified-runtime/pull/1565 - - For UR multi device context, buffer interop is now deprecated since a - buffer refers to multiple device pointers. 
- - --------- - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/5a09c6a15279484434df299d9164d94b96d3507a Author: Kenneth Benzie (Benie) Date: Wed May 22 10:42:06 2024 +0100 @@ -1159,32 +834,6 @@ Date: Wed May 22 10:42:06 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1401 -commit https://github.com/intel/llvm/commit/ba19132218050c4791c5aa82316cc10e38986f75 -Author: Hugh Delaney -Date: Thu May 16 15:40:30 2024 +0100 - - [UR][HIP] Get Device From Queue (#13575) - - https://github.com/oneapi-src/unified-runtime/pull/1553 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/99d7097c10ae92805c6a100ffb544bdf0630c063 -Author: Neil R. Spruit -Date: Thu May 16 07:06:42 2024 -0700 - - [UR][L0] ensure a valid kernel handle for the device when reading max wg (#13797) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1611 - - --------- - - Signed-off-by: Neil R. Spruit - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/34292bbc89f71233ef687652c33c52b55a38839e Author: Neil R. Spruit Date: Wed May 15 07:43:11 2024 -0700 @@ -1197,18 +846,6 @@ Date: Wed May 15 07:43:11 2024 -0700 Signed-off-by: Neil R. Spruit Co-authored-by: Kenneth Benzie (Benie) -commit https://github.com/intel/llvm/commit/9bf81044bfbe229b6846c96819a470e62065469a -Author: Ewan Crawford -Date: Wed May 15 15:05:06 2024 +0100 - - [CUDA][SYCL] Bump UR CUDA Tag (#13746) - - https://github.com/oneapi-src/unified-runtime/pull/1596 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/a5da94d1fb9a46f0a8334db500f26d30b62c1c02 Author: Neil R. Spruit Date: Fri May 10 02:20:00 2024 -0700 @@ -1382,22 +1019,6 @@ Date: Tue Apr 16 11:17:23 2024 -0700 Fix this by making CHECK-NOT only match output generated by UR_L0_DEBUG. 
-commit https://github.com/intel/llvm/commit/9958a742ab498b89fb5c49ccbe94fe6f9a7a6bf6 -Author: Kenneth Benzie (Benie) -Date: Tue Apr 16 18:00:03 2024 +0100 - - [UR] Bump CUDA tag to 3f5f5688 (#13399) - - https://github.com/oneapi-src/unified-runtime/pull/1510 - -commit https://github.com/intel/llvm/commit/c959b5313c6c74f206609b526272206a4b144315 -Author: Hugh Delaney -Date: Tue Apr 16 07:56:06 2024 -0500 - - [UR] Bump HIP tag to 15233fd2 (#13020) - - https://github.com/oneapi-src/unified-runtime/pull/1437 - commit https://github.com/intel/llvm/commit/684cd90e22fe67d4a524be92c69e026cca262f1c Author: aarongreig Date: Tue Apr 16 13:17:43 2024 +0100 @@ -1409,14 +1030,6 @@ Date: Tue Apr 16 13:17:43 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1494 https://github.com/oneapi-src/unified-runtime/pull/1507 -commit https://github.com/intel/llvm/commit/e6d9d4c6bfabae78c29aa3b376e568974860a219 -Author: Kenneth Benzie (Benie) -Date: Tue Apr 16 10:29:06 2024 +0100 - - [UR] Bump CUDA tag to 1333d4a0 (#13398) - - https://github.com/oneapi-src/unified-runtime/pull/1342 - commit https://github.com/intel/llvm/commit/a884a54914f9e9cf052591d70eb5cac20a25a210 Author: Neil R. Spruit Date: Mon Apr 15 01:58:14 2024 -0700 @@ -1435,14 +1048,6 @@ Date: Mon Apr 15 01:58:14 2024 -0700 Signed-off-by: Neil R. 
Spruit Co-authored-by: Aaron Greig -commit https://github.com/intel/llvm/commit/71358f095be30b1cccd8c39a5ac2224fab9491b5 -Author: Kenneth Benzie (Benie) -Date: Mon Apr 15 09:49:02 2024 +0100 - - [UR] Bump CUDA tag to 68e525a4 (#13376) - - https://github.com/oneapi-src/unified-runtime/pull/1317 - commit https://github.com/intel/llvm/commit/d7c5a9c6b2c9edb52f14adca5c84c3c3e3419d7b Author: Konrad Kusiak Date: Mon Apr 15 08:43:44 2024 +0100 From 82e564153684b5597ce075032cbbc8b2218a2d3f Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 17 Sep 2024 03:37:31 -0700 Subject: [PATCH 22/30] Resolve some of TODOs --- sycl/ReleaseNotes.md | 29 ++++++++++++++--------------- 1 file changed, 14 insertions(+), 15 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index a1d4caa6da9b..5807e9a25cef 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -7,24 +7,9 @@ Release notes for commit range ## TODO -commit https://github.com/intel/llvm/commit/9876e19f4ff387b35b0c98c7d62e5f50e6de187d - [SYCL][XPTI] 'queue_id' metadata feature refactoring (#13070) - bugfix? - commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 [SYCL][CUDA] Enable support of msvc math functions for nvptx target. (#14007) -commit https://github.com/intel/llvm/commit/7a9d3b1e9483b69baa0b8c6f1097016efd52854c - [SYCL][NVPTX] Do not decompose SYCL functor unless necessary (#14434) - -commit https://github.com/intel/llvm/commit/38e663ecd37de513d8e31afdfdf245cf8c9d17f0 - [SYCL] Declare __devicelib_assert_read only when fallback assert is enabled (#13241) - Is there any particular user-visible bug associated with this? 
- commit https://github.com/intel/llvm/commit/4b993a7b32f7743980bce646765a1b427b0996b6 - Revert "[SYCL][Driver] Link with sycl libs at link step of clang-cl -fsycl (#12793)" (#13326) - revert commit https://github.com/intel/llvm/commit/seems to be a part of a previous release - commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) was the bug really user-visible? @@ -76,6 +61,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad [dynamic linking](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/design/SharedLibraries.md). Current implementation lacks support for `kernel_bundle` API and AOT mode. intel/llvm#14587 intel/llvm#14189 intel/llvm#14103 +- Added `-fno-sycl-decompose-functor` compiler flag which instructs the compiler + to emit fewer kernel arguments if possible. The flag is experimental and it + only has an effect when compiling for CUDA targets. intel/llvm#14434 ### SYCL Library @@ -638,6 +626,17 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad parameter. intel/llvm#13821 - Fixed compilation issues when `SYCL_COMPAT_PROFILING_ENABLED` is defined. intel/llvm#14574 +## Misc + +### SYCL Compiler + +- Reverted changes previously made as a bugfix on Windows to support a separate + compilation scenario where compilation step is performed _without_ the + `-fsycl` flag, but link step _with_ the `-fsycl` flag, expecting the compiler + to do the right thing. However, this is now considered to be an unsupported + scenario, because during link step the compiler doesn't know which version + (debug or release) of the standard library to link.
intel/llvm#13326 + ## API/ABI Breaking Changes This release is an *ABI* breaking release, meaning that any applications which From cb732cd089046266503893bb505cb82d58c85346 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 17 Sep 2024 05:48:17 -0700 Subject: [PATCH 23/30] Record a few more known issues --- sycl/ReleaseNotes.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 5807e9a25cef..2efdbc2773da 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -755,6 +755,19 @@ Breaking changes were also made to compiler flags: compilation. This particularly affects matrix operations using `half` data type. For more information on this issue consult with https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#wmma-restrictions +- [new] When using `queue` shortcut functions with in-order queues, dependencies + between commands submitted to different queues may be ignored: + ```c++ + // q1 long running task + sycl::event e = q1.single_task([=](){ /* ... */ }); + // q2 task + q2.single_task(e, [=](){ /* ... */ }); + ``` + In the example above, the second kernel will start execution *before* the + first completes its execution. A workaround is to explicitly call `.wait()`. + This will be fixed in the next release, see intel/llvm#15412 +- [new] C/C++ math built-ins (like `exp` or `tanh`) can return incorrect + results for some edge-case input when they are called from SYCL kernels. commit https://github.com/intel/llvm/commit/c30769b122d99eb4d05bcb78f15e593491fe31ae Author: Neil R.
Spruit From 4b0acad13f716c57b9b1a22a7523e4049c1705dc Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 17 Sep 2024 11:17:16 -0700 Subject: [PATCH 24/30] Handle more UR commits --- sycl/ReleaseNotes.md | 361 +------------------------------------------ 1 file changed, 7 insertions(+), 354 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 2efdbc2773da..026d8353fc87 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -226,7 +226,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added implementation for whole graph update (`executable_command_graph::update`). intel/llvm#13220 intel/llvm#14379 intel/llvm#14236 intel/llvm#14111 - intel/llvm#13987 + intel/llvm#13987 intel/llvm#12724 - Added a warning about use of the deprecated `` header. intel/llvm#13569 - Made `local_accessor::get_pointer` and `local_accessor::get_multi_ptr` throw `invalid` exception if they are called on host. intel/llvm#13747 @@ -344,6 +344,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad coordinates. intel/llvm#12447 - Extended address sanitizer support to cover Intel GPU devices besides CPU devices. intel/llvm#13450 +- Updated `info::device::max_mem_alloc_size` query to return the total amount of + device memory for CUDA devices, because they have no limit on the size of + memory allocations. intel/llvm#13344 ### Documentation @@ -607,6 +610,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug where using multiple queues with `immediate_command_list` and `no_immediate_command_list` properties could result in a crash. intel/llvm#14341 +- Fixed a bug where `info::kernel_device_specific::work_group_size` would + return the `device`-specific limit, ignoring the kernel, on the Level Zero backend. + intel/llvm#13474 ### Documentation @@ -826,18 +832,6 @@ Date: Fri Jun 14 05:45:42 2024 -0700 Signed-off-by: Neil R.
Spruit Co-authored-by: Kenneth Benzie (Benie) -commit https://github.com/intel/llvm/commit/ae79b95cc07ab68fcf706d47851b93e5b299dc87 -Author: Hugh Delaney -Date: Wed Jun 12 16:46:31 2024 +0100 - - [UR] Bump main tag to b13c5e1f (#14042) - - https://github.com/oneapi-src/unified-runtime/pull/1711 - - --------- - - Co-authored-by: Kenneth Benzie (Benie) - commit https://github.com/intel/llvm/commit/5a09c6a15279484434df299d9164d94b96d3507a Author: Kenneth Benzie (Benie) Date: Wed May 22 10:42:06 2024 +0100 @@ -846,318 +840,10 @@ Date: Wed May 22 10:42:06 2024 +0100 https://github.com/oneapi-src/unified-runtime/pull/1401 -commit https://github.com/intel/llvm/commit/34292bbc89f71233ef687652c33c52b55a38839e -Author: Neil R. Spruit -Date: Wed May 15 07:43:11 2024 -0700 - - [UR][L0] Fix timestamp event evict after delete (#13717) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1592 - - Signed-off-by: Neil R. Spruit - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/a5da94d1fb9a46f0a8334db500f26d30b62c1c02 -Author: Neil R. Spruit -Date: Fri May 10 02:20:00 2024 -0700 - - [UR][L0] Disable Usage of Driver In order Lists by default (#13715) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1591 - - Signed-off-by: Neil R. Spruit - -commit https://github.com/intel/llvm/commit/c6be822ba3fbf1dc7c2f89805493400704ad89b5 -Author: Neil R. Spruit -Date: Tue May 7 02:47:56 2024 -0700 - - [UR][L0] Fix the repo tag for the L0 adapter to use the global variable (#13667) - - Signed-off-by: Neil R. Spruit - -commit https://github.com/intel/llvm/commit/dd183bf2a706571e29428a425d3a5f9bb6133a69 -Author: aarongreig -Date: Tue May 7 10:30:00 2024 +0100 - - [UR][L0] Pull in some minor fixes for L0 device queries. 
(#13424) - - UR PR https://github.com/oneapi-src/unified-runtime/pull/1513 - -commit https://github.com/intel/llvm/commit/85037b20a9131400ce7cddac9c215adf563b6577 -Author: Kenneth Benzie (Benie) -Date: Fri May 3 19:22:38 2024 +0100 - - [UR] Bump L0 tag to fb342f06 (#13646) - - https://github.com/oneapi-src/unified-runtime/pull/1549 - -commit https://github.com/intel/llvm/commit/d1dddccded89ee1b34a120575726022ef8c97634 -Author: Piotr Balcer -Date: Fri May 3 12:57:22 2024 +0200 - - [UR][L0] fix queue locking behavior when creating event lists (#13564) - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/1a5595f8e43a12fc361c8868b04f265182259657 -Author: Neil R. Spruit -Date: Thu May 2 11:31:48 2024 -0700 - - [UR][L0] Enable Batching out of order commands without signal events (#13462) - - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1526 - - --------- - - Signed-off-by: Neil R. Spruit - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/4c7baa7aa553ce5a6f68eeb74851ece279efbd3d -Author: jinge90 -Date: Tue Apr 30 21:20:23 2024 +0800 - - [UR] Intercept urProgramLinkExp in ur ASAN layer (#13048) - - UR part: https://github.com/oneapi-src/unified-runtime/pull/1452 - - --------- - - Signed-off-by: jinge90 - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/15c9c62bc171c849588fa58029f4c40dc142e80f -Author: Omar Ahmed <30423288+omarahmed1111@users.noreply.github.com> -Date: Tue Apr 30 11:04:50 2024 +0100 - - Testing add validation tests to getInfo tests (#12782) - - Testing PR for [UR - PR](https://github.com/oneapi-src/unified-runtime/pull/1346) - - After correcting hip program info returned for program device to return - context device rather than the binary associated device that have fixed - the kernel fusion cooperative kernels e2e test. 
- - --------- - - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/7cd48cbac6ddcb3e950748b76acd42812fab18bb -Author: Ben Tracy -Date: Mon Apr 29 08:37:48 2024 +0100 - - [SYCL][Graph] Bump UR commit https://github.com/intel/llvm/commit/for in-order L0 optimization (#13565) - - - Bumps commit https://github.com/intel/llvm/commit/only and includes minimal pi2ur changes for new - descriptor members - - In-order path not currently used, enable profiling by default (match - previous behaviour) - - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1442 - -commit https://github.com/intel/llvm/commit/95420f09ea81539d8c18fbb7d7406ec82947aeb5 -Author: Winston Zhang -Date: Fri Apr 26 06:56:11 2024 -0700 - - [UR][L0] Testing for counter-based-events implementation in URT draft (#12848) - - commit https://github.com/intel/llvm/commit/tag: 4134bfce72d33e89eebcad11186bdf00310bba83 - URT PR: https://github.com/oneapi-src/unified-runtime/pull/1370 - - --------- - - Signed-off-by: Zhang, Winston - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/fc94a16a8a97464f96ea07bf77600d6337c00f76 -Author: Neil R. Spruit -Date: Fri Apr 26 03:16:19 2024 -0700 - - [UR][L0] reset command lists on error unknown (#13522) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1539 - - --------- - - Signed-off-by: Neil R. 
Spruit - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/719207dbac44ebb1bcae96eca992276171172120 -Author: Ewan Crawford -Date: Wed Apr 24 09:10:44 2024 +0100 - - [SYCL][Graph] Bump UR commit https://github.com/intel/llvm/commit/to OpenCL kernel update (#12724) - - Test the UR commit https://github.com/intel/llvm/commit/that enables updating kernel commands in a - command-buffer in the OpenCL adapter - https://github.com/oneapi-src/unified-runtime/pull/1358 - -commit https://github.com/intel/llvm/commit/96b07cf9c3b8407194d0082b0b30170f4f232a39 -Author: Kenneth Benzie (Benie) -Date: Tue Apr 23 11:14:52 2024 +0100 - - [UR] Bump main tag to 31d0fe15 (#13511) - - * https://github.com/oneapi-src/unified-runtime/pull/1032 - * https://github.com/oneapi-src/unified-runtime/pull/1183 - * https://github.com/oneapi-src/unified-runtime/pull/1243 - -commit https://github.com/intel/llvm/commit/723b7b7b043783f04b6b0ec2195971a5e95f216b -Author: aarongreig -Date: Fri Apr 19 22:15:55 2024 +0100 - - [UR][L0] Update main UR tag to 717791b (#13474) - - This pulls in fixes from: - https://github.com/oneapi-src/unified-runtime/pull/1298 - https://github.com/oneapi-src/unified-runtime/pull/1495 - https://github.com/oneapi-src/unified-runtime/pull/1517 - - Also remove now unnecessary XFAIL from Basic/kernel_max_wg_size.cpp - -commit https://github.com/intel/llvm/commit/8cd2eb0ac2efc65cd109e0bfce02aedd69ce4cf2 -Author: Igor Chorążewicz -Date: Tue Apr 16 11:17:23 2024 -0700 - - [UR][L0] fix ze commands matching in level_zero_eager_init.cpp (#13277) - - If the test is run with UR_L0_LEAKS_DEBUG var set, UR will print ze call - count summary. This summary can cause the test to fail as it will - contain zeCommandQueueCreate, etc. - - Fix this by making CHECK-NOT only match output generated by UR_L0_DEBUG. 
- -commit https://github.com/intel/llvm/commit/684cd90e22fe67d4a524be92c69e026cca262f1c -Author: aarongreig -Date: Tue Apr 16 13:17:43 2024 +0100 - - [UR][L0] Pull in a batch of L0 fixes (#13400) - - Pulls in fixes - https://github.com/oneapi-src/unified-runtime/pull/1492 - https://github.com/oneapi-src/unified-runtime/pull/1494 - https://github.com/oneapi-src/unified-runtime/pull/1507 - -commit https://github.com/intel/llvm/commit/a884a54914f9e9cf052591d70eb5cac20a25a210 -Author: Neil R. Spruit -Date: Mon Apr 15 01:58:14 2024 -0700 - - [UR][L0][Image] Set ZeImageDesc member of _ur_image in release build … (#13338) - - …for legacy image - - - reenable the image interop test with fix to image interop in release - builds - - precommit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1498 - - --------- - - Signed-off-by: Neil R. Spruit - Co-authored-by: Aaron Greig - -commit https://github.com/intel/llvm/commit/d7c5a9c6b2c9edb52f14adca5c84c3c3e3419d7b -Author: Konrad Kusiak -Date: Mon Apr 15 08:43:44 2024 +0100 - - [UR] [NATIVECPU] CI for: Extended usm fill to bigger patterns than 1 byte (#13263) - - https://github.com/oneapi-src/unified-runtime/pull/1489 - - Co-authored-by: Kenneth Benzie (Benie) - - -commit https://github.com/intel/llvm/commit/16da4ec202cc818b9e79b75ecd9b7e301e3bea53 -Author: Konrad Kusiak -Date: Fri Apr 12 15:33:18 2024 +0100 - - [UR] Bump HIP tag to 1473ed8a (#12898) - - https://github.com/oneapi-src/unified-runtime/pull/1395 - -commit https://github.com/intel/llvm/commit/6ba50805672c72654c8288d33960f36c09cc89bb -Author: Neil R. Spruit -Date: Fri Apr 12 02:07:48 2024 -0700 - - [UR][L0] Fix regular in order command list reuse given inorder queue (#13195) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1483 - - --------- - - Signed-off-by: Neil R. 
Spruit - Co-authored-by: Aaron Greig - -commit https://github.com/intel/llvm/commit/1c89e51aa23fbd01eab1a2bba98ffc3598470e93 -Author: Ewan Crawford -Date: Fri Apr 12 10:05:29 2024 +0100 - - [UR] Bump HIP tag to 760eaa38 (#12758) - - Bump UR commit https://github.com/intel/llvm/commit/to include a bugfix for HIP UR adapter dereferencing a - nullptr https://github.com/oneapi-src/unified-runtime/pull/1357 - commit https://github.com/intel/llvm/commit/e404d9984d1587ca130d267c342d10747bc09a1f [SYCL][NATIVECPU] Threadpool implementation for Native CPU (#13176) Native CPU backend improvement to be able to run work-groups in parallel? -commit https://github.com/intel/llvm/commit/1d52f907d28edab7e23f69175a5b00d1bbe0acdc -Author: Fábio -Date: Wed Apr 10 17:56:05 2024 +0100 - - [UR] Bump CUDA tag to 6e76c98a (#12285) - -commit https://github.com/intel/llvm/commit/7cf70ddd403d3262b51d0729cdc8a19e1bec7fab -Author: Kenneth Benzie (Benie) -Date: Wed Apr 10 17:43:57 2024 +0100 - - [UR] Bump HIP tag to 08b3e8fe (#13352) - -commit https://github.com/intel/llvm/commit/a14d0b548e96014c643b00927be128193781769c -Author: Kenneth Benzie (Benie) -Date: Wed Apr 10 16:06:52 2024 +0100 - - [UR] Bump Native CPU tag to e2b5b7fa (#13349) - -commit https://github.com/intel/llvm/commit/60a5c90b5dc4736ff818586072b7c7a270ac40c1 -Author: Georgi Mirazchiyski -Date: Wed Apr 10 16:06:37 2024 +0100 - - [HIP][UR] Fix memory type detection in allocation info queries and USM copy2D (#13059) - - Test CI for https://github.com/oneapi-src/unified-runtime/pull/1455 - - --------- - - Co-authored-by: Aaron Greig - -commit https://github.com/intel/llvm/commit/e3b112bae042f3293d13dd64dc825809a4348dff -Author: Fábio -Date: Wed Apr 10 16:06:20 2024 +0100 - - [UR] Bump CUDA tag to cda0cd94 (#12287) - -commit https://github.com/intel/llvm/commit/090323ea1c1007c12e184f8c990d6a45238529a0 -Author: Kenneth Benzie (Benie) -Date: Wed Apr 10 12:24:30 2024 +0100 - - [UR] Bump CUDA tag to 05b58992 (#13344) - -commit 
https://github.com/intel/llvm/commit/cb28e0941683b921583553d9c3c5f29add7e42c2 -Author: Kenneth Benzie (Benie) -Date: Tue Apr 9 17:47:00 2024 +0100 - - [UR][OpenCL] Revert urMemBufferCreate extension function lookup error (#13331) - - Revert https://github.com/oneapi-src/unified-runtime/pull/1448, pulls in - OpenCL adapter changes from - https://github.com/oneapi-src/unified-runtime/pull/1496. - commit https://github.com/intel/llvm/commit/d86a50045bbbe488869991be49cbfe3213809d72 [UR][CL] Atomic order memory capability for Intel FPGA driver (#13041) Potentially user-visible fix. @@ -1167,45 +853,12 @@ commit https://github.com/intel/llvm/commit/2e2010e2cc4acf1375cf88ce65d3a5cb8cbc Does it fix any actual issues in some negative cases where we previosly reported a wrong error if device is not available? -commit https://github.com/intel/llvm/commit/93a1abb42f352eff587cd1a081e90089c232339b -Author: Piotr Balcer -Date: Wed Mar 27 12:11:36 2024 +0100 - - [UR][L0] fix a deadlock on a recursive event rwlock (#13112) - -commit https://github.com/intel/llvm/commit/dd78c6e9c0dc6afc6fb5757fb88c4c5b0b0fe5b5 -Author: Raiyan Latif -Date: Fri Mar 22 09:35:09 2024 -0700 - - [UR][L0] Enable default support for L0 in-order lists (#13033) - - Signed-off-by: Raiyan Latif - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/7c70e59db3ec813021beb970ebd21034586da53e -Author: Ewan Crawford -Date: Thu Mar 21 10:28:46 2024 +0000 - - [SYCL][Graph][HIP] Set minimum ROCm version for graphs (#13035) - - Tests UR PR https://github.com/oneapi-src/unified-runtime/pull/1447 that - only reports support for UR command-buffers on ROCm 5.5.1 and later to - work around HIP driver bugs related to HIP-Graph in earlier version. - - This requirement is also explicitly mentioned in the design doc. 
- commit https://github.com/intel/llvm/commit/43f096308b03fa4c5a7f6845461a133d6cfaceae Author: Hugh Delaney [UR] CI for UR PR refactor-guess-local-worksize (#12663) https://github.com/oneapi-src/unified-runtime/pull/1326 Could be a bugfix? -commit https://github.com/intel/llvm/commit/1f9bf7a731b16d6d0d017c35245991ca95d0aef7 -Author: Artur Gainullin -Date: Tue Mar 19 14:47:58 2024 -0700 - - [SYCL][Graph][UR] Update UR to support updating kernel commands in command buffers for L0 (#12897) - commit https://github.com/intel/llvm/commit/cf402b8473e9b3a4ee675a6154b80f0d54b198d1 [UR][L0] Support for urUsmP2PPeerAccessGetInfoExp to query p2p access… (#12983) Strictly speaking, this may have a visible effect for end users since some From 79f92950babfc01f52e27ab8d4b7f8c0d3fedaa2 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 17 Sep 2024 11:23:53 -0700 Subject: [PATCH 25/30] Resolve some TODOs --- sycl/ReleaseNotes.md | 6 ------ 1 file changed, 6 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 026d8353fc87..df174bf726c2 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -496,10 +496,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad property were not passed down to device compiler when using [`sycl_ext_oneapi_kernel_compiler`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) extension. intel/llvm#14522 -- Fixed a bug in - [`sycl_ext_oneapi_kernel_compiler`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler.asciidoc) - extension implementation - TODO: add description here. @cperkinsintel . intel/llvm#14490 - Fixed a bug where defining kernel as a named functor whilst using `-fno-sycl-unnamed-lambda` would lead to a compilation error about unnamed lambdas being unsupported. 
intel/lvm#14614 @@ -515,7 +511,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad tasks with barriers. intel/llvm#13094 intel/llvm#13863 intel/llvm#13094 - Fixed a compilation issue occurring when `printf` is used on CUDA backend on Windows. intel/llvm#13784 - TODO: was it really a compilation issue? - Fixed an issue where the compiler could emit SPIR-V instructions for reversing bits in a variable which are not supported by device compilers. intel/llvm#13810 intel/llvm#13044 @@ -530,7 +525,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad fail to create necessary directories for the cache to work. intel/llvm#13019 - Fixed a bug where querying a kernel by name from a kernel bundle would crash a program. intel/llvm#13155 - TODO: ask @cperkinsintel for feedback about the wording here. - Fixed an error handling bug where non-blocking `pipe` operations would lead to exceptions being mistakenly thrown. intel/llvm#13166 - Fixed compilation issues happening when non-uniform group built-ins were used From 72105fffcfe05b94b048ab271a43a87c3daacb53 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Tue, 17 Sep 2024 11:43:33 -0700 Subject: [PATCH 26/30] Handle (I mean drop) remaining UR changes --- sycl/ReleaseNotes.md | 91 -------------------------------------------- 1 file changed, 91 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index df174bf726c2..51c989dc4eb0 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -17,8 +17,6 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad [DeviceSanitizer] Disable handling no return calls (#14652) // bugfix? -+UR commit below - ## New Features ### SYCL Compiler @@ -769,95 +767,6 @@ Breaking changes were also made to compiler flags: - [new] C/C++ math built-ins (like `exp` or `tanh`) can return incorrect results for some edge-case input when they are called from SYCL kernels. 
-commit https://github.com/intel/llvm/commit/c30769b122d99eb4d05bcb78f15e593491fe31ae -Author: Neil R. Spruit - [UR][L0] Use Intel Level Zero Driver Version String extension (#14426) - https://github.com/oneapi-src/unified-runtime/pull/1816 - Sounds like improvement to stability of driver version query - -commit https://github.com/intel/llvm/commit/8ddd7291219256f9bcb78328cc85322037736171 -Author: Ross Brunton - [UR] Update to new urProgramLink interface (#13085) - https://github.com/oneapi-src/unified-runtime/pull/1458 - Seems to be an internal UR bugfix/improvement - -commit https://github.com/intel/llvm/commit/db4d83e3969a5f7b5313aa5fb8466dd2ebbf9283 -Author: Neil R. Spruit - [UR][L0] Fix Queue get info and fix Queue release decrement (#14411) - https://github.com/oneapi-src/unified-runtime/pull/1814 - Could be an actual bugfix - -commit https://github.com/intel/llvm/commit/eb03091539daa68a582ceab950379ca482e118d9 -Author: Neil R. Spruit - [UR][L0] Fix Device Info return code to report unsupported enumeration (#14407) - https://github.com/oneapi-src/unified-runtime/pull/1809 - ??? - -commit https://github.com/intel/llvm/commit/f2bd076eb55a2cc79de2e9d4748967ed3cb13c9b -Author: Wu Yingcong - [UR] fix use-after-free problems (#13855) - UR PR: https://github.com/oneapi-src/unified-runtime/pull/1637 - Related to ASAN - -commit https://github.com/intel/llvm/commit/c6428bee93a01009291ee704dca9db6262045aed -Author: Neil R. Spruit -Date: Tue Jun 25 07:03:05 2024 -0700 - - [UR][L0] Fix Handle used in calls to L0 Driver zex apis given multi d… (#14250) - - …rivers - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1778 - - Signed-off-by: Neil R. Spruit - -commit https://github.com/intel/llvm/commit/579484f0ae9e5e30b9c9bd468799e1688d5de890 -Author: Neil R. 
Spruit -Date: Fri Jun 14 05:45:42 2024 -0700 - - [UR][L0] Maintain Lock of Queue while syncing the Last Command Event (#14150) - - pre-commit https://github.com/intel/llvm/commit/PR for - https://github.com/oneapi-src/unified-runtime/pull/1749 - - --------- - - Signed-off-by: Neil R. Spruit - Co-authored-by: Kenneth Benzie (Benie) - -commit https://github.com/intel/llvm/commit/5a09c6a15279484434df299d9164d94b96d3507a -Author: Kenneth Benzie (Benie) -Date: Wed May 22 10:42:06 2024 +0100 - - [UR][L0] Return device version based on DeviceIpVersion (#13812) - - https://github.com/oneapi-src/unified-runtime/pull/1401 - -commit https://github.com/intel/llvm/commit/e404d9984d1587ca130d267c342d10747bc09a1f - [SYCL][NATIVECPU] Threadpool implementation for Native CPU (#13176) - Native CPU backend improvement to be able to run work-groups in parallel? - -commit https://github.com/intel/llvm/commit/d86a50045bbbe488869991be49cbfe3213809d72 - [UR][CL] Atomic order memory capability for Intel FPGA driver (#13041) - Potentially user-visible fix. - -commit https://github.com/intel/llvm/commit/2e2010e2cc4acf1375cf88ce65d3a5cb8cbc9427 - [UR] Add DEVICE_NOT_AVAILABLE UR error code and PI translation for same. (#13206) - Does it fix any actual issues in some negative cases where we previosly - reported a wrong error if device is not available? - -commit https://github.com/intel/llvm/commit/43f096308b03fa4c5a7f6845461a133d6cfaceae -Author: Hugh Delaney - [UR] CI for UR PR refactor-guess-local-worksize (#12663) - https://github.com/oneapi-src/unified-runtime/pull/1326 - Could be a bugfix? - -commit https://github.com/intel/llvm/commit/cf402b8473e9b3a4ee675a6154b80f0d54b198d1 - [UR][L0] Support for urUsmP2PPeerAccessGetInfoExp to query p2p access… (#12983) - Strictly speaking, this may have a visible effect for end users since some - of queries won't always return `false` anymore. 
- # Mar'24 release notes Release notes for commit range [f4e0d3177338](https://github.com/intel/llvm/commit/f4ed132f243ab43816ebe826669d978139964df2).. [d2817d6d317db1](https://github.com/intel/llvm/commit/d2817d6d317db1143bb227168e85c409d5ab7c82) From c77b5d3a8c25b88c9fd20d881c49e46073765f85 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Wed, 18 Sep 2024 10:16:09 -0700 Subject: [PATCH 27/30] Resolve remaining TODOs --- sycl/ReleaseNotes.md | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 51c989dc4eb0..7e3c5449f718 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -5,18 +5,6 @@ Release notes for commit range ... [ebb3b4a21b3b0e](https://github.com/intel/llvm/commit/ebb3b4a21b3b0e977f44434781729df7de83e436) -## TODO - -commit https://github.com/intel/llvm/commit/29b4d855fa1a378e89182795e0d368304c40c3f6 - [SYCL][CUDA] Enable support of msvc math functions for nvptx target. (#14007) - -commit https://github.com/intel/llvm/commit/4b14d706d93891cdb5b0e6a8d4b0b027c1d54ab8 - [SYCL][DeviceSanitizer] Use -asan-constructor-kind=none to disable ctor/dtor (#13259) - was the bug really user-visible? -commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad6cad - [DeviceSanitizer] Disable handling no return calls (#14652) - // bugfix? - ## New Features ### SYCL Compiler @@ -163,7 +151,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Added `-fsystem-debug` command line option to complement existing `-fno-system-debug`. intel/llvm#13256 - Improved wording of an error about implicit `this` capture in a kernel. intel/llvm#14100 -- Improved `--save-temps` to work with `-fsycl-host-compiler`. intel/llvm#114751 +- Improved `--save-temps` to work with `-fsycl-host-compiler`. intel/llvm#14751 - Improved error message about missing AMDGPU architecture when several values are passed into `-fsycl-targets`. 
intel/llvm#13078 - Reduced list of commands invoked to generate dependencies using `-MD` flag @@ -340,8 +328,7 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad performed in very limited amount of cases. intel/llvm#13088 - Implemented new `fetch_image` overload which accepts sampled image and coordinates. intel/llvm#12447 -- Extended address sanitizer support to cover Intel GPU devices besides CPU - devices. intel/llvm#13450 +- Extended address sanitizer support to cover Intel DG2 GPUs. intel/llvm#13450 - Updated `info::device::max_mem_alloc_size` query to return total amount of a device memory for CUDA devices, because they have no limit on size of memory allocations. intel/llvm#13344 @@ -468,6 +455,9 @@ commit https://github.com/intel/llvm/commit/2442ef047a4e9e9c135beed18a92029e1aad - Fixed a bug with `shift_group_[right|left]`, `permute_by_xor` and `select_from_group` algorithms would return invalid values if used with `half` data type on AMD devices. intel/llvm#13016 +- Fixed a bug where compiling a program that contains kernels which make calls + to standard C/C++ math functions would fail when targeting CUDA on Windows. + intel/llvm#14007 ### SYCL Library From aeaa4b3b8535375e5f509df4795d6df16c002393 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 19 Sep 2024 02:51:50 -0700 Subject: [PATCH 28/30] Cleanup old TODO comment --- sycl/ReleaseNotes.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 7e3c5449f718..3302dd4f9174 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -31,8 +31,7 @@ Release notes for commit range intel/llvm#13579 intel/llvm#13869 intel/llvm#14102 intel/llvm#14541 intel/llvm#14541 intel/llvm#13898 intel/llvm#14143 allow us to improve link time by reducing amount of external processes and - temporary files used by the compiler. 
**Do we need to list PRs here?** There - were many of them and some of them were merged in scope of a previous release. + temporary files used by the compiler. - Added `-fsycl-fp64-conv-emu` command line option which allows the enabling of partial (only conversion operations are supported) emulation of `double` data type. This mode is only supported by Intel GPUs. intel/llvm#13912 From 5bd2e9398077e857f7524d0c5bbc9cca759c78b4 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 19 Sep 2024 02:58:04 -0700 Subject: [PATCH 29/30] Drop unnecessary "for" --- sycl/ReleaseNotes.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 3302dd4f9174..56ec16cc29c6 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -65,8 +65,7 @@ Release notes for commit range Please note that the implementation exposes native block read/write HW capabilities only if the operation can be directly mapped to a single block operation. In other cases, it uses a naive implementation in form of a simple - 'for' loop and group barriers. intel/llvm#13043 intel/llvm#13734 - intel/llvm#13673 + loop and group barriers. intel/llvm#13043 intel/llvm#13734 intel/llvm#13673 - Implemented [`sycl_ext_codeplay_enqueue_native_command`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_codeplay_enqueue_native_command.asciidoc) extension. intel/llvm#14136 - Added initial support for [`sycl_ext_oneapi_free_function_kernels`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/proposed/sycl_ext_oneapi_free_function_kernels.asciidoc) extension. 
intel/llvm#13207 intel/llvm#13885 Known limitations: From 9eaed5a96d128d354617c1f07eaa75eafc7031a5 Mon Sep 17 00:00:00 2001 From: Alexey Sachkov Date: Thu, 19 Sep 2024 13:16:15 +0200 Subject: [PATCH 30/30] Apply suggestions from code review Co-authored-by: Steffen Larsen --- sycl/ReleaseNotes.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/sycl/ReleaseNotes.md b/sycl/ReleaseNotes.md index 56ec16cc29c6..63b3b8b29630 100644 --- a/sycl/ReleaseNotes.md +++ b/sycl/ReleaseNotes.md @@ -83,7 +83,7 @@ Release notes for commit range multiply-add operation performed. intel/llvm#13366 - Implemented revision 2 of [`sycl_ext_oneapi_group_sort`](https://github.com/intel/llvm/blob/ebb3b4a21b3b0e977f44434781729df7de83e436/sycl/doc/extensions/experimental/sycl_ext_oneapi_group_sort.asciidoc) - extension. intel/llvm14399 intel/llvm#14185 intel/llvm#13942 intel/llvm#13908 + extension. intel/llvm#14399 intel/llvm#14185 intel/llvm#13942 intel/llvm#13908 intel/llvm#14591 @@ -286,7 +286,7 @@ Release notes for commit range - Added support for 1- and 2-byte data types to ESIMD prefetch APIs. intel/llvm#13452 - Enabled `ext_intel_matrix` support for Intel GNR devices. intel/llvm#14436 -- Added support for 1x64x16 `bfloat16` matrices on PVC> intel/llvm#13391 +- Added support for 1x64x16 `bfloat16` matrices on PVC. intel/llvm#13391 - Added new overloads of `load_2d`, `store_2d` or `prefetch_2d` ESIMD APIs that accept compile-time properties. intel/llvm#13046 - Added support for `shift_group_left`, `shift_group_right`,