[SYCL] Release notes for June'20 DPCPP implementation update #1948

Merged on Jun 23, 2020 (2 commits)
4 changes: 2 additions & 2 deletions sycl/CMakeLists.txt
@@ -8,10 +8,10 @@ set(CMAKE_CXX_EXTENSIONS OFF)
option(SYCL_ENABLE_WERROR "Treat all warnings as errors in SYCL project" OFF)
option(SYCL_ADD_DEV_VERSION_POSTFIX "Adds -V postfix to version string" ON)

-set(SYCL_MAJOR_VERSION 1)
+set(SYCL_MAJOR_VERSION 2)
set(SYCL_MINOR_VERSION 0)
set(SYCL_PATCH_VERSION 0)
-set(SYCL_DEV_ABI_VERSION 1)
+set(SYCL_DEV_ABI_VERSION 0)
if (SYCL_ADD_DEV_VERSION_POSTFIX)
set(SYCL_VERSION_POSTFIX "-${SYCL_DEV_ABI_VERSION}")
endif()
154 changes: 154 additions & 0 deletions sycl/ReleaseNotes.md
@@ -1,3 +1,157 @@
# June'20 release notes

Release notes for the commit range ba404be..24726df

## New features
- Added a switch to assume that the number of work-items in each ND-range
dimension is less than 2G (fits in a signed integer), which allows the
underlying backends to perform additional optimizations [fdcaeae] [08f8656]
- Added partial support for [Host task with interop capabilities extension](https://github.com/codeplaysoftware/standards-proposals/blob/master/host_task/host_task.md) [ae3fd5c]
- Added support for [SYCL_INTEL_bitcast](doc/extensions/Bitcast/SYCL_INTEL_bitcast.asciidoc)
as a `sycl::detail::bit_cast` [e3da4ef]
- Introduced the Level Zero plugin, which enables SYCL to work on top of the
Level Zero API. Interoperability is not supported yet [d32da99]
- Implemented [parallel_for simplification extension](doc/extensions/ParallelForSimpification) [13fe9fb]
- Implemented [SYCL_INTEL_enqueue_barrier extension](doc/extensions/EnqueueBarrier/enqueue_barrier.asciidoc) [da6bfd0]
- Implemented [SYCL_INTEL_accessor_simplification extension](https://github.com/intel/llvm/pull/1498) [1f76efc]
- Implemented OpenCL interoperability API following [SYCL Generalization proposal](https://github.com/KhronosGroup/SYCL-Shared/blob/master/proposals/sycl_generalization.md) [bae0639]
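
For readers unfamiliar with the bitcast item above: a bit cast reinterprets the object representation of a value as another type of the same size. The sketch below shows the idea in standard C++ with `std::memcpy`. It is an illustration only, not the DPC++ implementation; `bit_cast_sketch` is a hypothetical name, and the sketch assumes the destination type is default-constructible.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper for illustration; not the DPC++ implementation.
// Copies the bytes of `from` into a freshly created object of type To.
template <typename To, typename From>
To bit_cast_sketch(const From &from) {
  static_assert(sizeof(To) == sizeof(From),
                "source and destination must have the same size");
  static_assert(std::is_trivially_copyable<From>::value,
                "source type must be trivially copyable");
  static_assert(std::is_trivially_copyable<To>::value,
                "destination type must be trivially copyable");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}
```

On a platform with 32-bit IEEE 754 `float`, `bit_cast_sketch<std::uint32_t>(1.0f)` yields `0x3F800000`.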

## Improvements
### SYCL Frontend and driver changes
- When the `-fintelfpga` option is passed, the dependency file is now created
in the temporary files location instead of the input source file location [7df381a]
- Made `-std=c++17` the default for DPC++ [3192ee7]
- Added support for kernel name types templated using enumerations
[e7020a1][f9226d2][125b05c][07e8d8f]
- Added a diagnostic on attempt to use host built-in functions inside device
code [2a4c1c8]
- Added diagnostics on attempt to use `sycl::accessor` created for
unsupported types in the device code [6da42a0]
- Aligned `sizeof(long double)` between host and device code [87e6240]
- The pragma spelling for SYCL-specific attributes except
`cl::reqd_work_group_size` is rejected now [8fe2846]
- Added template parameter support for `cl::intel_reqd_sub_group_size`
attribute [0ae9729]
- Added support for more math builtins for PTX target [9370549]
- Added support for struct members and pointers in `intelfpga::ivdep`
attribute [358ec04]
- Added support for all builtins from integer and shared categories for PTX
target [f0a4fe2]
- Improved handling of linker inputs for static lib processing [ed2846f]
- Dependency files are not generated by default when compiling using
`-fsycl -fintelfpga` options now [24726df]
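
As an illustration of the kernel-name item above, the sketch below shows a name type parameterized by an enumerator. It uses plain C++ so it compiles without a SYCL toolchain: `Stage`, `StageKernel`, `run_named` and `run_both` are hypothetical names, and `run_named` only stands in for a SYCL submission call that takes the kernel name as a template argument.

```cpp
#include <type_traits>

// Hypothetical enumeration used to parameterize kernel names.
enum class Stage { Load, Compute };

// Name type: declared only, never defined. Each enumerator gives a distinct type.
template <Stage S> class StageKernel;

static_assert(!std::is_same<StageKernel<Stage::Load>,
                            StageKernel<Stage::Compute>>::value,
              "each enumerator yields a distinct kernel name type");

// Stand-in for a SYCL call that takes the kernel name as a template argument.
// It simply invokes the callable on the host.
template <typename KernelName, typename Func>
auto run_named(Func f) -> decltype(f()) {
  return f();
}

// Runs two callables under two different enum-templated names.
inline int run_both() {
  int a = run_named<StageKernel<Stage::Load>>([] { return 1; });
  int b = run_named<StageKernel<Stage::Compute>>([] { return 2; });
  return a + b;
}
```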

### SYCL headers and runtime
- Updated the implementation to align with changes in
[SubGroup extension](doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc) [9d4c284]
- `sycl::ordered_queue` class has been removed [875347a]
- Added support for rounding modes for floating-point and integer types in
`sycl::vec::convert` [096d0a0]
- Added support for USM vars and placeholder accessors passed to reduction
version of `sycl::handler::parallel_for` [94cb022][2e73da7]
- Added support of `sycl::intel::sub_group::load/store` which take
`sycl::multi_ptr` with `sycl::access::address_space::local_space` [0f5b55b]
- Added a diagnostic on attempt to recompile an AOT compiled program using
`sycl::program` API [b031186]
- Started using custom CUDA context by default as it shows better performance
results [9d45ead]
- Prevented the NVIDIA OpenCL platform from being selected by a SYCL
application [7146426]
- Adjusted the diagnostic message on attempt to use local size which is
greater than global size to be more informative [894c10d]
- Added a cache for PI plugins, so subsequent calls for `sycl::device`
creation should be cheaper [03dd60d]
- A SYCL program will be aborted now if program linking is requested when
using L0 plugin. This is done because L0 doesn't support program linking
[d4a5b71]
- Added a diagnostic on attempt to use `sycl::program::set_spec_constant` when
the program is already in compiled or linked state [e2e3d3d]
- Improved `sycl::stream` class implementation on the device side in order to
reduce local memory consumption [b838f0e]
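
To make the rounding-mode item above concrete, here is a host-side sketch of what the four explicit SYCL rounding modes (`rte`, `rtz`, `rtp`, `rtn`) mean for a float-to-int conversion. It is not the `sycl::vec::convert` implementation; `rounding_sketch`, `round_half_even` and `convert_sketch` are hypothetical names, and the sketch assumes the input fits in an `int`.

```cpp
#include <cmath>

// Hypothetical mirror of the explicit SYCL rounding modes.
enum class rounding_sketch { rte, rtz, rtp, rtn };

// Round to nearest, ties to even, independent of the floating-point environment.
inline float round_half_even(float x) {
  float lower = std::floor(x);
  float diff = x - lower;
  if (diff < 0.5f) return lower;
  if (diff > 0.5f) return lower + 1.0f;
  // Exactly halfway: pick the even neighbor.
  return std::fmod(lower, 2.0f) == 0.0f ? lower : lower + 1.0f;
}

// Converts x to int using the requested rounding mode.
inline int convert_sketch(float x, rounding_sketch mode) {
  switch (mode) {
  case rounding_sketch::rte: return static_cast<int>(round_half_even(x));
  case rounding_sketch::rtz: return static_cast<int>(std::trunc(x)); // toward zero
  case rounding_sketch::rtp: return static_cast<int>(std::ceil(x));  // toward +inf
  case rounding_sketch::rtn: return static_cast<int>(std::floor(x)); // toward -inf
  }
  return 0; // not reached for valid modes
}
```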

### Documentation
- Added [a table](doc/extensions/README.md) with DPC++ extensions statuses
[dbbc474]
- OpenCL CPU runtime installation instructions in
[GetStartedGuide](doc/GetStartedGuide.md) and the installation script have
been improved [9aa5029]
- The [SYCL_INTEL_sub_group extension document](doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc)
has been updated [010f112]
- Render user API classes on a dedicated page [98b6ee4]

## Bug fixes
### SYCL Frontend and driver changes
- Fixed passing of device code compile options, which could lead to a
`CL_INVALID_COMPILER_OPTIONS` error [57bad9e]
- Fixed an issue with creating a queue for an FPGA device
as a global inline variable [357e9c8]
- Fixed an issue where functions marked with `SYCL_EXTERNAL` did not
participate in attribute propagation and conflicting attribute checking
[0098eab]
- Fixed an issue which could lead to problems when a kernel name contains a
CVR qualified type [62e2f3b]
- Fixed file processing when using `-fsycl-link`, now the generated object
file can be linked by a non-SYCL enabled compiler/linker [2623abe]

### SYCL headers and runtime
- Fixed an issue with map/unmap events which caused problems with read only
buffer accessors in CUDA backend [bf1b5b6]
- Fixed errors that occurred when using `sycl::handler::copy` with `const void*`,
`void*` or a `sycl::accessor` for a type with const qualifier [ddc0c9d]
- Fixed an issue with copying memory to itself during `sycl::buffer` copyback
[4bf22cc]
- Fixed a possible deadlock which could happen when simultaneously submitting
and waiting for kernels from multiple threads on Windows [ebace77]
- Fixed a problem which caused a device with a negative score to still be
selected [7146426][855d214]
- Fixed a memory leak which happened when using `sycl::program::get_kernel`
[ccefc93]
- Fixed memory copy being wrongly asynchronous which could cause data races
on CUDA backend [4f0a3df]
- Fixed a race which could happen when waiting for the same event from
multiple threads [5737ad9]
- Fixed errors which happened when using `half` or `double` types in reduction
version of `sycl::handler::parallel_for`
- Fixed `sycl::device::get_info<sycl::info::device::mem_base_addr_align>`
query which was returning incorrect result for CUDA plugin [a6d03f3]
- Fixed incorrect behavior of a `sycl::buffer` created with non-writable host
data (e.g. `const int *`) on CUDA backend [49b6223]
- Several fixes to the reduction version of `sycl::handler::parallel_for`:
    - Enabled `operator*`, `operator+`, `operator|`, `operator&`, `operator^=`
      for corresponding transparent functors used in reduction
    - Fixed the case when a reduction object is passed as an R-value
    - Allowed identity-less constructors for reductions with transparent
      functors
    - Replaced some `auto` declarations with `Reduction::result_type` and added
      intermediate assignments/casts to avoid type ambiguities caused by using
      the `sycl::half` type, which may also be caused by custom/user types
    - Fixed compile-time known identity values for `MIN` and `MAX` reductions
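
For context on the identity-related fixes above: the identity of a reduction operator is the value that leaves any operand unchanged, so for `MIN` it is the largest representable value and for `MAX` the smallest. The sketch below illustrates this on the host in standard C++14. It is not the DPC++ reduction code; `min_identity`, `max_identity` and `reduce_sketch` are hypothetical names.

```cpp
#include <algorithm>
#include <functional>
#include <limits>
#include <vector>

// Identity for MIN: +infinity when the type has one, otherwise the largest value.
template <typename T> constexpr T min_identity() {
  return std::numeric_limits<T>::has_infinity
             ? std::numeric_limits<T>::infinity()
             : std::numeric_limits<T>::max();
}

// Identity for MAX: -infinity when the type has one, otherwise the lowest value.
template <typename T> constexpr T max_identity() {
  return std::numeric_limits<T>::has_infinity
             ? -std::numeric_limits<T>::infinity()
             : std::numeric_limits<T>::lowest();
}

// Sequential reduction that starts from the identity, so an empty input
// returns the identity itself.
template <typename T, typename BinaryOp>
T reduce_sketch(const std::vector<T> &data, T identity, BinaryOp op) {
  T acc = identity;
  for (const T &v : data)
    acc = op(acc, v);
  return acc;
}
```

A transparent functor such as `std::plus<>` can be passed directly, for example `reduce_sketch<int>({1, 2, 3}, 0, std::plus<>{})`.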

## API/ABI breakages
- All functions related to `sycl::ordered_queue` have been removed
- Removed symbols corresponding to
`sycl::info::kernel_sub_group::max_sub_group_size_for_ndrange` and
`sycl::info::kernel_sub_group::sub_group_count_for_ndrange` queries

## Known issues
- [new] If the attribute `cl::intel_reqd_sub_group_size` is specified with the
same value for a kernel and for a function called from that kernel, a
compilation error can still occur.
- The format of the object files produced by the compiler can change between
versions. The workaround is to rebuild the application.
- The SYCL library doesn't guarantee stable API/ABI, so applications compiled
with an older version of the SYCL library may not work with a new one.
The workaround is to rebuild the application.
[ABI policy guide](doc/ABIPolicyGuide.md)
- Using `cl::sycl::program` API to refer to a kernel defined in another
translation unit leads to undefined behavior
- Linkage errors with the following message:
`error LNK2005: "bool const std::_Is_integral<bool>" (??$_Is_integral@_N@std@@3_NB) already defined`
can happen when a SYCL application is built using MS Visual Studio 2019
version below 16.3.0.
The workaround is to enable `-std=c++17` for the failing MSVC version.

# May'20 release notes

Release notes for the commit range ba404be..67d3d9e