Skip to content
Merged
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@ This document describes an extension that adds pipes to SYCL. Pipes are first i

== Notice

Copyright (c) 2019-2021 Intel Corporation. All rights reserved.
Copyright (c) 2019-2023 Intel Corporation. All rights reserved.

== Status

Expand All @@ -49,10 +49,25 @@ Revision: 3
== Contact
Michael Kinsner, Intel (michael 'dot' kinsner 'at' intel 'dot' com)

== Contributors

Michael Kinsner, Intel +
Shuo Niu, Intel +
Bo Lei, Intel +
Marco Jacques, Intel +
Joe Garvey, Intel +
Aditi Kumaraswamy, Intel +
Robert Ho, Intel +
Sherry Yuan, Intel +
Peter Colberg, Intel +
Zibai Wang, Intel

== Dependencies

This extension is written against the SYCL 2020 specification, Revision 3.

It also depends on the `sycl_ext_oneapi_properties` extension.

The use of blocking pipe reads or writes requires support for https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/INTEL/SPV_INTEL_blocking_pipes.asciidoc[SPV_INTEL_blocking_pipes] if SPIR-V is used by an implementation.

== Overview
Expand Down Expand Up @@ -183,7 +198,7 @@ class pipe {
// Non-blocking
static DataT read( bool &Success );
static void write( const DataT &Data, bool &Success );

// Static members
using value_type = DataT;
size_t min_capacity = MinCapacity;
Expand Down Expand Up @@ -644,15 +659,15 @@ Automated mechanisms are possible to provide uniquification across calls, and co

The Intel FPGA experimental `pipe` class is implemented in `sycl/ext/intel/experimental/pipes.hpp` which is included in `sycl/ext/intel/fpga_extensions.hpp`.

In the experimental API version, read/write methods take in a property list as function argument, which can contain the latency control properties `latency_anchor_id` and/or `latency_constraint`.
In the experimental API version, the device side read/write methods take in a property list as function argument, which can contain the latency control properties `latency_anchor_id` and/or `latency_constraint`.

* `sycl::ext::intel::experimental::latency_anchor_id<N>`, where `N` is an integer: An ID to associate with the current read/write function call, which can then be referenced by other `latency_constraint` properties elsewhere in the program to define relative latency constaints. ID must be unique within the application, and a diagnostic is required if that condition is not met.
* `sycl::ext::intel::experimental::latency_constraint<A, B, C>`: A tuple of three values which cause the current read/write function call to act as an endpoint of a latency constraint relative to a specified `latency_anchor_id` defined by a different instruction.
** `A` is an integer: The ID of the target anchor defined on a different instruction through a `latency_anchor_id` property.
** `B` is an enum value: The type of control from the set {`latency_control_type::exact`, `latency_control_type::max`, `latency_control_type::min`}.
** `C` is an integer: The relative clock cycle difference between the target anchor and the current function call, that the constraint should infer subject to the type of the control (exact, max, min).

=== Synopsis
=== Device side pipe read/write

[source,c++]
----
Expand Down Expand Up @@ -687,9 +702,8 @@ template <int Target, latency_control_type Type, int Cycle>
inline constexpr latency_constraint_key::value_t<Target, Type, Cycle>
latency_constraint;

template <typename Name,
typename DataT,
size_t MinCapacity = 0>
template <class Name, class DataT, int32_t MinCapacity = 0,
class PropertiesT = decltype(oneapi::experimental::properties{})>
class pipe {
// Blocking
static DataT read();
Expand All @@ -716,7 +730,7 @@ class pipe {
} // namespace sycl::ext::intel::experimental
----

=== Usage
=== Latency Control example

[source,c++]
----
Expand Down Expand Up @@ -748,6 +762,69 @@ myQueue.submit([&](handler &cgh) {
});
----

== Host Side pipe read/write

If the read/write member functions of a pipe are called from the host side, a `sycl::queue` is added to the parameters. The `memory_order` parameter is also added to the parameters for future work.

[source,c++,Host pipe read write members,linenums]
----
template <class Name, class DataT, int32_t MinCapacity = 0,
class PropertiesT = decltype(oneapi::experimental::properties{})>
class pipe {
// Blocking
static _dataT read(queue &Q, memory_order Order = memory_order::seq_cst);
static void write(queue &Q, const _dataT &Data, memory_order Order = memory_order::seq_cst);
// Non-blocking
static _dataT read(queue &Q, bool &Success, memory_order Order = memory_order::seq_cst);
static void write(queue &Q, const _dataT &Data, bool &Success, memory_order Order = memory_order::seq_cst);
}
----

== Simple example of host-to-device write&read

[source,c++,First example,linenums]
----
using default_pipe_properties = decltype(sycl::ext::oneapi::experimental::properties(sycl::ext::intel::experimental::uses_valid<true>));

// Classes used to name the kernels
class TestTask;
class H2DPipeID;
class D2HPipeID;

using H2DPipe = sycl::ext::intel::experimental::pipe<H2DPipeID, int, 10, default_pipe_properties>;
using D2HPipe = sycl::ext::intel::experimental::pipe<D2HPipeID, int, 10, default_pipe_properties>;

struct BasicKernel {
void operator()() const {
auto a = H2DPipe::read();
D2HPipe::write(a+1);
}
};

int main() {
queue q(testconfig_selector{});
H2DPipe::write(q, 1);

  q.submit([&](handler &h) {
    h.single_task<TestTask>(BasicKernel{});
  });
auto b = D2HPipe::read(q);
std::cout << b << std::endl; // It should print 2;
}
----

== Issues for experimental API

. Although the memory_order parameter hasn't been used in the implementation, the choice of seq_cst for the default value of the `sycl::memory_order` parameter of the read/write functions is still open for discussion. While seq_cst is more consistent with C++ atomics, it is a change from how pipes work today, which is equivalent to memory_order::relaxed. Another consideration is that SYCL 2020 atomic_ref uses a third approach where the default must be specified as a template parameter of the class itself.
+
--
*RESOLUTION*: Not resolved. Still under discussion.
--

== Future work

. In the future, the `sycl::memory_order` parameter of read/write functions will control how other memory accesses, including regular, non-atomic memory accesses, are to be ordered around the pipe read/write operation. The default memory order is `sycl::memory_order::seq_cst`. Currently, `sycl::memory_order` parameter is defined but not being used in the implementation.

== Feature test macro

This extension provides a feature-test macro as described in the core SYCL
Expand Down Expand Up @@ -776,6 +853,7 @@ extension's APIs the implementation supports.
|2|2019-11-13|Michael Kinsner|Incorporate feedback
|3|2020-04-27|Michael Kinsner|Clarify that pipe operations behave as-if they are relaxed atomic operations. Make SYCL2020 the baseline
|4|2021-12-02|Shuo Niu|Add experimental latency control API
|5|2023-03-27|Zibai Wang|Experimental API change only. Add memory order parameter and compile-time properties. Add host pipe read/write functions.
|========================================

//************************************************************************
Expand Down