Skip to content

[SYCL][FPGA] Minor fix to the fpga_lsu header #2233

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 20, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 25 additions & 25 deletions sycl/doc/extensions/IntelFPGA/FPGALsu.md
Original file line number Diff line number Diff line change
@@ -1,41 +1,41 @@

# FPGA lsu

The Intel FPGA `lsu` class is implemented in `CL/sycl/intel/fpga_lsu.hpp` which
is included in `CL/sycl/intel/fpga_extensions.hpp`.
The Intel FPGA `lsu` class is implemented in `CL/sycl/INTEL/fpga_lsu.hpp` which
is included in `CL/sycl/INTEL/fpga_extensions.hpp`.

The class `cl::sycl::intel::lsu` allows users to explicitly request that the
The class `cl::sycl::INTEL::lsu` allows users to explicitly request that the
implementation of a global memory access is configured in a certain way. The
class has two member functions, `load()` and `store()` which allow loading from
and storing to a `global_ptr`, respectively, and is templated on the following
4 optional paremeters:

1. **`cl::sycl::intel::burst_coalesce<B>`, where `B` is a boolean**: request,
1. **`cl::sycl::INTEL::burst_coalesce<B>`, where `B` is a boolean**: request,
to the extent possible, that a dynamic burst coalescer be implemented when
`load` or `store` are called. The default value of this parameter is `false`.
2. **`cl::sycl::intel::cache<N>`, where `N` is an integer greater or equal to
2. **`cl::sycl::INTEL::cache<N>`, where `N` is an integer greater or equal to
0**: request, to the extent possible, that a read-only cache of the specified
size in bytes be implemented when when `load` is called. It is not allowed to
use that parameter for `store`. The default value of this parameter is `0`.
3. **`cl::sycl::intel::statically_coalesce<N>`, where `B` is a boolean**:
3. **`cl::sycl::INTEL::statically_coalesce<B>`, where `B` is a boolean**:
request, to the extent possible, that `load` or `store` accesses, is allowed to
be statically coalesced with other memory accesses at compile time. The default
value of this parameter is `true`.
4. **`cl::sycl::intel::prefetch<B>`, where `N` is a boolean**: request, to the
4. **`cl::sycl::INTEL::prefetch<B>`, where `B` is a boolean**: request, to the
extent possible, that a prefetcher be implemented when `load` is called. It is
not allowed to use that parameter for `store`. The default value of this
parameter is `false`.

Currently, not every combination of parameters is allowed due to limitations in
the backend. The following rules apply:
1. For `store`, `cl::sycl::intel::cache` must be `0` and
`cl::sycl::intel::prefetch` must be `false`.
2. For `load`, if `cl::sycl::intel::cache` is set to a value greater than `0`,
then `cl::sycl::intel::burst_coalesce` must be set to `true`.
3. For `load`, exactly one of `cl::sycl::intel::prefetch` and
`cl::sycl::intel::burst_coalesce` is allowed to be `true`.
4. For `load`, exactly one of `cl::sycl::intel::prefetch` and
`cl::sycl::intel::cache` is allowed to be `true`.
1. For `store`, `cl::sycl::INTEL::cache` must be `0` and
`cl::sycl::INTEL::prefetch` must be `false`.
2. For `load`, if `cl::sycl::INTEL::cache` is set to a value greater than `0`,
then `cl::sycl::INTEL::burst_coalesce` must be set to `true`.
3. For `load`, exactly one of `cl::sycl::INTEL::prefetch` and
`cl::sycl::INTEL::burst_coalesce` is allowed to be `true`.
4. For `load`, exactly one of `cl::sycl::INTEL::prefetch` and
`cl::sycl::INTEL::cache` is allowed to be `true`.

## Implementation

Expand All @@ -47,7 +47,7 @@ template <class... mem_access_params> class lsu final {
public:
lsu() = delete;

template <typename T> static T &load(sycl::global_ptr<T> Ptr) {
template <typename T> static T load(sycl::global_ptr<T> Ptr) {
check_load();
#if defined(__SYCL_DEVICE_ONLY__) && __has_builtin(__builtin_intel_fpga_mem)
return *__builtin_intel_fpga_mem((T *)Ptr,
Expand Down Expand Up @@ -77,7 +77,7 @@ public:
## Usage

```c++
#include <CL/sycl/intel/fpga_extensions.hpp>
#include <CL/sycl/INTEL/fpga_extensions.hpp>
...
cl::sycl::buffer<int, 1> output_buffer(output_data, 1);
cl::sycl::buffer<int, 1> input_buffer(input_data, 1);
Expand All @@ -91,19 +91,19 @@ Queue.submit([&](cl::sycl::handler &cgh) {
auto output_ptr = output_accessor.get_pointer();

using PrefetchingLSU =
cl::sycl::intel::lsu<cl::sycl::intel::prefetch<true>,
cl::sycl::intel::statically_coalesce<false>>;
cl::sycl::INTEL::lsu<cl::sycl::INTEL::prefetch<true>,
cl::sycl::INTEL::statically_coalesce<false>>;

using BurstCoalescedLSU =
cl::sycl::intel::lsu<cl::sycl::intel::burst_coalesce<false>,
cl::sycl::intel::statically_coalesce<false>>;
cl::sycl::INTEL::lsu<cl::sycl::INTEL::burst_coalesce<false>,
cl::sycl::INTEL::statically_coalesce<false>>;

using CachingLSU =
cl::sycl::intel::lsu<cl::sycl::intel::burst_coalesce<true>,
cl::sycl::intel::cache<1024>,
cl::sycl::intel::statically_coalesce<true>>;
cl::sycl::INTEL::lsu<cl::sycl::INTEL::burst_coalesce<true>,
cl::sycl::INTEL::cache<1024>,
cl::sycl::INTEL::statically_coalesce<true>>;

using PipelinedLSU = cl::sycl::intel::lsu<>;
using PipelinedLSU = cl::sycl::INTEL::lsu<>;

int X = PrefetchingLSU::load(input_ptr); // int X = input_ptr[0]
int Y = CachingLSU::load(input_ptr + 1); // int Y = input_ptr[1]
Expand Down
4 changes: 2 additions & 2 deletions sycl/include/CL/sycl/INTEL/fpga_lsu.hpp
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
//==-------------- fpga_lsu.hpp --- SYCL FPGA Reg Extensions ---------------==//
//==-------------- fpga_lsu.hpp --- SYCL FPGA LSU Extensions ---------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
Expand Down Expand Up @@ -47,7 +47,7 @@ template <class... mem_access_params> class lsu final {
public:
lsu() = delete;

template <typename T> static T &load(sycl::global_ptr<T> Ptr) {
template <typename T> static T load(sycl::global_ptr<T> Ptr) {
check_load();
#if defined(__SYCL_DEVICE_ONLY__) && __has_builtin(__builtin_intel_fpga_mem)
return *__builtin_intel_fpga_mem((T *)Ptr,
Expand Down