
Commit cc49ee7

[SYCL][NFC] Remove more explicit "cl::" references
The previous patch was split off from a bigger patch: I completely eliminated the "cl" namespace and then addressed the failures that showed up locally. Local testing didn't cover all possible platforms, so these occurrences were left untouched. This patch is a wider application of grep/sed, but it is still not complete, because some references to the "cl" namespace cannot be eliminated in an NFC change (e.g. everything affecting or affected by mangling, as in clang or some tools). I'm committing these changes separately to ease the review of the actual non-NFC PR later.
1 parent 6485ec4 commit cc49ee7


48 files changed, with 398 additions and 412 deletions.

clang/include/clang/Basic/AttrDocs.td

Lines changed: 2 additions & 2 deletions
@@ -378,7 +378,7 @@ outlining job:

 int foo(int x) { return ++x; }

-using namespace cl::sycl;
+using namespace sycl;
 queue Q;
 buffer<int, 1> a(range<1>{1024});
 Q.submit([&](handler& cgh) {
@@ -3790,7 +3790,7 @@ cannot be optimized out due to reachability analysis or by any other
 optimization.

 This attribute allows to pass name and address of the function to a special
-``cl::sycl::intel::get_device_func_ptr`` API call which extracts the device
+``sycl::intel::get_device_func_ptr`` API call which extracts the device
 function pointer for the specified function.

 .. code-block:: c++

llvm/lib/SYCLLowerIR/LowerWGScope.cpp

Lines changed: 1 addition & 1 deletion
@@ -780,7 +780,7 @@ PreservedAnalyses SYCLLowerWGScopePass::run(Function &F,
 I = I->getNextNode()) {
   auto *AllocaI = dyn_cast<AllocaInst>(I);
   // Allocas marked with "work_item_scope" are those originating from
-  // cl::sycl::private_memory<T> variables, which must be in private memory.
+  // sycl::private_memory<T> variables, which must be in private memory.
   // No shadows/materialization is needed for them because they can be
   // updated only within PFWIs
   if (AllocaI && !AllocaI->getMetadata(WI_SCOPE_MD))

sycl/doc/EnvironmentVariables.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ compiler and runtime.
 | Environment variable | Values | Description |
 | -------------------- | ------ | ----------- |
 | `SYCL_BE` (deprecated) | `PI_OPENCL`, `PI_LEVEL_ZERO`, `PI_CUDA` | Force SYCL RT to consider only devices of the specified backend during the device selection. We are planning to deprecate `SYCL_BE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
-| `SYCL_DEVICE_TYPE` (deprecated) | CPU, GPU, ACC, HOST | Force SYCL to use the specified device type. If unset, default selection rules are applied. If set to any unlisted value, this control has no effect. If the requested device type is not found, a `cl::sycl::runtime_error` exception is thrown. If a non-default device selector is used, a device must satisfy both the selector and this control to be chosen. This control only has effect on devices created with a selector. We are planning to deprecate `SYCL_DEVICE_TYPE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
+| `SYCL_DEVICE_TYPE` (deprecated) | CPU, GPU, ACC, HOST | Force SYCL to use the specified device type. If unset, default selection rules are applied. If set to any unlisted value, this control has no effect. If the requested device type is not found, a `sycl::runtime_error` exception is thrown. If a non-default device selector is used, a device must satisfy both the selector and this control to be chosen. This control only has effect on devices created with a selector. We are planning to deprecate `SYCL_DEVICE_TYPE` environment variable in the future. The specific grace period is not decided yet. Please use the new env var `SYCL_DEVICE_FILTER` instead. |
 | `SYCL_DEVICE_FILTER` | `backend:device_type:device_num` | See Section [`SYCL_DEVICE_FILTER`](#sycl_device_filter) below. |
 | `SYCL_DEVICE_ALLOWLIST` | See [below](#sycl_device_allowlist) | Filter out devices that do not match the pattern specified. `BackendName` accepts `host`, `opencl`, `level_zero` or `cuda`. `DeviceType` accepts `host`, `cpu`, `gpu` or `acc`. `DeviceVendorId` accepts uint32_t in hex form (`0xXYZW`). `DriverVersion`, `PlatformVersion`, `DeviceName` and `PlatformName` accept regular expression. Special characters, such as parenthesis, must be escaped. DPC++ runtime will select only those devices which satisfy provided values above and regex. More than one device can be specified using the piping symbol "\|".|
 | `SYCL_DISABLE_PARALLEL_FOR_RANGE_ROUNDING` | Any(\*) | Disables automatic rounding-up of `parallel_for` invocation ranges. |
@@ -107,7 +107,7 @@ variables in production code.</span>
 | `SYCL_DEVICELIB_INHIBIT_NATIVE` | String of device library extensions (separated by a whitespace) | Do not rely on device native support for devicelib extensions listed in this option. |
 | `SYCL_PROGRAM_COMPILE_OPTIONS` | String of valid OpenCL compile options | Override compile options for all programs. |
 | `SYCL_PROGRAM_LINK_OPTIONS` | String of valid OpenCL link options | Override link options for all programs. |
-| `SYCL_USE_KERNEL_SPV` | Path to the SPIR-V binary | Load device image from the specified file. If runtime is unable to read the file, `cl::sycl::runtime_error` exception is thrown.|
+| `SYCL_USE_KERNEL_SPV` | Path to the SPIR-V binary | Load device image from the specified file. If runtime is unable to read the file, `sycl::runtime_error` exception is thrown.|
 | `SYCL_DUMP_IMAGES` | Any(\*) | Dump device image binaries to file. Control has no effect if `SYCL_USE_KERNEL_SPV` is set. |
 | `SYCL_HOST_UNIFIED_MEMORY` | Integer | Enforce host unified memory support or lack of it for the execution graph builder. If set to 0, it is enforced as not supported by all devices. If set to 1, it is enforced as supported by all devices. |
 | `SYCL_CACHE_TRACE` | Any(\*) | If the variable is set, messages are sent to std::cerr when caching events or non-blocking failures happen (e.g. unable to access cache item file). |

sycl/doc/FAQ.md

Lines changed: 1 addition & 1 deletion
@@ -128,7 +128,7 @@ C:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\ucrt\crtdbg.h(607,26
 > beyond those explicitly mentioned as usable in kernels in this spec.

 Replace usage of STD built-ins with SYCL-defined math built-ins. Please, note
-that you have to explicitly specify built-in namespace (i.e. `cl::sycl::fmin`).
+that you have to explicitly specify built-in namespace (i.e. `sycl::fmin`).
 The full list of SYCL math built-ins is provided in section 4.13.3 of the
 specification.

sycl/doc/GetStartedGuide.md

Lines changed: 23 additions & 23 deletions
@@ -567,29 +567,29 @@ Creating a file `simple-sycl-app.cpp` with the following C++/SYCL code:

 int main() {
   // Creating buffer of 4 ints to be used inside the kernel code
-  cl::sycl::buffer<cl::sycl::cl_int, 1> Buffer(4);
+  sycl::buffer<sycl::cl_int, 1> Buffer(4);

   // Creating SYCL queue
-  cl::sycl::queue Queue;
+  sycl::queue Queue;

   // Size of index space for kernel
-  cl::sycl::range<1> NumOfWorkItems{Buffer.size()};
+  sycl::range<1> NumOfWorkItems{Buffer.size()};

   // Submitting command group(work) to queue
-  Queue.submit([&](cl::sycl::handler &cgh) {
+  Queue.submit([&](sycl::handler &cgh) {
     // Getting write only access to the buffer on a device
-    auto Accessor = Buffer.get_access<cl::sycl::access::mode::write>(cgh);
+    auto Accessor = Buffer.get_access<sycl::access::mode::write>(cgh);
     // Executing kernel
     cgh.parallel_for<class FillBuffer>(
-        NumOfWorkItems, [=](cl::sycl::id<1> WIid) {
+        NumOfWorkItems, [=](sycl::id<1> WIid) {
           // Fill buffer with indexes
-          Accessor[WIid] = (cl::sycl::cl_int)WIid.get(0);
+          Accessor[WIid] = (sycl::cl_int)WIid.get(0);
         });
   });

   // Getting read only access to the buffer on the host.
   // Implicit barrier waiting for queue to complete the work.
-  const auto HostAccessor = Buffer.get_access<cl::sycl::access::mode::read>();
+  const auto HostAccessor = Buffer.get_access<sycl::access::mode::read>();

   // Check the results
   bool MismatchFound = false;
@@ -714,36 +714,36 @@ SYCL_BE=PI_CUDA ./simple-sycl-app-cuda.exe
 ```

 **NOTE**: DPC++/SYCL developers can specify SYCL device for execution using
-device selectors (e.g. `cl::sycl::cpu_selector`, `cl::sycl::gpu_selector`,
+device selectors (e.g. `sycl::cpu_selector`, `sycl::gpu_selector`,
 [Intel FPGA selector(s)](extensions/supported/sycl_ext_intel_fpga_device_selector.md)) as
 explained in following section [Code the program for a specific
 GPU](#code-the-program-for-a-specific-gpu).

 ### Code the program for a specific GPU

-To specify OpenCL device SYCL provides the abstract `cl::sycl::device_selector`
+To specify OpenCL device SYCL provides the abstract `sycl::device_selector`
 class which the can be used to define how the runtime should select the best
 device.

-The method `cl::sycl::device_selector::operator()` of the SYCL
-`cl::sycl::device_selector` is an abstract member function which takes a
+The method `sycl::device_selector::operator()` of the SYCL
+`sycl::device_selector` is an abstract member function which takes a
 reference to a SYCL device and returns an integer score. This abstract member
 function can be implemented in a derived class to provide a logic for selecting
 a SYCL device. SYCL runtime uses the device for with the highest score is
-returned. Such object can be passed to `cl::sycl::queue` and `cl::sycl::device`
+returned. Such object can be passed to `sycl::queue` and `sycl::device`
 constructors.

-The example below illustrates how to use `cl::sycl::device_selector` to create
+The example below illustrates how to use `sycl::device_selector` to create
 device and queue objects bound to Intel GPU device:

 ```c++
 #include <sycl/sycl.hpp>

 int main() {
-  class NEOGPUDeviceSelector : public cl::sycl::device_selector {
+  class NEOGPUDeviceSelector : public sycl::device_selector {
   public:
-    int operator()(const cl::sycl::device &Device) const override {
-      using namespace cl::sycl::info;
+    int operator()(const sycl::device &Device) const override {
+      using namespace sycl::info;

       const std::string DeviceName = Device.get_info<device::name>();
       const std::string DeviceVendor = Device.get_info<device::vendor>();
@@ -754,9 +754,9 @@ int main() {

   NEOGPUDeviceSelector Selector;
   try {
-    cl::sycl::queue Queue(Selector);
-    cl::sycl::device Device(Selector);
-  } catch (cl::sycl::invalid_parameter_error &E) {
+    sycl::queue Queue(Selector);
+    sycl::device Device(Selector);
+  } catch (sycl::invalid_parameter_error &E) {
     std::cout << E.what() << std::endl;
   }
 }
@@ -767,10 +767,10 @@ The device selector below selects an NVIDIA device only, and won't execute if
 there is none.

 ```c++
-class CUDASelector : public cl::sycl::device_selector {
+class CUDASelector : public sycl::device_selector {
 public:
-  int operator()(const cl::sycl::device &Device) const override {
-    using namespace cl::sycl::info;
+  int operator()(const sycl::device &Device) const override {
+    using namespace sycl::info;
     const std::string DriverVersion = Device.get_info<device::driver_version>();

     if (Device.is_gpu() && (DriverVersion.find("CUDA") != std::string::npos)) {

sycl/doc/MultiTileCardWithLevelZero.md

Lines changed: 2 additions & 2 deletions
@@ -46,8 +46,8 @@ The root-device in such cases can be partitioned to sub-devices, each correspond
 ``` C++
 try {
   vector<device> SubDevices = RootDevice.create_sub_devices<
-      cl::sycl::info::partition_property::partition_by_affinity_domain>(
-      cl::sycl::info::partition_affinity_domain::next_partitionable);
+      sycl::info::partition_property::partition_by_affinity_domain>(
+      sycl::info::partition_affinity_domain::next_partitionable);
 }
 ```

sycl/doc/design/CompilerAndRuntimeDesign.md

Lines changed: 4 additions & 4 deletions
@@ -90,7 +90,7 @@ work:
 int foo(int x) { return ++x; }
 int bar(int x) { throw std::exception{"CPU code only!"}; }
 ...
-using namespace cl::sycl;
+using namespace sycl;
 queue Q;
 buffer<int, 1> a{range<1>{1024}};
 Q.submit([&](handler& cgh) {
@@ -103,17 +103,17 @@ Q.submit([&](handler& cgh) {
 ```

 In this example, the compiler needs to compile the lambda expression passed
-to the `cl::sycl::handler::parallel_for` method, as well as the function `foo`
+to the `sycl::handler::parallel_for` method, as well as the function `foo`
 called from the lambda expression for the device.

 The compiler must also ignore the `bar` function when we compile the
 "device" part of the single source code, as it's unused inside the device
 portion of the source code (the contents of the lambda expression passed to the
-`cl::sycl::handler::parallel_for` and any function called from this lambda
+`sycl::handler::parallel_for` and any function called from this lambda
 expression).

 The current approach is to use the SYCL kernel attribute in the runtime to
-mark code passed to `cl::sycl::handler::parallel_for` as "kernel functions".
+mark code passed to `sycl::handler::parallel_for` as "kernel functions".
 The runtime library can't mark foo as "device" code - this is a compiler
 job: to traverse all symbols accessible from kernel functions and add them to
 the "device part" of the code marking them with the new SYCL device attribute.

sycl/doc/design/KernelParameterPassing.md

Lines changed: 4 additions & 4 deletions
@@ -60,7 +60,7 @@ int main()

 myQueue.submit([&](handler &cgh) {
   auto outAcc = outBuf.get_access<access::mode::write>(cgh);
-  cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+  cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
     outAcc[index] = i + s.m;
   });
 });
@@ -192,7 +192,7 @@ are copied into the array within the local capture object.

 myQueue.submit([&](handler &cgh) {
   auto outAcc = outBuf.get_access<access::mode::write>(cgh);
-  cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+  cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
     outAcc[index] = array[index.get(0)];
   });
 });
@@ -264,7 +264,7 @@ of each accessor array element in ascending index value.
     in_buffer2.get_access<access::mode::read>(cgh)};
   auto outAcc = out_buffer.get_access<access::mode::write>(cgh);

-  cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+  cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
     outAcc[index] = inAcc[0][index] + inAcc[1][index];
   });
 });
@@ -356,7 +356,7 @@ in a manner similar to other instances of accessor arrays.
   };
   auto outAcc = out_buffer.get_access<access::mode::write>(cgh);

-  cgh.parallel_for<class Worker>(num_items, [=](cl::sycl::id<1> index) {
+  cgh.parallel_for<class Worker>(num_items, [=](sycl::id<1> index) {
     outAcc[index] = s.m + s.inAcc[0][index] + s.inAcc[1][index];
   });
 });

sycl/doc/design/LinkedAllocations.md

Lines changed: 8 additions & 8 deletions
@@ -9,33 +9,33 @@ Instead, memory is allocated in each context whenever the SYCL memory object
 is first accessed there:

 ```
-cl::sycl::buffer<int, 1> buf{cl::sycl::range<1>(1)}; // No allocation here
+sycl::buffer<int, 1> buf{sycl::range<1>(1)}; // No allocation here

-cl::sycl::queue q;
-q.submit([&](cl::sycl::handler &cgh){
+sycl::queue q;
+q.submit([&](sycl::handler &cgh){
   // First access to buf in q's context: allocate memory
-  auto acc = buf.get_access<cl::sycl::access::mode::read_write>(cgh);
+  auto acc = buf.get_access<sycl::access::mode::read_write>(cgh);
   ...
 });

 // First access to buf on host (assuming q is not host): allocate memory
-auto acc = buf.get_access<cl::sycl::access::mode::read_write>();
+auto acc = buf.get_access<sycl::access::mode::read_write>();
 ```

 In the DPCPP execution graph these allocations are represented by allocation
-command nodes (`cl::sycl::detail::AllocaCommand`). A finished allocation
+command nodes (`sycl::detail::AllocaCommand`). A finished allocation
 command means that the associated memory object is ready for its first use in
 that context, but for host allocation commands it might be the case that no
 actual memory allocation takes place: either because it is possible to reuse the
 data pointer provided by the user:

 ```
 int val;
-cl::sycl::buffer<int, 1> buf{&val, cl::sycl::range<1>(1)};
+sycl::buffer<int, 1> buf{&val, sycl::range<1>(1)};

 // An alloca command is created, but it does not allocate new memory: &val
 // is reused instead.
-auto acc = buf.get_access<cl::sycl::access::mode::read_write>();
+auto acc = buf.get_access<sycl::access::mode::read_write>();
 ```

 Or because a mapped host pointer obtained from a native device memory object

sycl/doc/design/OptionalDeviceFeatures.md

Lines changed: 5 additions & 5 deletions
@@ -403,9 +403,9 @@ name. The format looks like this:

 ```
 !intel_types_that_use_aspects = !{!0, !1, !2}
-!0 = !{!"class.cl::sycl::detail::half_impl::half", i32 8}
-!1 = !{!"class.cl::sycl::amx_type", i32 9}
-!2 = !{!"class.cl::sycl::other_type", i32 8, i32 9}
+!0 = !{!"class.sycl::detail::half_impl::half", i32 8}
+!1 = !{!"class.sycl::amx_type", i32 9}
+!2 = !{!"class.sycl::other_type", i32 8, i32 9}
 ```

 The value of the `!intel_types_that_use_aspects` metadata is a list of unnamed
@@ -415,8 +415,8 @@ starts with a string giving the name of the type which is followed by a list of
 `i32` constants where each constant is a value from `enum class aspect` telling
 the numerical value of an aspect from the type's
 `[[sycl_detail::uses_aspects()]]` attribute. In the example above, the type
-`cl::sycl::detail::half_impl::half` uses an aspect whose numerical value is
-`8` and the type `cl::sycl::other_type` uses two aspects `8` and `9`.
+`sycl::detail::half_impl::half` uses an aspect whose numerical value is
+`8` and the type `sycl::other_type` uses two aspects `8` and `9`.

 **NOTE**: The reason we choose this representation is because LLVM IR does not
 allow metadata to be attached directly to types. This representation works
