Description
I have two example snippets that each write to a single value in a buffer twice, using two kernels, and use the accessor functionality to read on the host and write on the device. (I believe both snippets are legal SYCL code, but please do correct me if I am wrong or making incorrect assumptions.)
The first doesn't work but the second does. The only difference (from a user perspective) is the braces { } around each submit call and its host accessor, which I believe force a wait/synchronization event in SYCL (perhaps I am misunderstanding, however). Both snippets work with the old scheduler when -DSCHEDULER_10 is passed to the compiler, which leads me to think the problem is less the legality of the two examples and more some incorrect synchronization event.
Tested with the following command on an unaltered top of tree (as of May 14th):
$ISYCL_BIN_DIR/clang++ -std=c++11 -fsycl scheduler_2_buffer_block.cpp -o scheduler_2_buffer_block -lOpenCL
I tinkered with this for a while. From what I've found:
- If the get_access inside the second kernel is just a read accessor, it won't block.
- It seems to block when waiting on the second kernel to complete. However, it doesn't appear to be the kernel that blocks; it seems to be a dependent event generated from the clEnqueueUnmapMemObject invocation in memory_manager.cpp (you can comment out the contents of unmap and the non-working snippet should work).
- You can replace the second host-side get_access with a queue wait and it'll still block.
- As far as the OpenCL runtime (and I) can tell, the OpenCL events generated aren't erroneous.
Before I dig any deeper, I thought it might be worth finding out whether this is a bug or a misconception/silliness on my end, and whether you guys are already aware of it and working on it!
Invalid (blocks when waiting for the second kernel submission):
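#include <CL/sycl.hpp>
#include <iostream>
using namespace cl::sycl;

int main() {
  cl::sycl::queue q;
  cl::sycl::buffer<int, 1> ob((int[1]){0}, 1);

  // First kernel: increment the single buffer element on the device.
  q.submit([&](handler &cgh) {
    auto wb = ob.get_access<access::mode::read_write>(cgh);
    cgh.single_task<class k1>([=]() {
      wb[0] += 1;
    });
  });
  // Host accessor: read the result of k1. rb stays alive until the end of main.
  auto rb = ob.get_access<access::mode::read>();
  std::cout << rb[0] << "\n";

  // Second kernel: submitted while rb is still alive.
  q.submit([&](handler &cgh) {
    auto wb = ob.get_access<access::mode::read_write>(cgh);
    cgh.single_task<class k2>([=]() {
      wb[0] += 1;
    });
  });
  // The program blocks here, waiting on the second kernel.
  auto rb2 = ob.get_access<access::mode::read>();
  std::cout << rb2[0] << "\n";
  return 0;
}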
Valid (does not block):
#include <CL/sycl.hpp>
#include <iostream>
using namespace cl::sycl;

int main() {
  cl::sycl::queue q;
  cl::sycl::buffer<int, 1> ob((int[1]){0}, 1);

  {
    // First kernel: increment the single buffer element on the device.
    q.submit([&](handler &cgh) {
      auto wb = ob.get_access<access::mode::read_write>(cgh);
      cgh.single_task<class k1>([=]() {
        wb[0] += 1;
      });
    });
    auto rb = ob.get_access<access::mode::read>();
    std::cout << rb[0] << "\n";
  } // rb is destroyed here, before the second kernel is submitted.

  {
    // Second kernel: no host accessor is alive at this point.
    q.submit([&](handler &cgh) {
      auto wb = ob.get_access<access::mode::read_write>(cgh);
      cgh.single_task<class k2>([=]() {
        wb[0] += 1;
      });
    });
    auto rb2 = ob.get_access<access::mode::read>();
    std::cout << rb2[0] << "\n";
  }
  return 0;
}