[Extension] Improve Interoperability with other asynchronous libraries #181

Open
illuhad opened this issue Dec 17, 2019 · 4 comments
Labels
discussion General discussion about something extension design

Comments

@illuhad
Collaborator

illuhad commented Dec 17, 2019

Based on a Twitter discussion with @ax3l, we should investigate improving the interoperability of SYCL implementations with other asynchronous libraries (such as MPI or an async I/O library).
This issue serves to track possible approaches and to discuss the specific requirements for users, so feedback is appreciated :)

At the moment I am thinking in particular of two new features:

  1. Specify command group dependencies on some external events
  2. Add a callback mechanism that triggers as soon as the SYCL implementation notices that a task is complete. (Note: depending on the task submission model of the SYCL implementation, the callback may not be triggered right after the task completes; for example, in a batched submission model the SYCL runtime may only notice that tasks are done once the entire batch has completed.)

1. Specifying command group dependencies on some external events

I think external events could best be implemented on top of the explicit synchronization mechanism from the Intel USM proposal, which introduces sycl::handler::depends_on(sycl::event evt), e.g.:

q.submit([&](cl::sycl::handler& cgh){
  cgh.depends_on(some_sycl_event);
  cgh.parallel_for(...);
});

In analogy, we could introduce an overload sycl::handler::depends_on(sycl::external_event evt), with a new class external_event:

class external_event
{
public:
  external_event();
  // The user can optionally provide a function that the SYCL implementation will use
  // to test the state of the event. If this function is not provided, the event can only
  // complete when `signal_completion()` is called.
  // As soon as either the test_state function returns true or signal_completion()
  // is called, the SYCL runtime will consider the event complete.
  // The test_state function will not be invoked again after it has returned true for the first time.
  external_event(std::function<bool ()> test_state);

  void wait();

  // When called, signals to the SYCL runtime that this event has completed.
  void signal_completion();

  // ... plus remaining functions that are present in sycl::event for consistency
};
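To make the intended semantics a bit more concrete, here is a minimal sketch in plain standard C++ (no SYCL involved). Everything in it is an assumption for illustration: the class name `external_event_sketch`, the `is_complete()` query that a runtime might use, and the 1 ms polling interval in `wait()` are not part of the proposal. It only shows how a completion flag can back both `signal_completion()` and the optional test function, and why the test function is never queried again after it has returned true.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the proposed sycl::external_event.
class external_event_sketch {
public:
  external_event_sketch() = default;

  explicit external_event_sketch(std::function<bool()> test_state)
      : test_state_(std::move(test_state)) {}

  // User-facing: marks the event as complete and wakes up waiters.
  void signal_completion() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      complete_ = true;
    }
    cv_.notify_all();
  }

  // What a runtime might call while scheduling (hypothetical name).
  bool is_complete() {
    std::lock_guard<std::mutex> lock(mutex_);
    return poll_locked();
  }

  // Blocks until the event is complete, polling test_state_ if present.
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!poll_locked()) {
      cv_.wait_for(lock, std::chrono::milliseconds(1));
    }
  }

private:
  // Must be called with mutex_ held. The completion flag is checked first,
  // so test_state_ is never invoked again once it has returned true.
  bool poll_locked() {
    if (!complete_ && test_state_ && test_state_()) {
      complete_ = true;
      cv_.notify_all();
    }
    return complete_;
  }

  std::function<bool()> test_state_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool complete_ = false;
};

// Small usage check: the test function succeeds on its third call and is
// not queried afterwards.
inline int count_test_calls_until_complete() {
  int calls = 0;
  external_event_sketch evt([&calls]() { return ++calls >= 3; });
  evt.wait();
  assert(evt.is_complete());
  evt.wait();  // returns immediately, no further test_state calls
  return calls;
}
```

Note that this sketch invokes the test function while holding the mutex, so a test function that itself calls `signal_completion()` on the same event would deadlock here. A real implementation would presumably also make the event copyable with shared state, like sycl::event.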

2. Add callback mechanisms

Probably the most consistent approach with SYCL would be to add a function handler::callback(std::function<void ()>) that specifies callbacks for a given command group handler. Combined with external events, this could look like this:

sycl::external_event evt([]() -> bool {
  return is_my_event_done();
});
q.submit([&](sycl::handler& cgh) {
  cgh.depends_on(evt);
  cgh.callback([q]() mutable {
    // Will be called once this command group is done. Could do some additional submits:
    q.submit(...);
  });
  cgh.parallel_for(...);
});
  • Semantically, a callback is similar to a single_task kernel running on the host device that depends on the given kernel. However, a callback shall be executed by the SYCL implementation at its earliest convenience, whereas a regular kernel may be delayed in execution (e.g., because of kernel reordering for better overlap of compute and data transfers, or because the runtime waits for more kernels for a batched submission).
  • User data can be used in the callback lambda by simply capturing whatever is needed, so its signature requires no parameters.
  • Whether the execution of the callback blocks the device on which its kernel was executed is implementation-defined (open question: we could also specify that it never blocks, but this would probably require more implementation effort).
  • As for depends_on(), the location of the call to callback() inside the command group is irrelevant.
  • It is allowed to submit other SYCL kernels inside the callback.
  • When there are several callback() calls inside a command group, all of them will be invoked, even if some of them are called with the same callback lambda or function object. The order in which they are executed is implementation-defined and not guaranteed to be deterministic.
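As a rough illustration of the rules above, here is a toy model in plain C++ (no SYCL). The names `handler_sketch` and `submit_sketch` are hypothetical, and the "runtime" simply runs the kernel synchronously on the host. It only demonstrates two of the points: the position of callback() inside the command group does not matter, and registering the same callable twice leads to two invocations. This toy happens to run callbacks in registration order, which the proposal would not guarantee.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical toy model of a command group handler; not a SYCL API.
class handler_sketch {
public:
  // Only records the callback; nothing is executed at this point.
  void callback(std::function<void()> f) { callbacks_.push_back(std::move(f)); }

  // Only records the kernel; nothing is executed at this point.
  void single_task(std::function<void()> kernel) { kernel_ = std::move(kernel); }

  // What a runtime might do later: run the kernel, then every callback.
  void run() {
    if (kernel_) kernel_();
    for (auto& cb : callbacks_) cb();
  }

private:
  std::function<void()> kernel_;
  std::vector<std::function<void()>> callbacks_;
};

template <class CommandGroup>
void submit_sketch(CommandGroup&& cg) {
  handler_sketch cgh;
  cg(cgh);    // evaluates the command group: records kernel and callbacks
  cgh.run();  // executes the kernel first, the callbacks afterwards
}

// Logs 0 for the kernel and 1 for each callback invocation.
inline std::vector<int> demo_callback_order() {
  std::vector<int> log;
  auto cb = [&log]() { log.push_back(1); };
  submit_sketch([&](handler_sketch& cgh) {
    cgh.callback(cb);  // registered before the kernel, still runs after it
    cgh.single_task([&log]() { log.push_back(0); });
    cgh.callback(cb);  // same callable again: invoked a second time
  });
  assert(log.size() == 3);
  return log;
}
```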
@illuhad illuhad added the discussion label Dec 17, 2019
@ax3l
Contributor

ax3l commented Dec 17, 2019

MPI communication and long-running I/O routines are exactly what I face every day, yes!

Just a minor comment that probably works with this proposal: for the user-provided test function, there are some cases such as MPI_Test functions that can be queried an arbitrary number of times until they return success for the first time (and they must not be queried again after that). One can probably express this somehow as user-defined state (a static or member variable?) when implementing this as a test_state function? Just want to mention this slightly odd use case.
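For illustration, such user-defined state could look roughly like the sketch below. It is plain C++ and does not call MPI; `fake_request` is a hypothetical stand-in for a request that must not be tested again after its first success, and `make_latched_test` is a hypothetical helper name. The wrapper remembers the first success in shared state and never forwards to the wrapped function again. It is not thread-safe as written.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Wraps a test function so that the wrapped function is never called again
// after its first success. The state is shared between copies of the wrapper.
inline std::function<bool()> make_latched_test(std::function<bool()> test) {
  auto done = std::make_shared<bool>(false);
  return [done, test = std::move(test)]() -> bool {
    if (!*done) *done = test();
    return *done;
  };
}

// Hypothetical stand-in for a request with MPI_Test-like restrictions:
// testing it again after it has reported success is treated as an error.
struct fake_request {
  int remaining = 2;      // number of unsuccessful tests before completion
  bool finished = false;

  bool test() {
    assert(!finished && "must not be tested again after success");
    if (remaining > 0) {
      --remaining;
      return false;
    }
    finished = true;
    return true;
  }
};
```

The wrapped function could then be passed wherever a test_state function is expected, e.g. `make_latched_test([&req]() { return req.test(); })`.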

@illuhad
Collaborator Author

illuhad commented Dec 17, 2019

Thanks for the hint! I would expect that an implementation of external_event will typically contain a bool variable that specifies if the event has completed in order to support both the signal_completion() function as well as the user-provided test function. Under this assumption, the test function would probably not be called again anyway in typical implementations if the bool variable already indicates completion, which would be the behavior that you require.
In other words, since typical implementations of external_event would already work with such a restriction, I think we might as well just guarantee the user that the test function will not be called again after it returns success.
Edit: Original post with the definition of external_event now also guarantees this behavior.

@ax3l
Contributor

ax3l commented Dec 17, 2019

That's a good constraint, thanks!

@illuhad
Collaborator Author

illuhad commented Dec 23, 2019

Added a concept for callbacks.

@illuhad illuhad changed the title from "Improve Interoperability with other asynchronous libraries" to "[Extension] Improve Interoperability with other asynchronous libraries" Jul 9, 2020

2 participants