Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix the error handling and reporting #35

Closed
diptorupd opened this issue Sep 17, 2020 · 5 comments
Closed

Fix the error handling and reporting #35

diptorupd opened this issue Sep 17, 2020 · 5 comments
Assignees
Labels
enhancement New feature or request

Comments

@diptorupd
Copy link
Contributor

diptorupd commented Sep 17, 2020

The current dpctl error handling broke once we transitioned from the C++ API to the C API. The error reporting and handling in dpctl needs complete redesign.

@diptorupd diptorupd self-assigned this Sep 17, 2020
@diptorupd diptorupd mentioned this issue Sep 18, 2020
@diptorupd
Copy link
Contributor Author

We have the except + added to all of our Cython functions. We will need to change all the Cython cdefs and most probably add checks inside the Cython calls whereever we call these functions.

@diptorupd
Copy link
Contributor Author

@PokhodenkoSA @1e-to

There are two broad design goals for error handling for dpCtl:

  1. dpCtl will never raise any exceptions and by design should return a "logical" value back to caller to notify error. E.g. Wherever a dpCtl function returns a pointer, the function should return a nullptr if a valid pointer cannot be returned. The caller needs to be aware of this behavior and add appropriate NULL checks. Possible patterns can be:
Return Type Returned Value
Pointer Nullptr
size_t/int -1
boolean ?
void ?
  • We should add needed try-catch blocks for Sycl functions that can throw.
  • Add Null checks during unwrap operations.
  1. Once basic error handling has been implemented, we need to design the error reporter. I can think of two options:
  • Let callers pass in a error handler call back function (like Sycl’s asynchronous error handler) that can be invoked when dpCtl wants to report an error.
  • Or, we use a global error variable that gets set by every dpCtl function. The error variable can store an error code and error message and can be queried by the caller.

Our solution can be a combination of both approaches?

@diptorupd diptorupd added this to the 1.0 milestone Feb 20, 2021
@oleksandr-pavlyk
Copy link
Contributor

oleksandr-pavlyk commented Mar 1, 2021

All DPCTL C-api functions change to have result_t type argument err passed by reference.

It is set to 0 (OK enum) for normal execution, or to a error-signaling enum otherwise.

Additionally, the DPCLSyclInterface library has global thread-local function to pointer to void (*handler)(int code, const char * c_str) which default to &default_handler:

void default_handler(int code, const char *message) {
    std::cerr << message << std::endl;
    // or fprintf(stderr, "%s\n", message);
}

When used by Python, the handler can be used to call PyErr_SetObject or PyErr_SetString to raise an exception. For example, following IntelPython/dpnp#201:

void python_handler(int code, const char * message) {
   if (PyErr_Occurred()) // if Python error has already been raised, do nothing
         return;
   switch(code) {
   case STD_BAD_ALLOC:
        PyErr_SetString(PyExc_MemoryError, message);
        break;
   case STD_BAD_CAST:
        PyErr_SetString(PyExc_TypeError, message);
        break;
   case SYCL_RUNTIME_ERROR:
        // could use PyErr_SetObject to pass both code and message to the exception constructor
        PyErr_SetString(PyExc_SyclError, message);
        break;
   ....
   default:
        break;
   } 
  return;
}

Cython functions calling DPCTL library will check for err, and check Python error signaling to drill into the details.

Also, all Cython cdef functions should have signature cdef func(...) except *: body per Cython docs.

The error handler function pointer will be used from within SYCL asynchronous error handler, used by DPCTL in SYCL context and SYCL queue constructors.

@PokhodenkoSA @shssf @Alexander-Makaryev @diptorupd @reazulhoque

@oleksandr-pavlyk
Copy link
Contributor

Since DPCTLSyclInterface is a shared library, it should not use global structures (such as global function pointers) to enable two different processes using it from stepping on each other's toes.

@diptorupd pointed out that MPI solves this problem by keeping/passing around a library state (communicator).

Translating this solution to dpctl amounts to modifying all function calls to take a struct representing such a state (which would store the error handler). In all the places where library handles errors, the error handler from the struct would be called).

@diptorupd
Copy link
Contributor Author

As the error handling infrastructure has been totally overhauled in libsyclinterface, the issue is ready to be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants