-
Notifications
You must be signed in to change notification settings - Fork 765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[UR][L0] Fix L0 teardown checks for stability #17818
Conversation
nrspruit
commented
Apr 2, 2025
- Address the race conditions with L0 Loader teardown timing such that L0 teardown is verified before handle destruction in all cases and uses a L0 loader api to verify stability.
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
36f9559
to
c33beab
Compare
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
@intel/bindless-images-reviewers / @intel/sycl-graphs-reviewers please review the minor changes to your code paths. SYCL and oneAPI customers tend to try and release SYCL/L0 objects after atexit has begun, this results in random cases where the memory for L0 will have been released by the OS and we encounter application crash. This change ensures that during resource release we always check to ensure the L0 loader and driver is stable and we don't allow a crash for customers. This utilizes a check added to the Loader to verify the stability of the L0 ecosystem before destroying a handle. NOTE: these changes don't result in memory leaks. When this issue occurs, the OS has already released the memory for these handles. All other times, this check will result in a pass and the release will continue unblocked. |
@intel/llvm-gatekeepers , please let me know if we can move ahead and merge this change. Without this, we have race conditions with various users of SYCL where the application will abort at teardown. |
@intel/bindless-images-reviewers , please review the small changes around image destroy needed to prevent use after free crashes. |
ping @intel/bindless-images-reviewers , please review your section to unblock this review, the changes are small, but this prevents potential crashes during teardown when memory is being freed by SYCL and SYCL users. |
@intel/llvm-gatekeepers please merge when available. |