-
-
Notifications
You must be signed in to change notification settings - Fork 8.8k
Cleanup linalg kernels. #11802
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup linalg kernels. #11802
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the linalg kernel infrastructure by splitting transformation kernels into more specialized functions and moving device dispatching logic from client code into the linalg_op.h header. The refactoring also leverages thrust transform for better memory I/O performance in CUDA kernels.
Key changes:
- Split
ElementWiseTransformHost/DeviceintoTransformIdxKernelandTransformKernelwith clearer semantics - Centralized device dispatching in
linalg_op.husingContext::DispatchDevice - Implemented thrust-based transform for contiguous memory in CUDA kernels
Reviewed Changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/common/linalg_op.h | Added new dispatching functions (ElementWiseKernel, TransformIdxKernel, TransformKernel) with device-agnostic interfaces and cpu_impl namespace |
| src/common/linalg_op.cuh | Refactored CUDA implementations to use thrust transform for contiguous tensors and moved functions to cuda_impl namespace |
| src/common/linalg_op.cu | Removed VecScaMul function, now handled by generic TransformKernel |
| tests/cpp/test_learner.cc | Updated to use new cpu_impl::TransformIdxKernel API |
| tests/cpp/data/test_metainfo.h | Updated to use new cpu_impl::ElementWiseKernel API |
| tests/cpp/common/test_stats.cu | Updated to use new cuda_impl::TransformIdxKernel API |
| tests/cpp/common/test_linalg.cu | Updated to use new cuda_impl::TransformIdxKernel API and changed include from .cuh to .h |
| tests/cpp/common/test_linalg.cc | Updated to use new cpu_impl::TransformIdxKernel API |
| src/objective/*.cu | Removed unnecessary conditional includes for linalg_op.cuh and SYCL linalg_op.h |
| src/data/data.cu | Changed parameter type from CUDAContext const* to Context const* and updated to use new kernel APIs |
| src/data/data.cc | Updated to use new cpu_impl::TransformIdxKernel API |
| plugin/sycl/tree/*.{h,cc} | Added ::xgboost::linalg:: namespace qualification for better disambiguation |
| plugin/sycl/common/linalg_op.h | Removed redundant dispatching function that's now in main linalg_op.h |
| plugin/sycl/common/linalg_op.cc | Removed VecScaMul implementation, now handled by generic TransformKernel |
Comments suppressed due to low confidence (1)
src/common/linalg_op.h:1
- The documentation for
TransformIdxKernelis missing thectxparameter. It should document@param ctx Context for device dispatching and thread management.
/**
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Split up two different transform kernels. - Handle the dispatching in the header. - Use thrust transform for transform kernels. Helps with memory IO.
21364d3 to
1e60e1a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
cc @rongou |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
linalg_op.hheader instead of the client code.The objective is to handle more linalg related operations like the ones used in #11786 .