Move thread local entry into Learner. #5396
Extracted from dmlc#5389.
This is an attempt to work around a CUDA context issue with static variables, where the CUDA context can be released before a device vector is destroyed.
* Add PredictionEntry to the thread local entry. This eliminates one copy of the prediction vector.
* Don't define the CUDA C API in a namespace.
Codecov Report
@@ Coverage Diff @@
## master #5396 +/- ##
=======================================
Coverage 84.07% 84.07%
=======================================
Files 11 11
Lines 2411 2411
=======================================
Hits 2027 2027
Misses 384 384
Continue to review full report at Codecov.
@@ -167,6 +184,8 @@ class Learner : public Model, public Configurable, public rabit::Serializable {
  virtual std::vector<std::string> DumpModel(const FeatureMap& fmap,
                                             bool with_stats,
                                             std::string format) const = 0;

  virtual XGBAPIThreadLocalEntry& GetThreadLocal() const = 0;
We could make Learner a concrete class. I don't see us subclassing different learners any time soon.
Actually I don't like Learner, as linear and tree are so different.
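For readers following the `GetThreadLocal()` hunk, here is a minimal sketch of how such an accessor could be backed by a per-thread map keyed on the learner, so that an entry is released when its learner is destroyed rather than at static destruction time. This is an illustration under assumptions, not the code from this PR: `ThreadLocalEntry`, `LearnerImpl`, `Store()` and `LiveEntries()` are hypothetical names, and the entry struct only stands in for `XGBAPIThreadLocalEntry`.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for XGBAPIThreadLocalEntry; the real struct differs.
struct ThreadLocalEntry {
  std::string ret_str;
  std::vector<float> ret_vec_float;
};

class Learner {
 public:
  virtual ~Learner() = default;
  virtual ThreadLocalEntry& GetThreadLocal() const = 0;
};

// Hypothetical concrete learner used only for this sketch.
class LearnerImpl : public Learner {
 public:
  ~LearnerImpl() override {
    // Drop this learner's entry on the destroying thread, so its buffers
    // are freed together with the learner instead of at program exit.
    Store().erase(static_cast<Learner const*>(this));
  }

  ThreadLocalEntry& GetThreadLocal() const override {
    // operator[] creates the entry on first use for the calling thread.
    return Store()[static_cast<Learner const*>(this)];
  }

  // Number of entries held by the calling thread (for inspection only).
  static std::size_t LiveEntries() { return Store().size(); }

 private:
  static std::map<Learner const*, ThreadLocalEntry>& Store() {
    static thread_local std::map<Learner const*, ThreadLocalEntry> store;
    return store;
  }
};
```

One limitation of this sketch: the destructor only erases the entry belonging to the thread that destroys the learner. Entries created by other threads stay in those threads' maps until the threads exit.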
@@ -105,6 +105,17 @@ class Transform {
      return Span<T const> {_vec->ConstHostPointer(),
          static_cast<typename Span<T>::index_type>(_vec->Size())};
    }
    // Recursive sync host
What made you do this? Just curious.
Train hist with cupy data. I modified a test in this PR.
Extracted from #5389.
This is an attempt to work around a CUDA context issue with static variables, where the CUDA context can be released before a device vector is destroyed.
Fix training with GPU data in a multi-threaded environment. Calling HostVector in Transform causes a race condition.
Add PredictionEntry to the thread local entry. This eliminates one copy of the prediction vector.