Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move thread local entry into Learner. #5396

Merged
merged 11 commits into from
Mar 7, 2020

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Mar 6, 2020

Extracted from #5389 .

  • This is an attempt to workaround CUDA context issue in static variable, where
    the CUDA context can be released before device vector.

  • Fix training with GPU data on multi-threaded environment. Calling HostVector in Transform causes race condition.

  • Add PredictionEntry to thread local entry.

This eliminates one copy of prediction vector.

  • Don't define CUDA C API in a namespace.

Extracted from dmlc#5389 .

This is an attempt to workaround CUDA context issue in static variable, where
the CUDA context can be released before device vector.

* Add PredictionEntry to thread local entry.

This eliminates one copy of prediction vector.

* Don't define CUDA C API in a namespace.
@trivialfis trivialfis closed this Mar 6, 2020
@trivialfis trivialfis reopened this Mar 6, 2020
@trivialfis trivialfis changed the title Global return buffer Move thread local entry into Learner. Mar 6, 2020
@mli
Copy link
Member

mli commented Mar 6, 2020

Codecov Report

Merging #5396 into master will not change coverage by %.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #5396   +/-   ##
=======================================
  Coverage   84.07%   84.07%           
=======================================
  Files          11       11           
  Lines        2411     2411           
=======================================
  Hits         2027     2027           
  Misses        384      384           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8d06878...f23b9c2. Read the comment docs.

@@ -167,6 +184,8 @@ class Learner : public Model, public Configurable, public rabit::Serializable {
virtual std::vector<std::string> DumpModel(const FeatureMap& fmap,
bool with_stats,
std::string format) const = 0;

virtual XGBAPIThreadLocalEntry& GetThreadLocal() const = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make Learner a concrete class. I don't see us subclassing different learners any time soon.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I don't like Learner, as linear and tree are so different.

@@ -105,6 +105,17 @@ class Transform {
return Span<T const> {_vec->ConstHostPointer(),
static_cast<typename Span<T>::index_type>(_vec->Size())};
}
// Recursive sync host
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What made you do this? Just curious.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Train hist with cupy data. I modified a test in this PR.

@trivialfis trivialfis merged commit 0dd97c2 into dmlc:master Mar 7, 2020
@trivialfis trivialfis deleted the global-return-buffer branch March 7, 2020 07:38
@trivialfis trivialfis mentioned this pull request Apr 21, 2020
12 tasks
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants