-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[native] [jvm-packages] allow rebuild prediction cache when it is not initialized #5272
Conversation
@trivialfis would you please help to review? (I will remove those std::cout right before merge) |
@trams the last fix for cp perf is here |
We face a similar issue. My thought is to cache all DMatrix , instead only the ones in constructors. |
it sounds like what I did here, I added a parameter (to avoid breaking current behavior) |
I'm refactoring the prediction cache. If you don't mind me being slow ( I'm working from home with a laptop ) I will take it from here. |
sure, that's awesome! |
take care BTW given the current situation |
Yup. Thanks! |
I saw your PR, yeah, that's kind of must-do, I found my approach here is problematic in terms of the scope of shared_ptr which is very hard to manage with the current architecture |
I expect that PR to simplify the management of cache a lot. Once merged we can refactor the existing interfaces together. I don't mind breaking changes for this caching behavior. For memory usage concern, I plan to utilize weak_ptr in the future for detecting destructed DMatrix objects and hence cleaning up the cache. |
Weak pointer sounds perfect, then we just add everything to the cache without worrying. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for submitting this change!
I like your approach. You just want to recreate the cache if the cache entry is missing.
I am a little bit worried about the implementation but mostly cause I do not really understand 1 line of code (see comments)
P.S. Full disclosure. I am no longer part of Criteo and I can't test this change on their dataset anymore
@@ -202,6 +202,7 @@ class LearnerImpl : public Learner { | |||
tparam_.UpdateAllowUnknown(args); | |||
mparam_.UpdateAllowUnknown(args); | |||
generic_parameters_.UpdateAllowUnknown(args); | |||
std::cout << "all_to_prediction_cache:" << generic_parameters_.adding_all_to_cache << "\n"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just do not forget to remove it before merging
return; | ||
} else { | ||
std::cout << "adding dmatrix to cache\n"; | ||
(*cache_)[dmat].data = static_cast<std::shared_ptr<DMatrix>>(dmat); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused by this line of code. It was quite a while when I last dived into this part of code
(Full disclaimer, I recently switched companies and I am no longer work in Criteo on xgboost :( )
My C++ is a little bit rusty in regard to static_cast in this context so please feel free to correct me
I am reading this https://en.cppreference.com/w/cpp/language/static_cast and I am trying to figure out which case is here.
A constructor of shared_ptr is marked as explicit so it can't be an implicit object creation but I do not understand then what it can be? shared_ptr is not a child of DMatrix.
Are you creating a new shared_ptr from the pointer dmat? If so I strongly suggest against. It usually leads to errors.
Cause I thought about similar fix but I rejected it cause I did not figure out how to retrieve already existing shared_pt from just DMatrix* (cause I want to have the copy of that reference counting pointer)
What I had in mind is changing the interface of Predict to pass shared_ptr in the first place instead of just a raw pointer. That would permit us to recreate a cache
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I consulted with my C++ friends and they pointed to https://stackoverflow.com/questions/32713083/explicit-constructor-and-static-cast
So static_cast here actually will call a constructor and it will create a second separate shared_ptr which won't know about any other shared_ptr's which manage this pointer.
Sooner or later this will result in double free cause each of these smart pointers will try to free the resource.
In the end I suggest to wait @trivialfis refactoring. It should enable you (or us) to solve this problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah that's why the current unit test broken
@alois-bissuel, |
@CodingCat Could you please help taking a look into #5220 ? |
@CodingCat I believe the issue is resolved in #5220 as it caches all DMatrix. Can you test it out? |
the major purpose is to fix the performance issue when training based on a checkpoint
We need to call prediction in 3 places, UpdateOneIter for residual, evaluation and update prediction cache. Because of the existence of pred cache, we only really perform prediction when update prediction cache and leverage the produced results in the other two places.
the performance degradation comes from the fact that when we train based on a checkpoint we lost the prediction cache. As s result, we will call in UpdateOneIter and EvalOneIter (since we lost cache, we will not call when updating cache)
this PR fixes the issue by adding DMatrix to cache when cache has been lost (this behavior only happens in JVM and only when checkpoint is set)
relevant issues
#3946
#4786
#4774