RF: support filtering of const features #1725
dadb000 to 252980c
/intelci: run
/intelci: restart
/intelci: run
The daal4py random forest examples are failing in both private and public CI.
@@ -492,7 +492,8 @@ protected:
           _minSamplesSplit(2),
           _minWeightLeaf(0.),
           _minImpurityDecrease(-daal::services::internal::EpsilonVal<algorithmFPType>::get() * x->getNumberOfRows()),
-          _maxLeafNodes(0)
+          _maxLeafNodes(0),
+          _useConstFeatures(false)
Why do you need this variable? It is never changed anywhere.
I suggest keeping this variable: in some cases it may be preferable to use the previous approach. It is indeed not changed anywhere in this PR.
_aConstFeatureIdx.reset(maxFeatures * 2); // first maxFeatures elements are used for saving indices of constant features,
                                          // the other part are used for saving levels of this features
DAAL_CHECK_MALLOC(_aConstFeatureIdx.get());
PRAGMA_IVDEP
Use memset for _aConstFeatureIdx:
services::service_memset_seq<IndexType, cpu>(_aConstFeatureIdx, 0, maxFeatures * 2);
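As a rough illustration of this suggestion, the sketch below contrasts an element-by-element zeroing loop with a single bulk fill. It does not reproduce `service_memset_seq`; `std::memset` stands in for it, and `IndexType`, `zeroWithLoop`, and `zeroWithMemset` are hypothetical names chosen for the example, with `IndexType` assumed to be a plain integer type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Assumption: the index type is a plain integer; the real typedef may differ.
using IndexType = int;

// Element-by-element zeroing, the kind of loop the review asks to replace.
inline void zeroWithLoop(IndexType * buf, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) buf[i] = 0;
}

// One bulk call instead of a loop. This is valid because IndexType is
// trivially copyable and an all-zero byte pattern represents the value 0.
inline void zeroWithMemset(IndexType * buf, std::size_t n)
{
    assert(buf != nullptr || n == 0);
    if (n > 0) std::memset(buf, 0, n * sizeof(IndexType));
}
```

Both functions leave the buffer in the same state; the bulk version simply states the intent in one call and leaves the choice of fill strategy to the library routine.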
Agreed, thanks. This has been fixed.
if (_hostApp.isCancelled(s, n)) return nullptr;

if (!_par.memorySavingMode && !_useConstFeatures)
Do I understand correctly that this code runs for every tree? Why can't we compute this once at the beginning, here for example?
This trick of checking "non-const" features works effectively only for the "depth-first" tree-building strategy, which is why it was placed here.
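To make the depth-first argument concrete, here is a minimal sketch of one way such tracking could work. It is not the PR's implementation. It borrows only the buffer layout from the code comment earlier in this thread (first `maxFeatures` slots for feature indices, second `maxFeatures` slots for the levels at which they were found constant). The class `ConstFeatureStack` and its methods are hypothetical. The idea is that a feature found constant at a node stays constant in every descendant, so in depth-first order the entries can be kept on a stack and unwound when the build moves to a sibling branch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tracker; names are illustrative and not taken from oneDAL.
class ConstFeatureStack
{
public:
    explicit ConstFeatureStack(std::size_t maxFeatures)
        : _buf(maxFeatures * 2, 0), _maxFeatures(maxFeatures), _count(0)
    {}

    // Record that `feature` was found constant at a node on tree level `level`.
    void push(std::size_t feature, std::size_t level)
    {
        assert(_count < _maxFeatures);
        _buf[_count]                = feature; // first half: feature indices
        _buf[_maxFeatures + _count] = level;   // second half: levels
        ++_count;
    }

    // Before processing a node on `level`, drop entries recorded at that
    // level or deeper: they belong to an already finished sibling subtree.
    void unwindTo(std::size_t level)
    {
        while (_count > 0 && _buf[_maxFeatures + _count - 1] >= level) --_count;
    }

    // True if `feature` is known to be constant on the current path.
    bool isConst(std::size_t feature) const
    {
        for (std::size_t i = 0; i < _count; ++i)
            if (_buf[i] == feature) return true;
        return false;
    }

    std::size_t size() const { return _count; }

private:
    std::vector<std::size_t> _buf;
    std::size_t _maxFeatures;
    std::size_t _count;
};
```

With a breadth-first build, nodes from different branches are interleaved on the same level, so a single stack like this cannot describe the current path. That is consistent with the reply that the trick is effective only for depth-first construction.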
252980c to 23f8c80
/intelci: restart
@Mergifyio rebase
23f8c80 to d6c0661
/intelci: run
@Mergifyio rebase
/intelci: restart
Accuracy/MSE and performance were measured and compared on several datasets. The accuracy score was also compared on an AutoGluon dataset.