RF: fix tests #725
Conversation
daal4py/sklearn/ensemble/_forest.py
Outdated
                          dtype=[np.float64, np.float32])
        if not hasattr(self, 'daal_model_') or \
                sp.issparse(X) or self.n_outputs_ != 1 or \
                not daal_check_version((2021, 'P', 200)):
daal4py/sklearn/ensemble/_forest.py
Outdated
            (f'X has {X.shape[1]} features, '
             f'but RandomForestClassifier is expecting '
             f'{self.n_features_in_} features as input'))
        X = check_array(X, accept_sparse=['csr', 'csc', 'coo'],
Move this after the check for whether daal is used; otherwise it will do unnecessary validation (see the sketch below).
Ok
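A minimal sketch of the suggested ordering, assuming a hypothetical _daal_predict helper (the real daal4py method is more involved): the cheap applicability check runs first, and check_array is only paid for on the daal branch.

# Sketch only: the validation is moved after the applicability check;
# _daal_predict is a hypothetical stand-in for the oneDAL prediction path.
import numpy as np
import scipy.sparse as sp
from sklearn.ensemble import RandomForestClassifier as _SklRandomForestClassifier
from sklearn.utils import check_array

from daal4py.sklearn._utils import daal_check_version


class RandomForestClassifier(_SklRandomForestClassifier):
    def predict(self, X):
        # Cheap applicability check first: fall back to stock scikit-learn
        # when there is no daal model, X is sparse, the estimator has
        # multiple outputs, or the installed oneDAL is too old.
        if not hasattr(self, 'daal_model_') or \
                sp.issparse(X) or self.n_outputs_ != 1 or \
                not daal_check_version((2021, 'P', 200)):
            return super().predict(X)

        # Validation now runs only on the daal path.
        X = check_array(X, accept_sparse=['csr', 'csc', 'coo'],
                        dtype=[np.float64, np.float32])
        return self._daal_predict(X)  # hypothetical oneDAL prediction call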
@@ -36,4 +36,4 @@ def test_sklearnex_import_rf_regression():
                            random_state=0, shuffle=False)
     rf = RandomForestRegressor(max_depth=2, random_state=0).fit(X, y)
     assert 'daal4py' in rf.__module__
-    assert_allclose([-6.66], rf.predict([[0, 0, 0, 0]]), atol=1e-2)
+    assert_allclose([-6.97], rf.predict([[0, 0, 0, 0]]), atol=1e-2)
What is the value in stock scikit-learn?
In stock scikit-learn it's near -8.32; the new implementation is close to that.
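A hypothetical way to reproduce that stock number is to fit unpatched scikit-learn on the same data; only random_state=0, shuffle=False are confirmed by the diff above, the other make_regression arguments are assumptions.

# Assumed reproduction of the stock value mentioned in the reply above.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# n_features/n_informative are guesses, not taken from this PR.
X, y = make_regression(n_features=4, n_informative=2,
                       random_state=0, shuffle=False)
rf = RandomForestRegressor(max_depth=2, random_state=0).fit(X, y)
print(rf.predict([[0, 0, 0, 0]]))  # ~ -8.32 with stock scikit-learn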
@@ -31,9 +31,9 @@
         import RandomForestRegressor as DaalRandomForestRegressor
 from daal4py.sklearn._utils import daal_check_version

-ACCURACY_RATIO = 0.85
+ACCURACY_RATIO = 0.95 if daal_check_version((2021, 'P', 400)) else 0.85
Do I understand correctly that the accuracy got worse?
No, accuracy and log loss in the new implementation are better.
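For context, a sketch of how the version-gated ratio from the diff above is typically consumed; check_accuracy is an illustrative helper, not the actual test code.

# Illustrative only: the diff raises the bar on newer oneDAL builds.
from daal4py.sklearn._utils import daal_check_version

ACCURACY_RATIO = 0.95 if daal_check_version((2021, 'P', 400)) else 0.85


def check_accuracy(daal_score, stock_score):
    # The daal4py estimator must reach at least ACCURACY_RATIO of the
    # score the stock scikit-learn estimator gets on the same data.
    assert daal_score >= ACCURACY_RATIO * stock_score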
I also checked the MSE in the regression example (the prediction is saved to decision_forest_regression_batch.csv): it used to be ~17.65, and in the new implementation it's ~17.26, so the error has decreased.
CI might be bound to this PR; we don't have to run CI twice.
Force-pushed from 0f707b3 to c2fde1e
deselected_tests.yaml
Outdated
# predict_proba
- ensemble/tests/test_forest.py::test_probability[RandomForestClassifier]
- ensemble/tests/test_forest.py::test_parallel_train
- ensemble/tests/test_stacking.py::test_stacking_classifier_iris
- ensemble/tests/test_stacking.py::test_stacking_classifier_drop_column_binary_classification
- ensemble/tests/test_stacking.py::test_stacking_classifier_drop_estimator
- ensemble/tests/test_stacking.py::test_stacking_classifier_drop_binary_prob
- ensemble/tests/test_voting.py::test_weights_iris
- ensemble/tests/test_voting.py::test_predict_on_toy_problem
- ensemble/tests/test_voting.py::test_predict_proba_on_toy_problem
- ensemble/tests/test_voting.py::test_gridsearch
- ensemble/tests/test_voting.py::test_voting_classifier_set_params
- ensemble/tests/test_voting.py::test_set_estimator_drop
- ensemble/tests/test_voting.py::test_estimator_weights_format
- ensemble/tests/test_voting.py::test_transform
- inspection/tests/test_permutation_importance.py::test_permutation_importance_equivalence_sequential_parallel
- tests/test_calibration.py
- tests/test_common.py::test_estimators
- tests/test_multioutput.py::test_multi_output_classification
Do these tests fail just because of predict_proba usage?
Is it clear what is broken in oneDAL?
I'm afraid we broke a large number of tests that were passing previously.
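One hypothetical way to answer that locally is to run a single deselected test against the patched estimators, assuming sklearnex's patch_sklearn is available and scikit-learn's bundled tests are installed:

# Hypothetical local reproduction of one deselected predict_proba failure.
import pytest
from sklearnex import patch_sklearn

patch_sklearn()  # route RandomForest* through daal4py/oneDAL

# Select the parametrized test named in the deselection list above.
pytest.main(["--pyargs", "sklearn.ensemble.tests.test_forest",
             "-k", "test_probability and RandomForestClassifier"])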
@Mergifyio rebase