
[dask] Include support for raw_score in predict (fixes #3793) #4024

Merged · 10 commits merged into microsoft:master · Mar 27, 2021

Conversation

jmoralez (Collaborator, PR author):

This attempts to solve #3793 by testing the predict method of DaskLGBMRegressor and DaskLGBMClassifier with raw_score=True. The tests for DaskLGBMClassifier cover both binary and multi-class classification, since predicting with predict_proba=True and raw_score=True for binary classification returns a 1d array/pandas Series, as opposed to the multi-class variants (which return as many columns as there are classes).

I found that using drop_axis=1 for dask.array.map_blocks works every time and avoids many if statements testing the conditions described above, but I'm happy to discuss the best way to do this. I also replaced the decision to return a pd.Series or pd.DataFrame with a check of the shape of the input array, which I think should be discussed as well.

jameslamb (Collaborator) left a comment:

Thanks for taking this on! Please see my first round of comments

python-package/lightgbm/dask.py (review thread resolved)
@@ -1255,3 +1255,59 @@ def test_sklearn_integration(estimator, check, client):
def test_parameters_default_constructible(estimator):
    name, Estimator = estimator.__class__.__name__, estimator.__class__
    sklearn_checks.check_parameters_default_constructible(name, Estimator)


@pytest.mark.parametrize('task', ['binary_classification', 'multi-class_classification', 'regression'])
jameslamb (Collaborator):

Can you please add ranking tests as well?

jameslamb (Collaborator):

actually this change also makes me realize that the other tests using @pytest.mark.parametrize('task', tasks) are not currently testing multi-class classification.

Are you interested in submitting a separate PR to add multi-class-classification to tasks?

tasks = ['classification', 'regression', 'ranking']

Doing that would improve our test coverage AND reduce the risk of mistakes like this, where a task is forgotten.

jmoralez (Collaborator, author):

Sure. Should this modify the _create_data function to add an objective of multi-class-classification?

jameslamb (Collaborator):

yes please! based on this discussion: #4024 (comment)

when you do that, I think it would make sense to remove the centers argument from that function, and just use the objective to set centers=2 or centers=3. We don't have any tests right now that use more than 3 classes or that really care about the exact numeric values of the cluster centers coming out of make_blobs().

jmoralez (Collaborator, author):

Hi James. I'm having trouble with the tests for ranking. The check for the expected mean works because if you have 25 samples in one leaf with value 0 and 75 samples in another leaf with value 1, you expect the mean to be (25 * 0 + 75 * 1) / (25 + 75), right? Which I guess would be roughly equivalent to checking that there are 25 predictions with the value 0 and 75 with the value 1. But for ranking I'm seeing that these don't match:
[screenshot of the mismatch omitted]

jmoralez (Collaborator, author):

Or I could compare the count column directly against the unique counts of the predictions. It's not very clear to me why we don't use that here; why are we using the weights?

jameslamb (Collaborator):

You know what, now that I think about it... it would be enough to compare the unique values of the leaf nodes to the unique values of the preds. If those are the same, you know you got the raw scores.

jmoralez (Collaborator, author):

So the tests would check set(raw_predictions) == set(trees_df['value'])?

jameslamb (Collaborator):

yes

jmoralez (Collaborator, author):

Done. Please let me know what you think.

tests/python_package_test/test_dask.py (several outdated review threads resolved)
jameslamb self-requested a review on March 18, 2021.
jameslamb (Collaborator) left a comment:

Looks great to me, thanks very much! I'm going to wait to merge this until after LightGBM 3.2.0 is released (#3872), assuming that happens in the next day or two.

jameslamb changed the title from "[dask] Include support for raw_score in predict" to "[dask] Include support for raw_score in predict (fixes #3793)" on Mar 18, 2021.
jmoralez (Collaborator, author):

Sounds good. What would be a good next contribution? I'd like to work on the dataset but I'm happy to contribute something else if you consider it more important

jameslamb (Collaborator):

> Sounds good. What would be a good next contribution? I'd like to work on the dataset but I'm happy to contribute something else if you consider it more important

Thanks very much! Could you try adding tests for voting_parallel learning, as part of #3834? I added some notes on how to get started at #3834 (comment).

I don't want to start on DaskLGBMDataset yet (#3944), since I think that will take a while and will probably conflict with @ffineis's work on adding eval sets (#3952 (comment)).

jmoralez (Collaborator, author):

Wow, I hadn't seen voting parallel learning; it looks awesome. I will definitely work on that, and I hope it doesn't suffer from what I've seen in #4026.

StrikerRUS (Collaborator):

@jmoralez Could you please sync with the latest master branch?

StrikerRUS merged commit fe1b80a into microsoft:master on Mar 27, 2021.
jmoralez deleted the raw_score branch on March 27, 2021.