Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always respect forced splits, even when feature_fraction < 1.0 (fixes #4601) #4725

Merged
merged 17 commits into from
Nov 10, 2021

Conversation

tongwu-sh
Copy link
Contributor

@tongwu-sh tongwu-sh commented Oct 27, 2021

Fixes #4601.

Root cause for this issue: The histogram for the force split feature might not construct if feature_fraction < 1.0. And it would failed at SplitInner.

  • Only col-wise histogram construction impacted. row-wise histogram construction would go through all features.
  • Only force split settings gain > 0 impacted. If no gain for force_split leaf, the force_split settings would be auto ignored.

@jameslamb jameslamb changed the title Fix issue 4601 Always respect forced splits, even when feature_fraction < 1.0 (fixes #4601) Oct 27, 2021
@jameslamb jameslamb self-requested a review October 27, 2021 22:29
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for working on this! Please see my initial comments below.

Also, I changed the title of this pull request. In this project, pull request titles become items in the release notes (see, for example, https://github.com/microsoft/LightGBM/releases/tag/v3.3.0), and I don't think "fix issue 4601" would be very informative for a user reading the release notes to understand what has changed.

I'll try to test this as soon as possible using the reproducible example provided in #4601.

src/treelearner/serial_tree_learner.cpp Outdated Show resolved Hide resolved
tests/python_package_test/test_engine.py Show resolved Hide resolved
tests/python_package_test/test_engine.py Show resolved Hide resolved
@StrikerRUS StrikerRUS added the fix label Oct 28, 2021
Copy link
Collaborator

@shiyu1994 shiyu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change in the cpp side LGTM. Just a suggestion in the test case. Thanks!

tests/python_package_test/test_engine.py Outdated Show resolved Hide resolved
@tongwu-sh tongwu-sh requested a review from hzy46 as a code owner November 1, 2021 02:33
@tongwu-sh tongwu-sh requested a review from StrikerRUS November 1, 2021 02:34
Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes. Please see a few suggestions I left about the tests. I think it's important to test with force_col_wise=true and force_row_wise=true to be confident that this fixes #4601.

tests/python_package_test/test_engine.py Outdated Show resolved Hide resolved
@tongwu-sh tongwu-sh requested a review from jameslamb November 1, 2021 05:01
Copy link
Collaborator

@StrikerRUS StrikerRUS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you very much for working on this!
I left only one suggestion for your consideration to make the test more strict.

assert len(tree_info) > 1
for tree in tree_info:
tree_structure = tree["tree_structure"]
assert tree_structure['split_feature'] == 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert tree_structure['split_feature'] == 0
assert tree_structure['split_feature'] == 0
assert tree_structure['threshold'] == pytest.approx(0.5, abs=1e-1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the actual threshold is the nearest record is about 0.52..

@shiyu1994
Copy link
Collaborator

Thanks for the changes. Please see a few suggestions I left about the tests. I think it's important to test with force_col_wise=true and force_row_wise=true to be confident that this fixes #4601.

@jameslamb Hi James, we found that when force_row_wise=true, the test case fails due to other problem, which is not related to using forced splits with feature_fraction < 1.0. It is probably due to the same problem as in issues #3679 #4739.

Can we leave this in subsequent PRs to fix, and merge this for now, given that all tests are passed?

@jameslamb
Copy link
Collaborator

Hi James, we found that when force_row_wise=true, the test case fails due to other problem

@shiyu1994 just to be sure I understand, which of these do you mean?

  1. the tests added in this PR fail if you run them on master with force_row_wise=True
  2. the tests added in this PR fail on this PR branch (but not master) with force_row_wise=True

If it's number 1, then I support merging this PR as-is and fixing the other bug in a later PR.

If it's number 2, then that means that bug would be introduced by this PR, and I don't think it should be merged.

@tongwu-sh
Copy link
Contributor Author

Hi James, we found that when force_row_wise=true, the test case fails due to other problem

@shiyu1994 just to be sure I understand, which of these do you mean?

  1. the tests added in this PR fail if you run them on master with force_row_wise=True
  2. the tests added in this PR fail on this PR branch (but not master) with force_row_wise=True

If it's number 1, then I support merging this PR as-is and fixing the other bug in a later PR.

If it's number 2, then that means that bug would be introduced by this PR, and I don't think it should be merg ed.

@jameslamb thanks for the check here. It is for #1, "force_row_wise=True" same test would also repro on current master branch.

@jameslamb
Copy link
Collaborator

It is for #1

Got it, thanks! Then I think it's ok to merge this and fix that other bug separately. Thanks for explaining it to me 😄

Copy link
Collaborator

@jameslamb jameslamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through this!

Never considered, when I encountered #4601, that a source of randomness could be in the choice of row-wise vs. column-wise Dataset construction. So I learned something really important through reviewing this and through your explanations ❤️

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using feature_fraction and forced splits together can cause training failures
4 participants