Consistency of min_child_weight parameter #5444

Open
RAMitchell opened this issue Mar 27, 2020 · 2 comments

Comments

@RAMitchell
Member

The effect of the min_child_weight parameter (default 1.0) depends on how the objective function is scaled. I noticed this while developing a new objective function with a small hessian: the tree could not grow with the default parameters. As a consequence, objectives like squared error and logistic loss are regularised very differently. With logistic loss, for example, the hessian values can be much smaller, so a node can need many more training instances before a split is allowed. #2483 notes that for logistic loss the hessian is proportional to the variance, but this is not true of other objectives in general.
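To make the scaling difference concrete, here is a minimal sketch. It assumes the usual per-instance hessians (1 for squared error, p(1 - p) for logistic loss) and that a child is accepted once its summed hessian reaches min_child_weight. The helper names are hypothetical and are not part of the XGBoost API.

```python
import math


def squared_error_hessian() -> float:
    # Second derivative of 0.5 * (pred - y)^2 w.r.t. pred is always 1.
    return 1.0


def logistic_hessian(margin: float) -> float:
    # For logistic loss the hessian is p * (1 - p), which is at most 0.25.
    p = 1.0 / (1.0 + math.exp(-margin))
    return p * (1.0 - p)


def instances_needed(hessian_per_instance: float, min_child_weight: float = 1.0) -> int:
    # Hypothetical helper: number of identical instances whose summed
    # hessian reaches min_child_weight.
    return math.ceil(min_child_weight / hessian_per_instance)


if __name__ == "__main__":
    print("squared error:", instances_needed(squared_error_hessian()))
    for margin in (0.0, 2.0, 4.0):
        h = logistic_hessian(margin)
        print(f"logistic, margin={margin}: hessian={h:.4f}, "
              f"instances needed={instances_needed(h)}")
```

Under these assumptions a single instance satisfies the default for squared error, while logistic loss needs at least four, and more as predictions become confident.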

This is relevant to the task of finding good default parameters across a range of objectives (#4986).

One obvious solution is normalising all objective functions in some consistent way.

Another solution is deprecating min_child_weight and moving to a parameter like min_child_instances, regularising based on the amount of training data without respect to the objective function.
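As a rough sketch of the difference between the two rules: min_child_instances is only the parameter proposed above, not an existing one, and both check functions are hypothetical illustrations rather than XGBoost code.

```python
from typing import Sequence


def child_allowed_by_weight(hessians: Sequence[float],
                            min_child_weight: float = 1.0) -> bool:
    # Current style of rule: compare the summed hessian in the child
    # against the threshold, so the result depends on objective scaling.
    return sum(hessians) >= min_child_weight


def child_allowed_by_instances(hessians: Sequence[float],
                               min_child_instances: int = 1) -> bool:
    # Proposed style of rule (hypothetical): compare the row count only,
    # so the result is independent of the objective function.
    return len(hessians) >= min_child_instances


if __name__ == "__main__":
    # Ten instances from an objective with a small per-instance hessian.
    small_hessians = [0.01] * 10
    print(child_allowed_by_weight(small_hessians))        # summed hessian is about 0.1
    print(child_allowed_by_instances(small_hessians, 5))  # ten rows in the child
```

With the default threshold of 1.0, the weight-based rule rejects this child while the count-based rule accepts it, which is the kind of inconsistency described above.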

@trivialfis has also proposed implementing multiclass objective functions via vector leaves. If we do this, the hessian will be a vector, and it is not obvious how to apply min_child_weight correctly.

@thvasilo
Contributor

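This is a good point. For comparison, LightGBM uses a default of 1e-3 for its equivalent hessian threshold (min_sum_hessian_in_leaf), and separately requires a minimum of 20 data points per leaf (min_data_in_leaf).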

@QuantHao

QuantHao commented Aug 6, 2020

I think one problem with replacing min_child_weight with min_child_instances is how to deal with sample weights. A good question along these lines was raised and answered in a LightGBM issue.
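To illustrate the concern, here is a small sketch. It assumes sample weights multiply each instance's hessian, and it shows two hypothetical ways a count-based rule could measure a child. It is not the answer given in the LightGBM issue, and the function names are made up for illustration.

```python
from typing import Sequence


def weighted_hessian_sum(hessians: Sequence[float],
                         weights: Sequence[float]) -> float:
    # What a hessian-based rule sees when weights scale each hessian.
    return sum(w * h for w, h in zip(weights, hessians))


def raw_count(weights: Sequence[float]) -> int:
    # Option A (hypothetical): count rows and ignore sample weights.
    return len(weights)


def weighted_count(weights: Sequence[float]) -> float:
    # Option B (hypothetical): treat the sum of weights as the count.
    return sum(weights)


if __name__ == "__main__":
    hessians = [1.0, 1.0, 1.0, 1.0]  # squared error: hessian 1 per row
    weights = [0.1, 0.1, 0.1, 0.1]   # every row is down-weighted
    print(weighted_hessian_sum(hessians, weights))
    print(raw_count(weights))
    print(weighted_count(weights))
```

Here four down-weighted rows count as four instances under option A but as roughly 0.4 under option B, so the choice changes which splits are allowed.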
