Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path smoothing #2950

Merged
merged 14 commits into from
May 3, 2020
2 changes: 1 addition & 1 deletion .ci/test_r_package.sh
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ if grep -q -R "WARNING" "$LOG_FILE_NAME"; then
exit -1
fi

ALLOWED_CHECK_NOTES=3
ALLOWED_CHECK_NOTES=4
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb Should be 3, right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes @btrotta can you please change to 3?

A lot has changed in our CI since this PR was first opened, sorry 😬

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ALLOWED_CHECK_NOTES=4
ALLOWED_CHECK_NOTES=3

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb Is a similar increment needed for Windows?

$ALLOWED_CHECK_NOTES = 3

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, fixed now.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @jameslamb for Windows updates

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@StrikerRUS nothing is required on this PR since the Windows R tests are passing. We have it set to 3 and are generating exactly 3:

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jameslamb OK. I just was afraid that we can hit #2950 (comment) on Windows if let say the current size is 4.99 during some runs it can be calculated as 5, like in #2988.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do we can bump it back up and it wouldn't bother me. I just don't think this PR has to be the place where we deal with it. If CI is passing for the R package right now, nothing should be changed.

NUM_CHECK_NOTES=$(
cat ${LOG_FILE_NAME} \
| grep -e '^Status: .* NOTE.*' \
Expand Down
2 changes: 2 additions & 0 deletions docs/Parameters-Tuning.rst
Original file line number Diff line number Diff line change
Expand Up @@ -82,3 +82,5 @@ Deal with Over-fitting
- Try ``extra_trees``

.. _Optuna: https://medium.com/optuna/lightgbm-tuner-new-optuna-integration-for-hyperparameter-optimization-8b7095e99258

- Try increasing ``path_smoothing``
StrikerRUS marked this conversation as resolved.
Show resolved Hide resolved
14 changes: 14 additions & 0 deletions docs/Parameters.rst
Original file line number Diff line number Diff line change
Expand Up @@ -514,6 +514,20 @@ Learning Control Parameters

- applied once per forest

- ``path_smooth`` :raw-html:`<a id="path_smooth" title="Permalink to this parameter" href="#path_smooth">&#x1F517;&#xFE0E;</a>`, default = ``0``, type = double, constraints: ``path_smooth >= 0.0``

- controls smoothing applied to tree nodes

- helps prevent overfitting on leaves with few samples

- if set to zero, no smoothing is applied

- if `path_smooth > 0` then `min_data_in_leaf` must be at least `2`.

- larger values give stronger regularisation

- the weight of each node is `(n / path_smooth) * w + w_p / (n / path_smooth + 1)`, where `n` is the number of samples in the node, `w` is the calculated node weight, and `w_p` is the weight of the parent node

- ``verbosity`` :raw-html:`<a id="verbosity" title="Permalink to this parameter" href="#verbosity">&#x1F517;&#xFE0E;</a>`, default = ``1``, type = int, aliases: ``verbose``

- controls the level of LightGBM's verbosity
Expand Down
9 changes: 9 additions & 0 deletions include/LightGBM/config.h
Original file line number Diff line number Diff line change
Expand Up @@ -488,6 +488,15 @@ struct Config {
// desc = applied once per forest
std::vector<double> cegb_penalty_feature_coupled;

// check = >= 0.0
// desc = controls smoothing applied to tree nodes
// desc = helps prevent overfitting on leaves with few samples
// desc = if set to zero, no smoothing is applied
// desc = if `path_smooth > 0` then `min_data_in_leaf` must be at least `2`.
StrikerRUS marked this conversation as resolved.
Show resolved Hide resolved
// desc = larger values give stronger regularisation
// descl2 = the weight of each node is `(n / path_smooth) * w + w_p / (n / path_smooth + 1)`, where `n` is the number of samples in the node, `w` is the calculated node weight, and `w_p` is the weight of the parent node
double path_smooth = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more details about w and w_p?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more explanation, hope it's clearer now.


// alias = verbose
// desc = controls the level of LightGBM's verbosity
// desc = ``< 0``: Fatal, ``= 0``: Error (Warning), ``= 1``: Info, ``> 1``: Debug
Expand Down
5 changes: 3 additions & 2 deletions include/LightGBM/tree.h
Original file line number Diff line number Diff line change
Expand Up @@ -142,6 +142,9 @@ class Tree {
/*! \brief Get depth of specific leaf*/
inline int leaf_depth(int leaf_idx) const { return leaf_depth_[leaf_idx]; }

/*! \brief Get parent of specific leaf*/
inline int leaf_parent(int leaf_idx) const {return leaf_parent_[leaf_idx]; }

/*! \brief Get feature of specific split*/
inline int split_feature(int split_idx) const { return split_feature_[split_idx]; }

Expand All @@ -163,8 +166,6 @@ class Tree {
return split_feature_inner_[node_idx];
}

inline int leaf_parent(int leaf_idx) const { return leaf_parent_[leaf_idx]; }

inline uint32_t threshold_in_bin(int node_idx) const {
return threshold_in_bin_[node_idx];
}
Expand Down
8 changes: 8 additions & 0 deletions src/io/config.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -317,6 +317,14 @@ void Config::CheckParamConflict() {
force_col_wise = true;
force_row_wise = false;
}
// min_data_in_leaf must be at least 2 if path smoothing is active. This is because when the split is calculated
// the count is calculated using the proportion of hessian in the leaf which is rounded up to nearest int, so it can
// be 1 when there is actually no data in the leaf. In rare cases this can cause a bug because with path smoothing the
// calculated split gain can be positive even with zero gradient and hessian.
if (path_smooth > kEpsilon && min_data_in_leaf < 2) {
min_data_in_leaf = 2;
Log::Warning("min_data_in_leaf has been increased to 2 because this is required when path smoothing is active.");
}
if (is_parallel && monotone_constraints_method == std::string("intermediate")) {
// In distributed mode, local node doesn't have histograms on all features, cannot perform "intermediate" monotone constraints.
Log::Warning("Cannot use \"intermediate\" monotone constraints in parallel learning, auto set to \"basic\" method.");
Expand Down
5 changes: 5 additions & 0 deletions src/io/config_auto.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,7 @@ const std::unordered_set<std::string>& Config::parameter_set() {
"cegb_penalty_split",
"cegb_penalty_feature_lazy",
"cegb_penalty_feature_coupled",
"path_smooth",
"verbosity",
"input_model",
"output_model",
Expand Down Expand Up @@ -443,6 +444,9 @@ void Config::GetMembersFromString(const std::unordered_map<std::string, std::str
cegb_penalty_feature_coupled = Common::StringToArray<double>(tmp_str, ',');
}

GetDouble(params, "path_smooth", &path_smooth);
CHECK_GE(path_smooth, 0.0);

GetInt(params, "verbosity", &verbosity);

GetString(params, "input_model", &input_model);
Expand Down Expand Up @@ -646,6 +650,7 @@ std::string Config::SaveMembersToString() const {
str_buf << "[cegb_penalty_split: " << cegb_penalty_split << "]\n";
str_buf << "[cegb_penalty_feature_lazy: " << Common::Join(cegb_penalty_feature_lazy, ",") << "]\n";
str_buf << "[cegb_penalty_feature_coupled: " << Common::Join(cegb_penalty_feature_coupled, ",") << "]\n";
str_buf << "[path_smooth: " << path_smooth << "]\n";
str_buf << "[verbosity: " << verbosity << "]\n";
str_buf << "[max_bin: " << max_bin << "]\n";
str_buf << "[max_bin_by_feature: " << Common::Join(max_bin_by_feature, ",") << "]\n";
Expand Down
Loading