
Initial support for multi-target tree. #8616

Merged
merged 6 commits into from
Mar 22, 2023

Conversation

@trivialfis trivialfis commented Dec 20, 2022

This is a rough PR for early reviews and discussion; it contains bugs and unfinished code.

  • Add a new multi-target tree structure that's embedded into the existing regtree.
  • Add a new builder to hist.
  • Add a new evaluator to hist.

I try to reuse as much existing code as possible. For instance, there's no structural change to the histogram builder; the implementation simply iterates over a list of builders, one per target. This might change in the future as we move toward a more integrated implementation. The evaluation code has to be rewritten for performance. Lastly, there are optimization techniques for the multi-target tree when the number of targets is huge; known methods include summarizing the gradient, selecting the gradient, projecting the gradient, optimizing for sparse gradients, etc. I haven't implemented any of those yet; this PR covers only the core multi-target tree structure.

There are other cases where we have a vector leaf but not a multi-target tree grower. For instance, we might want the leaf to be a linear model, or it might contain extra parameters for a probability distribution. These will require new tree-training algorithms, but the tree structure is largely the same. This PR is a proof of concept.

The implementation is not yet as efficient as the single-target one, so it does not represent the theoretical performance of the strategy.

For small testing datasets, using a vector leaf might lead to significant overfitting.

What's working

  • Multi-target regression and single-target multi-class classification training using hist and gbtree, with most of the tree parameters except for the monotonic constraint. Numeric features only.
  • Inference on CPU, including QDM, inplace, and DMatrix for normal outputs (no leaf index, SHAP, etc.).

What's not working

Everything else.

Related

@trivialfis trivialfis force-pushed the multi-target-hist branch 2 times, most recently from c1cb30e to 0995252 Compare January 4, 2023 20:14
@trivialfis trivialfis force-pushed the multi-target-hist branch 2 times, most recently from 4a63670 to ec56a0f Compare February 9, 2023 09:30
@trivialfis trivialfis force-pushed the multi-target-hist branch 2 times, most recently from 0961133 to 99404c7 Compare March 6, 2023 13:06
@trivialfis trivialfis changed the title [WIP] Initial support for multi-target tree. Initial support for multi-target tree. Mar 17, 2023
@trivialfis trivialfis marked this pull request as ready for review March 17, 2023 14:00
@@ -310,14 +300,8 @@ void PredictBatchByBlockOfRowsKernel(DataView batch, gbm::GBTreeModel const &mod

FVecFill(block_size, batch_offset, num_feature, &batch, fvec_offset, p_thread_temp);
// process block of rows through all trees to keep cache locality
if (model.learner_model_param->IsVectorLeaf()) {
This is so we don't rely on the model parameter, which is not serialized into the JSON model.

@@ -530,17 +530,17 @@ class TensorView {
/**
* \brief Number of items in the tensor.
*/
LINALG_HD [[nodiscard]] std::size_t Size() const { return size_; }
[[nodiscard]] LINALG_HD std::size_t Size() const { return size_; }
clangd is not quite happy about the placement of the C++ attribute when running in CUDA mode.

@@ -352,19 +352,6 @@ struct WQSummary {
prev_rmax = data[i].rmax;
}
}
// check consistency of the summary
Unused function.

#include "xgboost/objective.h"
#include "xgboost/predictor.h"
#include "xgboost/string_view.h"
#include "xgboost/string_view.h" // for StringView
I kept the custom string view for now. Some changes in C++20 string_view might be useful; we can back-port them to XGBoost when needed.

monitor_->Stop(__func__);
}

void LeafPartition(RegTree const &tree, linalg::MatrixView<GradientPair const> gpair,
This is not used yet. We need some work on L1 and quantile regression for estimating the vector leaf.

@@ -230,6 +236,11 @@ def main(args: argparse.Namespace) -> None:
parser.add_argument("--format", type=int, choices=[0, 1], default=1)
parser.add_argument("--type-check", type=int, choices=[0, 1], default=1)
parser.add_argument("--pylint", type=int, choices=[0, 1], default=1)
parser.add_argument(
"--fix",
A new argument for convenience.

@@ -32,6 +32,19 @@ def train_result(param, dmat: xgb.DMatrix, num_rounds: int) -> dict:
return result


class TestGPUUpdatersMulti:
@trivialfis commented Mar 17, 2023
I have extracted all the multi-target/class datasets into an independent hypothesis search strategy. Other than the test for CPU hist, no testing logic is changed.

@@ -352,137 +352,6 @@ def __repr__(self) -> str:
return self.name


@memory.cache
pylint complains about the file being too huge (> 1000 LOC), so I moved some of the data fetchers into testing/data.py.

trivialfis commented Mar 17, 2023

Not sure if this is useful, but you can do it just for fun:

def alternate(plot_result: bool) -> None:
    """Draw a circle with 2-dim coordinates as target variables."""
    import xgboost as xgb
    from xgboost.callback import TrainingCallback

    class ResetStrategy(TrainingCallback):
        """Alternate the tree strategy between boosting iterations."""

        def before_iteration(self, model, epoch: int, evals_log) -> bool:
            strategy = "multi_output_tree" if epoch % 2 == 0 else "one_output_per_tree"
            model.set_param({"multi_strategy": strategy})
            return False  # do not stop training

    X, y = gen_circle()  # data helper from the accompanying demo script
    # Train a regressor on it
    reg = xgb.XGBRegressor(
        tree_method="hist",
        n_estimators=4,
        n_jobs=1,
        max_depth=8,
        subsample=0.6,
        callbacks=[ResetStrategy()],
    )
    reg.fit(X, y, eval_set=[(X, y)])

- Add new hist tree builder.
- Move data fetchers for tests.
- Dispatch function calls in gbm based on the tree type.
@trivialfis trivialfis merged commit 151882d into dmlc:master Mar 22, 2023
@trivialfis trivialfis deleted the multi-target-hist branch March 22, 2023 15:50
@s-banach

(Just seeing this for the first time, haven't put much brain power into it yet.)
Will it be easy to use this where the multiple targets are different parameters of a random variable?
E.g. shape and scale of a gamma variable.

@StatMixedML

@s-banach

Essentially, multi-target trees can be used whenever there is more than one parameter to predict. This is useful if you want to model all parameters of a univariate or multivariate parametric distribution; see:

  • Multi-Target XGBoostLSS Regression
  • Distributional Gradient Boosting Machines
  • XGBoostLSS - An extension of XGBoost to probabilistic forecasting
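To make the distributional use case concrete: each row's "targets" become the parameters of its predictive distribution, and the objective supplies one gradient per parameter, which a multi-target tree can fit with a vector leaf. Below is a minimal, library-free sketch of such a gradient, assuming a Gaussian parameterized by mean `mu` and log-scale `log_sigma` (the function name and parameterization are illustrative, not taken from XGBoostLSS):

```python
import math

def gaussian_nll_grad(y: float, mu: float, log_sigma: float) -> tuple[float, float]:
    """Gradient of the Gaussian negative log-likelihood w.r.t. (mu, log_sigma).

    NLL = 0.5 * log(2 * pi) + log_sigma + (y - mu)**2 / (2 * sigma**2)
    Parameterizing by log_sigma keeps sigma positive without constraints.
    """
    inv_var = math.exp(-2.0 * log_sigma)         # 1 / sigma**2
    d_mu = -(y - mu) * inv_var                   # dNLL/dmu
    d_log_sigma = 1.0 - (y - mu) ** 2 * inv_var  # dNLL/dlog_sigma
    return d_mu, d_log_sigma

# A multi-target booster would treat (d_mu, d_log_sigma) as the per-row
# gradient vector when growing a tree with 2-dimensional leaves.
g_mu, g_s = gaussian_nll_grad(y=1.0, mu=1.0, log_sigma=0.0)
```

At `mu == y` the mean gradient vanishes while the log-scale gradient is 1, pushing the predicted scale down toward the observed spread; a shape/scale gamma parameterization would follow the same pattern with its own NLL.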
