
[R-package] how to extract cross validation evaluation metrics from lgb.cv()? #5571

Closed
yilinwu123 opened this issue Nov 4, 2022 · 5 comments

@yilinwu123

Dear Developers:

I am using the R package lightgbm. I first split my data into a train and a test set. Then I want to conduct 2-fold cross-validation on my training data to tune parameters. I would like to extract cross-validation evaluation metrics such as 'auc'. For 2-fold cross-validation there are two iterations, so there are two 'auc' evaluation metrics predicted from the held-out data. May I ask how to extract the cross-validation evaluation metric which is the mean of these two AUCs? I am not sure whether I could use $best_score to extract it.

Thanks a lot for your help!

@jameslamb jameslamb changed the title R package lgb.cv "how to extract cross validation evaluation metrics?" [R-package] how to extract cross validation evaluation metrics from lgb.cv()? Nov 5, 2022
@jameslamb
Collaborator

Thanks for using LightGBM! I can help with this.

First, it's important to understand: 2-fold cross-validation does not mean "there are two iterations". It means that 2 separate LightGBM models will be trained, each on a different randomly-selected subset of the training data and evaluated on the held-out remainder.

lgb.cv() returns a LightGBM CVBooster object. Metrics evaluated on the out-of-fold data (averaged across all models) are available in the $record_evals$valid attribute of that object.

Here's an example showing how to perform 2-fold cross validation for a binary classification problem, using 5 boosting rounds (i.e. training 5 trees in each model).

This code will work on the latest release of LightGBM on CRAN (v3.3.3) and with the latest development version from this git repository.

library(lightgbm)

# create a dataset for binary classification task "is this iris a setosa?"
data("iris")

feature_names <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
target_col <- "is_setosa"

irisDF <- data.frame(iris)
irisDF[[target_col]] <- as.integer(iris[["Species"]] == "setosa")

dtrain <- lightgbm::lgb.Dataset(
    data = data.matrix(irisDF[, feature_names])
    , label = irisDF[[target_col]]
    , params = list(
        min_data_in_bin = 1L
    )
)

# perform 2-fold cross-validation
num_boosting_rounds <- 5L
cv_bst <- lightgbm::lgb.cv(
    data = dtrain
    , nrounds = num_boosting_rounds
    , nfold = 2L
    , params = list(
        objective = "binary"
        , metric = c("auc", "binary_error")
        , num_leaves = 2L
        , min_data_in_leaf = 1L
    )
    , showsd = FALSE
)

# view out-of-fold binary error and AUC (averaged over the two models)
cv_metrics <- cv_bst[["record_evals"]][["valid"]]
metricDF <- data.frame(
    iteration = seq_len(num_boosting_rounds)
    , auc = round(unlist(cv_metrics[["auc"]][["eval"]]), 3L)
    , binary_error = round(unlist(cv_metrics[["binary_error"]][["eval"]]), 3L)
)
metricDF

metricDF in this sample code contains the values of two metrics (AUC and binary error) averaged across both models.

  iteration   auc binary_error
1         1 0.995        0.333
2         2 0.995        0.333
3         3 0.995        0.193
4         4 0.995        0.007
5         5 0.995        0.007
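
To get the single summary number the question asked about (the mean AUC across the two folds) at any given iteration, you can reduce that per-iteration vector yourself. Here's a minimal sketch building on the cv_bst object from the example above; the comment about eval_err is an assumption about where showsd = TRUE stores the fold-wise standard deviations:

# per-iteration out-of-fold AUC, already averaged across the 2 folds
mean_auc_per_iter <- unlist(cv_bst[["record_evals"]][["valid"]][["auc"]][["eval"]])

# boosting round with the best mean AUC, and that AUC itself
best_iter <- which.max(mean_auc_per_iter)
best_auc <- mean_auc_per_iter[best_iter]
print(sprintf("best mean AUC: %.3f (iteration %d)", best_auc, best_iter))

# assumption: with showsd = TRUE (the default), the per-iteration standard
# deviation across folds should also be recorded, under 'eval_err'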

Hope that helps! Sorry about the confusion; we will try to improve the documentation on the record_evals property and its interpretation in the future.

@github-actions

github-actions bot commented Dec 5, 2022

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!

@github-actions github-actions bot closed this as completed Dec 5, 2022
@mocista

mocista commented Jul 26, 2023

The evaluation_log of xgboost cross-validation may look like this:

   iter train_rmse_mean train_rmse_std test_rmse_mean test_rmse_std
1:    1            3098           9.72           3052         41.00
2:    2            3002           9.22           3011         40.98

That means the training metrics are also there.
In lightgbm this does not seem to be the case; I only see the validation metrics (e.g. lgb_cv$record_evals$valid$rmse$eval).
Is there a way to also get the training metrics?

Thanks for your help!

@jmoralez
Collaborator

Hey @mocista. You'll be able to use the eval_train_metric argument of lgb.cv() (added in #4918) in lightgbm>=4.0.0; however, we're still in the process of publishing that version to CRAN. You can subscribe to #5987 to track when we do.

Sorry for the inconvenience.
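
Once a lightgbm>=4.0.0 build is installed, a minimal sketch of what that could look like, reusing dtrain and the params from the example earlier in this thread (the exact location of the averaged training metrics, record_evals$train, is an assumption here, mirroring the $valid entry):

# sketch assuming lightgbm >= 4.0.0 and the dtrain object from above
cv_bst <- lightgbm::lgb.cv(
    data = dtrain
    , nrounds = 5L
    , nfold = 2L
    , params = list(
        objective = "binary"
        , metric = "binary_error"
        , num_leaves = 2L
        , min_data_in_leaf = 1L
    )
    , eval_train_metric = TRUE
)

# assumed location of the averaged training metrics, mirroring $valid
train_error <- unlist(cv_bst[["record_evals"]][["train"]][["binary_error"]][["eval"]])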

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2023