[R-package] How to extract cross-validation evaluation metrics from lgb.cv()? (#5571)

Dear Developers:

I am using the R package lightgbm. I first split my data into train and test sets. Then I want to run 2-fold cross-validation on my training data to tune parameters, and I would like to extract cross-validation evaluation metrics such as 'auc'. For 2-fold cross validation, there are two iterations, so there are two evaluation metrics 'auc' computed from the held-out data. May I ask how to extract the cross-validation evaluation metric that is the mean of these two AUCs? I am not sure whether I could use $best_score to extract it.

Thanks a lot for your help!
Thanks for using LightGBM! I can help with this. First, it's important to understand that 2-fold cross-validation does not mean "there are two iterations". It means that 2 separate LightGBM models will be trained on different randomly-selected subsets of the training data.
Here's an example showing how to perform 2-fold cross-validation for a binary classification problem, using 5 boosting rounds (i.e. training 5 trees in each model). This code will work with the latest release of LightGBM on CRAN (v3.3.3) and with the latest development version from this git repository.

library(lightgbm)
# create a dataset for binary classification task "is this iris a setosa?"
data("iris")
feature_names <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
target_col <- "is_setosa"
irisDF <- data.frame(iris)
irisDF[[target_col]] <- as.integer(iris[["Species"]] == "setosa")
dtrain <- lightgbm::lgb.Dataset(
    data = data.matrix(irisDF[, feature_names])
    , label = irisDF[[target_col]]
    , params = list(
        min_data_in_bin = 1L
    )
)
# perform 2-fold cross-validation
num_boosting_rounds <- 5L
cv_bst <- lightgbm::lgb.cv(
    data = dtrain
    , nrounds = num_boosting_rounds
    , nfold = 2L
    , params = list(
        objective = "binary"
        , metric = c("auc", "binary_error")
        , num_leaves = 2L
        , min_data_in_leaf = 1L
    )
    , showsd = FALSE
)
# view out-of-fold binary error and AUC (averaged over the two models)
cv_metrics <- cv_bst[["record_evals"]][["valid"]]
metricDF <- data.frame(
    iteration = seq_len(num_boosting_rounds)
    , auc = round(unlist(cv_metrics[["auc"]][["eval"]]), 3L)
    , binary_error = round(unlist(cv_metrics[["binary_error"]][["eval"]]), 3L)
)
metricDF
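
On your question about $best_score: the lgb.CVBooster object returned by lgb.cv() does carry best_iter and best_score fields, but if I remember correctly they are populated by the early-stopping callback, so you need to pass early_stopping_rounds for them to be filled in (please verify against ?lgb.cv for your installed version). A minimal sketch, reusing the dtrain and num_boosting_rounds objects from above:

# re-run with early stopping so that best_iter / best_score get populated
cv_bst <- lightgbm::lgb.cv(
    data = dtrain
    , nrounds = num_boosting_rounds
    , nfold = 2L
    , params = list(
        objective = "binary"
        , metric = "auc"
        , num_leaves = 2L
        , min_data_in_leaf = 1L
    )
    , early_stopping_rounds = 2L
    , showsd = FALSE
)

# iteration with the best mean out-of-fold AUC, and that mean AUC itself
cv_bst$best_iter
cv_bst$best_score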
Hope that helps! Sorry about the confusion; we will try to improve the documentation on this.
This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!
The evaluation_log of xgboost cross-validation may look like this:

iter    train_rmse_mean    train_rmse_std    test_rmse_mean    test_rmse_std

That means the training metrics are also there. Thanks for your help!
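
For anyone looking for the same thing: recent versions of lgb.cv() accept an eval_train_metric argument that records metrics on the training folds too, and with showsd = TRUE the per-iteration standard deviations across folds are stored under "eval_err" next to the means. This is a sketch along those lines; whether your installed version supports eval_train_metric is an assumption worth double-checking against ?lgb.cv:

# record metrics on the training folds as well (assumes a LightGBM version
# whose lgb.cv() supports eval_train_metric)
cv_bst <- lightgbm::lgb.cv(
    data = dtrain
    , nrounds = num_boosting_rounds
    , nfold = 2L
    , params = list(
        objective = "binary"
        , metric = "auc"
        , num_leaves = 2L
        , min_data_in_leaf = 1L
    )
    , showsd = TRUE
    , eval_train_metric = TRUE
)

# means and standard deviations across folds, like xgboost's evaluation_log
logDF <- data.frame(
    iter = seq_len(num_boosting_rounds)
    , train_auc_mean = unlist(cv_bst[["record_evals"]][["train"]][["auc"]][["eval"]])
    , train_auc_std = unlist(cv_bst[["record_evals"]][["train"]][["auc"]][["eval_err"]])
    , test_auc_mean = unlist(cv_bst[["record_evals"]][["valid"]][["auc"]][["eval"]])
    , test_auc_std = unlist(cv_bst[["record_evals"]][["valid"]][["auc"]][["eval_err"]])
)
logDF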
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.