LightGBMError: Check failed: best_split_info.left_count > 0 for ranking task #2742
Hi, when will LightGBM 2.3.2 be released? The online documentation (https://lightgbm.readthedocs.io/) is for 2.3.2 with some new features, but the latest Python package is still 2.3.1. Thanks.
Comments
@kenyeung128 Hi! You can download the latest nightly version from this page: https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html if you do not want to build from sources. I think we need some time to test the large changes introduced in #2699, fix some critical issues, wait for any news on #2628, and so on. Also, I believe the next release should be at least 2.4.0, or even 3.0.0, according to the semantic versioning we try to follow, due to the removal of some parameters. cc @guolinke
Hi @StrikerRUS, thanks for the info. I have followed the installation guide and installed the Python package from the latest source code. However, I get the error below when I try to fit an LGBMRanker: LightGBMError: Check failed: best_split_info.left_count > 0 at /data/github/LightGBM/src/treelearner/serial_tree_learner.cpp, line 706 . Do you have any idea?
@kenyeung128 can you provide a reproducible example?
Hi @guolinke, the example is like this: gbm = lgb.LGBMRanker(n_estimators=800, score='ncdg'). I found that the root cause seems to be the use of categorical_feature: there are categories present in x_train but not in x_valid, so it throws the exception. In the previous version, LightGBM 2.3.1, it was working.
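As an aside for anyone debugging the same symptom, a minimal check along these lines can confirm whether a declared categorical column has levels that appear in the training split but not in the validation split; the frame and column names below are placeholders, not taken from this report:

```python
import pandas as pd

def category_mismatch(x_train: pd.DataFrame, x_valid: pd.DataFrame, col: str):
    """Return categories seen only in the train split and only in the valid split."""
    train_levels = set(x_train[col].dropna().unique())
    valid_levels = set(x_valid[col].dropna().unique())
    return train_levels - valid_levels, valid_levels - train_levels

# Usage sketch -- categorical_cols is whatever list was passed as categorical_feature:
# for col in categorical_cols:
#     train_only, valid_only = category_mismatch(x_train, x_valid, col)
#     print(col, "train-only:", train_only, "valid-only:", valid_only)
```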
Thanks @kenyeung128, could you provide the data for debugging?
@guolinke it's quite a big dataset. I tried to reproduce it with a smaller dataset, but it didn't produce the error. I suspect it's something to do with the split of the data not being evenly distributed between train/valid (for the categorical features), and somehow that throws the exception.
@kenyeung128 is it possible to reproduce this with randomly generated data?
@guolinke I've also been running into this error when fitting models with categorical predictors in the latest master branch. Here is a reproducible example in R v3.5.2:

Code:

library(lightgbm)

set.seed(1)
data <- data.frame(
  y = as.integer(runif(1000) > .5)
  ,x = sample(c(1,1,1,2), 1000, replace = TRUE)
)
data_matrix <- as.matrix(data[, "x", drop = FALSE])
dtrain <- lgb.Dataset(data_matrix, label = data$y, categorical_feature = "x")
model <- lgb.train(
  params = list(objective = "binary")
  ,data = dtrain
  ,nrounds = 1
)

Error:

[LightGBM] [Fatal] Check failed: best_split_info.right_count > 0 at /tmp/RtmpE5HQrB/R.INSTALL1226e7bebac9e/lightgbm/src/src/treelearner/serial_tree_learner.cpp, line 706 .
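For Python users following along, a rough equivalent of the R reproduction above, written as a sketch against the standard lightgbm Python API (this snippet was not posted in the thread), would be:

```python
import numpy as np
import pandas as pd
import lightgbm as lgb

rng = np.random.default_rng(1)
df = pd.DataFrame({
    # Roughly balanced binary label.
    "y": (rng.random(1000) > 0.5).astype(int),
    # Heavily skewed categorical feature: the value 2 is rare.
    "x": rng.choice([1, 1, 1, 2], size=1000),
})

dtrain = lgb.Dataset(df[["x"]], label=df["y"], categorical_feature=["x"])
booster = lgb.train({"objective": "binary"}, dtrain, num_boost_round=1)
```

Depending on the LightGBM version and build, this may or may not trigger the same check failure, but it mirrors the setup above: a single, heavily imbalanced categorical feature.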
@kenyeung128 @rgranvil thanks very much, could you try #2824?
I still hit this issue when I use GPU.
I also hit this issue on the GPU build.
I also see this issue occurring from time to time on a relatively fresh GPU build. For a ~4 GB dataset with 250,000 rows and 1,500 columns it can train normally for hours, then I get:
I'm encountering a very similar issue on 3.0.0 on GPU. I dropped my categorical columns because I was concerned they were the cause (based on another issue on GitHub), but I still get the error, though perhaps less frequently. Also, I seem to hit the error quite randomly; for instance, this set of parameters, when passed to the train call, produced the error:
But these two did not:
And the exact error that I received:
Just for reference, here is my original function call:
Is this still going on? We should start a new issue.
I have this error on CPU with the latest version on Windows 10 x64. Data is private so I sadly can't share it.
Same here on CPU on macOS using virtualenv.
Dependencies:
The df is a pickled (.pkl) file that works fine in Kaggle notebooks.
@muttoni @chutcheson Could you provide a reproducible example, or, if possible, share your dataset with us? We are trying to remove this bug before the next release. Any help is really appreciated. Thanks.
The issue is reproducible with the attached dataset. @shiyu1994 @StrikerRUS
Error message: LightGBMError: Check failed: (best_split_info.left_count) > (0) at D:\a\1\s\python-package\compile\src\treelearner\serial_tree_learner.cpp, line 651 .
Code to reproduce:
A few observations:
I ran into the same problem here when using multiclassova. Multiclass worked nicely. |
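For context, multiclass and multiclassova are both built-in LightGBM objectives, so the difference between the working and failing runs described above presumably comes down to a single parameter. A sketch under that assumption (num_class is an arbitrary placeholder, not the poster's value):

```python
import lightgbm as lgb

# Both objectives require num_class; 3 is a placeholder here.
params_working = {"objective": "multiclass", "num_class": 3}     # softmax, reported to work
params_failing = {"objective": "multiclassova", "num_class": 3}  # one-vs-all, reported to hit the check failure

# booster = lgb.train(params_failing, dtrain)  # dtrain: a pre-built lgb.Dataset
```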
I also have this issue when using a similar workflow to the example @arkothiwala posted.
@arkothiwala @zacheberhart-kd Thanks! The example is OK with the latest master branch. You can clone the source code and build the Python package from source.
I built from source today and have this error on GPU on both Windows and Linux.
Params = {
[LightGBM] [Info] This is the GPU trainer!!
Traceback (most recent call last):
@Teeeto Increasing the number of features resolved the exception for me on CPU. |
I don't have this issue on CPU. However, CPU is slow: one iteration per 2 minutes on my dataset (on a 60-core machine).
Thank you! I think it would be great to prioritize this fix, since the issue is prohibitive in certain use cases.
Just built on Linux; the problem still persists. There is a circle of issues referencing each other, all closed, and I'm not sure which one to reopen.
The issue disappeared when:
Summary of this entire thread:
The solution from @Jumabek might work on Linux, but on Windows pip does not have v3.2.1 yet. If using Anaconda, install LightGBM v3.2.1:
Wheel file
I've been consistently having this issue in Kaggle and Colab notebooks (package v3.3.2) while using the HDFSequence example and a GPU when I increase the number of leaves. Is there a generally accepted fix for this issue?
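For reference, the kind of configuration being described (GPU device plus a larger num_leaves on top of the HDF5 Sequence example) looks roughly like the sketch below; the parameter values are illustrative assumptions, not taken from those notebooks:

```python
import lightgbm as lgb

params = {
    "objective": "binary",   # placeholder objective
    "device_type": "gpu",    # run on the GPU build
    "num_leaves": 255,       # larger values reportedly make the failure more likely here
    "learning_rate": 0.1,
}
# booster = lgb.train(params, dtrain)  # dtrain: an lgb.Dataset, e.g. built from lightgbm.Sequence readers
```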
I also observe this issue with LightGBM version 3.3.2 and a GPU. My setup, for what it's worth:
This issue has been automatically locked because there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues, including a reference to this one.