Lightgbm CPU learner error - lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) #3489

diditforlulz273 · 2020-10-26T18:31:28Z

How you are using LightGBM?

LightGBM component:

Environment info

Operating System:
Ubuntu 20.04
CPU/GPU model:
Threadripper 1920x
C++ compiler version:
gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)
CMake version:
3.16.3
Java version:

Python version:
3.8.5
R version:

Other:

LightGBM version or commit hash:
Version: 3.0.0

Error message and / or logs

@guolinke I have just built it from the latest master branch, and it still fails. I'll try to isolate a minimal reproducible example and create an issue then.

Originally posted by @diditforlulz273 in #2793 (comment)

[LightGBM] [Fatal] Check failed: (best_split_info.left_count) > (0) at /__w/1/s/python-package/compile/src/treelearner/serial_tree_learner.cpp, line 630 .

Traceback (most recent call last):
  File "/home/seva/PycharmProjects/ECOM_demand/lgbm_mre.py", line 41, in <module>
    model = lgb.train(lgb_params, train_dat, valid_sets=test_dat, verbose_eval=20)
  File "/home/seva/PycharmProjects/ECOM_demand/venv/lib/python3.8/site-packages/lightgbm/engine.py", line 252, in train
    booster.update(fobj=fobj)
  File "/home/seva/PycharmProjects/ECOM_demand/venv/lib/python3.8/site-packages/lightgbm/basic.py", line 2370, in update
    _safe_call(_LIB.LGBM_BoosterUpdateOneIter(
  File "/home/seva/PycharmProjects/ECOM_demand/venv/lib/python3.8/site-packages/lightgbm/basic.py", line 55, in _safe_call
    raise LightGBMError(decode_string(_LIB.LGBM_GetLastError()))
lightgbm.basic.LightGBMError: Check failed: (best_split_info.left_count) > (0) at /__w/1/s/python-package/compile/src/treelearner/serial_tree_learner.cpp, line 630 .

Reproducible example(s)

Archive with code and pickled datasets, 4.5 kb total (lol really a MINIMAL reproducible example)

https://drive.google.com/file/d/1y04Z_11Ce-sETRZPfwjBEBY-aSOMBcHK/view?usp=sharing

Steps to reproduce

1. Run it on lgbm==3.0.0
2. ?????
3. NO PROFIT!
4. Run on lgbm==2.3.1
5. ?????
6. PROFIT!

guolinke · 2020-10-27T03:35:02Z

Thanks @diditforlulz273
It seems I hit an error when loading the data.

>>> train_x = pd.read_pickle('train_x.pkl')
>>> test_x = pd.read_pickle('test_x.pkl')
>>> train_y = np.loadtxt('train_y.txt')
>>> test_y = np.loadtxt('test_y.txt')
>>> train_x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\guoke\Anaconda3\lib\site-packages\pandas\core\frame.py", line 680, in __repr__
    self.to_string(
  File "C:\Users\guoke\Anaconda3\lib\site-packages\pandas\core\frame.py", line 801, in to_string
    formatter = fmt.DataFrameFormatter(
  File "C:\Users\guoke\Anaconda3\lib\site-packages\pandas\io\formats\format.py", line 593, in __init__
    self.max_rows_displayed = min(max_rows or len(self.frame), len(self.frame))
  File "C:\Users\guoke\Anaconda3\lib\site-packages\pandas\core\frame.py", line 1041, in __len__
    return len(self.index)
  File "C:\Users\guoke\Anaconda3\lib\site-packages\pandas\core\generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
  File "pandas\_libs\properties.pyx", line 63, in pandas._libs.properties.AxisProperty.__get__
  File "C:\Users\guoke\Anaconda3\lib\site-packages\pandas\core\generic.py", line 5270, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute '_data'

Could you share the raw data, or the data in NumPy format?

diditforlulz273 · 2020-10-27T07:02:21Z

I guess the problem is a mismatch in the Pandas version or in Python's underlying pickle module.
I used Pandas 1.1.3 and Python 3.8.5.

Sharing NumPy arrays is nearly impossible: the train dataframe has ~54 columns.
If matching the versions doesn't help, I could share the data as .csv, although saving to .csv sometimes transforms ints like 7 to floats like 7.000000039 in an unpredictable manner, which can affect reproducibility.
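
For anyone hitting the same pickle mismatch, one way to share a many-column dataframe without pickle or .csv is a single .npz archive with one array per column. This is only a sketch under assumptions: `save_frame_npz` and `load_frame_npz` are hypothetical helper names, the columns are assumed to be numeric with string names, and no column is assumed to be named `file` (which would clash with the first argument of `np.savez`).

```python
import os
import tempfile

import numpy as np
import pandas as pd


def save_frame_npz(df, path):
    """Store each column as its own array so per-column dtypes survive."""
    # Assumption: all columns are numeric and column names are strings.
    np.savez(path, **{col: df[col].to_numpy() for col in df.columns})


def load_frame_npz(path):
    """Rebuild the dataframe from the per-column arrays, in saved order."""
    with np.load(path) as data:
        return pd.DataFrame({name: data[name] for name in data.files})


# Small illustrative frame (made-up values, not the data from this issue).
example = pd.DataFrame({"qty": np.array([7, 3, 0], dtype=np.int64),
                        "price": np.array([1.5, 2.25, 9.0])})

with tempfile.TemporaryDirectory() as tmp:
    archive = os.path.join(tmp, "train_x.npz")
    save_frame_npz(example, archive)
    restored = load_frame_npz(archive)

# Integer columns stay integers; nothing is converted through text.
pd.testing.assert_frame_equal(example, restored)
```

Because the values never pass through a text representation, an int such as 7 comes back as exactly 7 with its original dtype.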

guolinke · 2020-10-27T09:23:51Z

@diditforlulz273 I found the root cause: you cannot set both min_data_in_leaf and min_child_weight to 0.
A leaf must contain at least one sample; otherwise, it cannot be a leaf.
We will fix the parameter checking so that an error is thrown before training starts.
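
Until a release with that check is installed, the same condition can be tested on the params dict before calling `lgb.train`. The following is a rough sketch, not LightGBM's own validation: `check_leaf_constraints` is a hypothetical helper, and the alias lists and default values (20 and 1e-3) are assumptions based on my reading of the LightGBM parameter documentation.

```python
# Hypothetical pre-flight check; NOT part of the LightGBM API.
# Alias lists and defaults are assumptions taken from the parameter docs.
MIN_DATA_ALIASES = (
    "min_data_in_leaf", "min_data_per_leaf", "min_data",
    "min_child_samples", "min_samples_leaf",
)
MIN_HESSIAN_ALIASES = (
    "min_sum_hessian_in_leaf", "min_sum_hessian_per_leaf",
    "min_sum_hessian", "min_hessian", "min_child_weight",
)


def _lookup(params, aliases, default):
    """Return the value of the first alias present in params, else default."""
    for name in aliases:
        if name in params:
            return params[name]
    return default


def check_leaf_constraints(params):
    """Raise ValueError if both leaf-size constraints are disabled."""
    min_data = _lookup(params, MIN_DATA_ALIASES, 20)
    min_hessian = _lookup(params, MIN_HESSIAN_ALIASES, 1e-3)
    if min_data <= 0 and min_hessian <= 0:
        raise ValueError(
            "min_data_in_leaf and min_sum_hessian_in_leaf (min_child_weight) "
            "cannot both be 0: a leaf needs at least one sample"
        )
    return params


# Keeping one of the two constraints positive passes the check.
safe_params = check_leaf_constraints({"min_data_in_leaf": 1, "min_child_weight": 0})
```

With `min_data_in_leaf` set to 1 and `min_child_weight` left at 0, the check passes while still letting LightGBM build trees on a very small dataset.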

diditforlulz273 · 2020-10-27T11:53:37Z

@guolinke Well, indeed, this combination looks stupid, my bad. I used this tiny dataset in some of my integration tests and, I guess, set all the possible constraints to 0 so that LightGBM would build at least some trees and predict something reproducible. Anyway, it worked fine in version 2.3.1 :)

github-actions · 2023-08-23T20:47:04Z

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

guolinke mentioned this issue Oct 27, 2020

avoid min_data and min_hessian are zeros at the same time #3492

Merged

guolinke closed this as completed in #3492 Oct 28, 2020

github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023
