-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[python] handle arbitrary length feature names in Python-package #4293
Conversation
@@ -37,6 +39,8 @@ def test_basic(tmp_path): | |||
if i % 10 == 0: | |||
print(bst.eval_train(), bst.eval_valid()) | |||
|
|||
assert train_data.get_feature_name() == feature_names |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current master
fails here with
BufferError: Allocated feature name buffer size (255) wasinferior to the needed size (1001).
BTW, I overlooked bad string construction in #4143 🙁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I overlooked bad string construction in #4143
ah!! I see now
required_string_vuffer_size.value
Sorry about that. I must have looked at that diff 10 times and never noticed that mistake (probably my mistake). I think we should consider adding pylint
in CI. I see this error running pylint python-package/ | grep vuffer
.
python-package/lightgbm/basic.py:3462:124: E0602: Undefined variable 'required_string_vuffer_size' (undefined-variable)
python-package/build/lib/lightgbm/basic.py:3460:124: E0602: Undefined variable 'required_string_vuffer_size' (undefined-variable)
I mentioned that last year and then forgot about it 😬 #3377 (comment). I should have written up an issue...I'll do that right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just created #4308
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow! Very nice catch! I haven't noticed this too 🙁 .
I was speaking about "no whitespace between words" issue.
BufferError: Allocated feature name buffer size (255) wasinferior to the needed size (1001).
------^------
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haha oh! Then two mistakes 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, thank you! The tests make sense to me, and I think these calls are correct. I got a bit confused about the difference between actual_*
, required_*
, and reserved_*
, but I can't think of better names.
I compared these changes to the equivalent code in c_api
and in R-package/src/lightgbm_R.cpp
, and they all seem to match (except for LGBM_BoosterGetFeatureNames()
, which is not used in the R package today).
This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this. |
Refer to #4256 (comment).