Feature Requests & Voting Hub #2302
Comments
Add support for early stopping in Dask interface #3712
Add Earth Mover Distance as objective metric to be optimized (maximized) #1256
Apache Arrow seems to be gaining a lot of traction in the dataframe space.
Conan installation support #5770
Add support for Multi-output regression #524
Provide access to the bin ids and bin upper bounds of the constructed dataset #5191
Consider implementing the sketchboost algorithm for the multi-output/multiclass setting. The current multiclass approach is highly inefficient because a separate tree structure is required for each class. Sketchboost significantly improves training time and model size by allowing a single tree structure to handle many classes. It is already implemented in the Py-Boost library.
I am currently working on Apache Arrow support and will likely open a PR next week :) Update: Implementation in #6022
WebAssembly support (#5372)
Support monotone constraints with quantile objective #3371
Recalculate feature importance during the update process of a tree model / Calculate Gain Importance on Test Data (#2413)
Add support for CRLF line endings or improve documentation and error message #5508
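To make the model-size argument in the sketchboost comment above concrete, here is a rough back-of-the-envelope sketch. It only counts tree structures and assumes one tree per class per boosting round for the current multiclass approach, versus one shared tree per round for a multi-output approach. The function names are hypothetical and are not part of LightGBM or Py-Boost. The count also ignores that a shared tree stores a vector of outputs per leaf, so the real size saving is smaller than the ratio shown.

```python
def trees_one_per_class(num_rounds: int, num_classes: int) -> int:
    """Tree structures built when every boosting round fits one tree per class."""
    return num_rounds * num_classes


def trees_shared_structure(num_rounds: int) -> int:
    """Tree structures built when one tree per round outputs a value for every class."""
    return num_rounds


# Illustrative numbers only (assumed, not taken from any benchmark).
rounds, classes = 100, 50
per_class = trees_one_per_class(rounds, classes)
shared = trees_shared_structure(rounds)
ratio = per_class // shared
print(f"one tree per class: {per_class} trees, shared structure: {shared} trees")
```

Under these assumptions the number of tree structures grows linearly with the number of classes in the first case and stays constant in the second.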
This issue tracks all feature requests on one page.
Note to contributors: If you want to work on a requested feature, re-open the linked issue. Everyone is welcome to work on any of the issues below.
Note to maintainers: All feature requests should be consolidated on this page. When new feature request issues are opened, close them and add entries for them, with links to the issues, on this page. The one exception is issues marked good first issue; these should be left open so they are discoverable by new contributors.
Call for Voting
We would like to call for votes here to prioritize these requests.
If a feature request is important to you, you can vote for it using the following process:
Discussions
Efficiency related
Effectiveness related
Warn when passing labels with missing values (#4483)
Distributed platform and GPU (OpenCL-based and CUDA)
Maintenance
Unify out results of LGBM_BoosterDumpModel and LGBM_BoosterSaveModel (#2604)
Publish lib_lightgbm.dll symbols to Microsoft Symbols Server (#1725)
Prediction on Dataset: [R-package] Add the ability to predict on lgb.Dataset in Predictor$predict() (#2666); Being able to do Prediction (task=prediction) on bin files. (#6613); [python-package] How do I use lgb.Dataset() with lgb.Predict() without using pandas df or np array? (#6285)
Refactor CMakeLists.txt so that it will be possible to build cpp tests with different options, e.g. with OpenMP support (#4125)
Python package:
scikit-learn compatibility (HistGradientBoosting): The sklearn wrapper is not really compatible with the sklearn ecosystem (#2966); [RFC] compatibility with scikit-learn (#2628)
Replace POINTER() by byref() in Python interface to pass data arrays (#4298)
Staged predict function, staged_predict(), as in the scikit-learn API (#5031)
[python-package] make Dataset pickleable (#5098)
[python-package] Adding support for polars for input data (#6204)
[python-package] Support feature_names_in_ attribute and related APIs via sklearn API (#6279)
[python] Migrate to parametrize_with_checks for scikit-learn integration tests (#2947)
R package:
[R-package] lgb.convert_with_rules() should validate rules (#2682)
Load back saved parameters with save_model to Booster object (#2613)
[R-package] [ci] Add a CI job testing the R package with rchk (#4400)
[R-package] use commandArgs instead of hardcoded stuff in the installation script (#2441)
[R-package] lgb.convert() functions should convert columns of type 'logical' (#2678)
[R-package] lgb.convert functions should warn on unconverted columns of unsupported types (#2681)
[R-package] lgb.prepare() and lgb.prepare2() should be simplified (#2683)
[R-package] lgb.prepare_rules() and lgb.prepare_rules2() should be simplified (#2684)
[R-package] Remove lgb.prepare() and lgb.prepare_rules() (#3075)
New features
Allow LightGBM to be easily used in external projects via modern CMake style with find_package and target_link_libraries (#4067); fatal error: ../../../external_libs/fmt/include/fmt/format.h: No such file or directory (#3925)
min_child_samples plays bad with weights (#5236)
Booster.refit() fails when the booster used a custom objective function fobj (#5609)
New algorithms:
Objective and metric functions:
Python package:
Relax constraint on logger class in register_logger (#4783)
Dask:
[dask] allow customization of num_threads (#3714)
[dask] Support init_model (#4063)
[dask] Add LGBMModel (#3845)
[dask] add train() function (#3846)
[dask] add cv() function (#3847)
[dask] add a DaskDataset (#3944)
[dask] preserve chunks in results of multi-class pred_contrib predictions on sparse matrices (#4438)
[dask] Result shape from DaskLGBMClassifier.predict(pred_contrib=True) for CSC matrices is inconsistent with LGBMClassifier (#3881)
[dask] support 'raw_score' in predict() (#3793)
[dask] support init_score (#3807)
[dask] support 'pred_leaf' in predict() (#3792)
[dask] support 'pred_contrib' in predict() (#3713)
Support DataTable in Dask (#3830)
R package:
[R-package] add support for specifying training indices in lgb.cv() (#3924)
[R-package] Check parameters in cb.reset.parameters() (#2665)
[R-package] Add the ability to predict on lgb.Dataset in Predictor$predict() (#2666)
[docs] [R-package] upgrade R documentation to {pkgdown} 2.0 (#4859)
[R-package] add flag of displaying train loss for lgb.cv() (#4911)
[R-package] Request: work with R serialization functions readRDS() and saveRDS() (#4296)
New language wrappers:
Input enhancements:
ChunkedArray in C API: [feature] Streaming data allocation (#3995); [SWIG] Add streaming data support + cpp tests (#3997 (comment))
implement datatable ingest directly into lightgbm, not through the to_numpy() method as it currently is (#2003)