Add statistics to the plot from a dataset, its labels and predictions
Motivation
lgb.create_tree_digraph and plot_tree are really great for visualizing a tree, but without statistics attached they are less useful.
Description
Be able to plot a tree with the results from a csv or bin dataset (e.g. show_info=["statistics"]).
I would pass a huge dataset bin file (with the right labels), and either LGBM does a prediction or I can pass a prediction file for those rows. The plot would then add, for each leaf:
the number of instances that hit this leaf (correctly and incorrectly classified)
the number of correctly classified instances that hit this leaf
and the percentage "correctly classified / (correctly and incorrectly classified)"
I guess there is no easy way to combine those trees to represent the classes and the iterations...
Later (lower priority), being able to generate this from CLI would be nice too.
Thanks
--w
wil70 changed the title from "Add statistics to the plot from a dataset, its labels and predictions" to "[Feature] Add statistics to the plot from a dataset, its labels and predictions" on Aug 16, 2024
Thanks for using LightGBM, and for your interest in improving model inspection!
I'm not convinced that this is so generically useful that it should be in lightgbm directly, with all the documentation, testing, and maintenance work that entails. It sounds like this would be a better fit for your own custom code, and lightgbm (the Python package for LightGBM) provides all the core APIs to get the input data for it:
.dump_model() or .trees_to_dataframe() for the tree structure
predict(pred_leaf=True) to get leaf indices (which could be aggregated by leaf index to get those counts)
predict() to get the actual predictions (which could be used to compute the "% correctly classified" you refer to)
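As a rough illustration of how the calls listed above could be combined in user code, here is a minimal sketch. The function names leaf_stats and leaf_stats_for_tree are hypothetical, not part of lightgbm. It assumes a binary objective (so predict() returns positive-class probabilities), a 0.5 decision threshold, and that predict(pred_leaf=True) returns one row per sample with one column per tree. The correctness counts use the full model's predictions, not the output of the single tree being inspected.

```python
from collections import defaultdict


def leaf_stats(leaf_ids, y_true, y_pred):
    """Per-leaf counts: rows reaching the leaf, rows predicted correctly, and their ratio."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for leaf, truth, pred in zip(leaf_ids, y_true, y_pred):
        totals[leaf] += 1
        correct[leaf] += int(truth == pred)
    return {
        leaf: {"count": n, "correct": correct[leaf], "pct_correct": correct[leaf] / n}
        for leaf, n in sorted(totals.items())
    }


def leaf_stats_for_tree(booster, X, y_true, tree_index=0, threshold=0.5):
    """Hypothetical helper: statistics for one tree of a trained binary-classification Booster."""
    # pred_leaf=True gives, for each row, the index of the leaf reached in each tree
    # (assumed here to be a 2-D array of shape n_rows x n_trees).
    leaves = booster.predict(X, pred_leaf=True)
    leaf_ids = [int(row[tree_index]) for row in leaves]
    # For a binary objective, predict() returns the probability of the positive class.
    y_pred = [int(p >= threshold) for p in booster.predict(X)]
    return leaf_stats(leaf_ids, y_true, y_pred)
```

The resulting dictionary is keyed by leaf index within the chosen tree, so it could be matched against the leaf rows from .trees_to_dataframe() or the structure from .dump_model() when labelling a custom plot. A multiclass model would need the class-to-tree mapping handled separately, which this sketch does not attempt.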
If you want to see something like this in lightgbm, the best way to make that happen is to implement it yourself and try to contribute it. If you do that, please be ready to work with us on testing, documentation, etc. There is very limited maintainer availability in this project (as you may have noticed), and situations like #5488 take some of that maintainer availability away from other parts of the project.