Linear decision trees improvements #60

Merged 63 commits on Dec 6, 2020

@bytesnake (Member) commented Nov 20, 2020

This PR continues the work of #43

  • Rebase to master branch
  • Use new traits for linfa-trees
  • Add more tests to linfa-trees
  • Add function to generate tikz code describing the tree

@mossbanay
Contributor

I have exams this week and next but I'm interested in helping out with this afterward. There are only a few changes left mentioned in the previous PR.

 * use toy test from sklearn
 * use four perfectly separable uniform blobs
This hyper-parameter can be estimated from the input data and is
therefore unnecessary in the API.
Contributor

@mossbanay mossbanay left a comment

Tests & lints want a review?

Contributor

@mossbanay mossbanay left a comment

Test & lints want an approval?

@bytesnake bytesnake changed the title from "[WIP] Trees and ensemble algorithms" to "Linear decision trees improvements" Dec 5, 2020
@mossbanay
Contributor

For looking at random forests (which I think should be in a separate PR to these improvements) we can either:

  1. Pass in a mask over the features
  2. Generate datasets with the features masked out

It's not immediately clear to me which would be better in the long term, so I suggest we pursue (2) for the moment since it doesn't clutter the API.
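
A minimal sketch of the two options, using plain `Vec`s rather than linfa's dataset types. The names `Records`, `allowed_features` and `mask_out_features` are hypothetical and only illustrate the trade-off: option 1 changes what the split search has to accept, while option 2 leaves the tree API alone and changes the data instead.

```rust
/// A tiny row-major dataset: each inner Vec holds one sample's feature values.
/// Hypothetical stand-in, not linfa's dataset type.
type Records = Vec<Vec<f64>>;

/// Option 1: keep the data untouched and pass a boolean mask over the features.
/// Returns the indices of the features a split search would be allowed to use.
fn allowed_features(mask: &[bool]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter(|&(_, &keep)| keep)
        .map(|(idx, _)| idx)
        .collect()
}

/// Option 2: build a new dataset that contains only the selected feature columns.
/// The tree code then needs no knowledge of the mask at all.
fn mask_out_features(records: &Records, mask: &[bool]) -> Records {
    records
        .iter()
        .map(|row| {
            row.iter()
                .zip(mask)
                .filter(|&(_, &keep)| keep)
                .map(|(&value, _)| value)
                .collect()
        })
        .collect()
}
```

With option 2 the column indices seen by the tree are relative to the reduced dataset, so a random forest built this way would have to remember the mask per tree to map them back.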

@bytesnake
Member Author

they were moved to #66

@codecov-io

codecov-io commented Dec 6, 2020

Codecov Report

Merging #60 (5fd321a) into master (a3eede5) will decrease coverage by 0.56%.
The diff coverage is 2.39%.

@@            Coverage Diff            @@
##           master     #60      +/-   ##
=========================================
- Coverage    9.97%   9.40%   -0.57%     
=========================================
  Files          47      49       +2     
  Lines        2507    2657     +150     
=========================================
  Hits          250     250              
- Misses       2257    2407     +150     
Impacted Files                                       Coverage Δ
linfa-logistic/src/lib.rs                            0.00% <0.00%> (ø)
linfa-svm/src/classification.rs                      0.00% <ø> (ø)
linfa-trees/src/decision_trees/algorithm.rs          0.00% <0.00%> (ø)
linfa-trees/src/decision_trees/hyperparameters.rs    0.00% <0.00%> (ø)
linfa-trees/src/decision_trees/iter.rs               0.00% <0.00%> (ø)
linfa-trees/src/decision_trees/tikz.rs               0.00% <0.00%> (ø)
src/dataset/mod.rs                                   50.00% <ø> (ø)
src/dataset/impl_dataset.rs                          9.37% <6.66%> (-7.30%) ⬇️
src/metrics_classification.rs                        77.41% <66.66%> (ø)
src/dataset/impl_targets.rs                          44.44% <100.00%> (ø)
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
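
The percentages in the report follow from the hits and line counts. The sketch below recomputes them; it assumes the displayed values are rounded down to two decimals, which is an assumption and not something stated in this thread. The function names are hypothetical.

```rust
/// Exact (unrounded) coverage in percent.
fn exact_percent(hits: u32, lines: u32) -> f64 {
    100.0 * f64::from(hits) / f64::from(lines)
}

/// Coverage in percent, rounded down to two decimals.
/// Assumption: the report truncates instead of rounding to nearest.
fn displayed_percent(hits: u32, lines: u32) -> f64 {
    (exact_percent(hits, lines) * 100.0).floor() / 100.0
}
```

With the figures above (250 hits over 2507 lines before, 2657 lines after), this gives 9.97 and 9.40. The unrounded difference is roughly 0.563, which would be consistent with the 0.56% in the summary line, while subtracting the two displayed values gives the 0.57 shown in the table.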

 * introduce node iterator
 * rewrite `max_depth`, `num_leaves`, `features` in iterator syntax
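
A rough sketch of what a node iterator and the three helpers could look like when written as iterator chains. The `Node` enum and `NodeIter` here are simplified, hypothetical types and not the ones in linfa-trees; only the names `max_depth`, `num_leaves` and `features` come from the commit message.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified tree node; not the type used in linfa-trees.
enum Node {
    Leaf,
    Split {
        feature_idx: usize,
        left: Box<Node>,
        right: Box<Node>,
    },
}

/// Depth-first iterator yielding every node together with its depth (root = 0).
struct NodeIter<'a> {
    stack: Vec<(&'a Node, usize)>,
}

impl Node {
    fn iter(&self) -> NodeIter<'_> {
        NodeIter { stack: vec![(self, 0)] }
    }
}

impl<'a> Iterator for NodeIter<'a> {
    type Item = (&'a Node, usize);

    fn next(&mut self) -> Option<Self::Item> {
        let (node, depth) = self.stack.pop()?;
        if let Node::Split { left, right, .. } = node {
            // Push right first so the left subtree is visited first.
            self.stack.push((&**right, depth + 1));
            self.stack.push((&**left, depth + 1));
        }
        Some((node, depth))
    }
}

/// Depth of the deepest node, with the root at depth 0.
fn max_depth(root: &Node) -> usize {
    root.iter().map(|(_, depth)| depth).max().unwrap_or(0)
}

/// Number of leaf nodes in the tree.
fn num_leaves(root: &Node) -> usize {
    root.iter()
        .filter(|(node, _)| matches!(node, Node::Leaf))
        .count()
}

/// Sorted, de-duplicated indices of all features used in a split.
fn features(root: &Node) -> Vec<usize> {
    root.iter()
        .filter_map(|(node, _)| match node {
            Node::Split { feature_idx, .. } => Some(*feature_idx),
            Node::Leaf => None,
        })
        .collect::<BTreeSet<_>>()
        .into_iter()
        .collect()
}
```

Once the traversal lives in one iterator, each statistic is a short chain over it instead of its own recursive function.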
Contributor

@mossbanay mossbanay left a comment

Happy to merge this then continue on with RFs in that other PR

@bytesnake
Member Author

Awesome 👍 I will just write a quick function that generates a TikZ-styled tree (for example here) and then merge.

@bytesnake
Member Author

Output from the example: this is not as pretty as it could be, but it is sufficient for now.
[image: decision-tree]
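
For readers who have not seen TikZ output generated from a tree, here is a minimal sketch of the idea. It is not the code added in `tikz.rs`; the `Node` type, both function names and the choice of TikZ's plain `node ... child {...}` syntax are assumptions made for illustration.

```rust
/// Hypothetical, simplified tree node; not the type used in linfa-trees.
enum Node {
    Leaf { class: usize },
    Split {
        feature_idx: usize,
        threshold: f64,
        left: Box<Node>,
        right: Box<Node>,
    },
}

/// Render a subtree in TikZ's `node {..} child {..}` tree syntax.
fn node_to_tikz(node: &Node) -> String {
    match node {
        Node::Leaf { class } => format!("node {{class {}}}", class),
        Node::Split { feature_idx, threshold, left, right } => format!(
            "node {{$x_{{{}}} \\leq {}$}} child {{{}}} child {{{}}}",
            feature_idx,
            threshold,
            node_to_tikz(left),
            node_to_tikz(right)
        ),
    }
}

/// Wrap the rendered tree in a `tikzpicture` environment.
fn tree_to_tikz(root: &Node) -> String {
    format!(
        "\\begin{{tikzpicture}}\n\\{};\n\\end{{tikzpicture}}",
        node_to_tikz(root)
    )
}
```

The returned string can be pasted into a LaTeX document that loads the `tikz` package; styling such as sibling distance and node shapes is left out here.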

@bytesnake bytesnake merged commit bfa5aeb into rust-ml:master Dec 6, 2020
@mossbanay
Contributor

Nice work! That looks great. One of the more premium features in ML libraries for sure.
