Version 0.5.0 checklist #178

Closed
14 tasks
MaxHalford opened this issue Nov 11, 2019 · 1 comment

MaxHalford commented Nov 11, 2019

Here's a list of the main issues I would like to see solved before we make the release:

  • Word embeddings; I think we can mark this as solved once we've implemented at least one word embedding algorithm. GloVe seems like a good candidate.
  • Batch windows for semi-incremental learning; I really like the comment by @johny-c and it would be great if we could implement an API for repeating observations when reading from a stream.
  • FMRegressor
  • Focal loss
  • Bayesian linear regression; I have to take care of this. Basically I only have to figure out how to produce prediction intervals and we're good to go.
  • Gradient boosting; @raphaelsty has already implemented AdaBoost, but it would be nice to check if other implementations work better.
  • Poisson regression; I've tried implementing the Poisson loss, but I didn't find a dataset on which it worked better than squared loss. Note that this is implemented in tf.keras.losses.
  • Exponential smoothing
  • Birch clustering; a fair number of people have expressed interest in having more online clustering solutions (k-means just isn't good enough).
  • Online SVM; ALMA seems to be a good first method to implement. The main thing is that I'm not sure about the API: should this be an optimizer or a separate class? It's fine if we don't solve this issue for the release, but I would like to do some brainstorming on the API.
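For the Poisson regression item, here is a minimal sketch of the loss in plain Python, so it can be compared against squared loss on a candidate dataset. It uses the form `y_pred - y_true * log(y_pred)`, which is the one used by the Keras Poisson loss. The function names and the `eps` parameter are hypothetical and are not part of any existing API in this project.

```python
import math


def poisson_loss(y_true, y_pred, eps=1e-7):
    """Poisson loss (up to a term that only depends on y_true).

    y_pred is the predicted rate and must be non-negative.
    eps is a hypothetical smoothing constant that avoids log(0).
    """
    return y_pred - y_true * math.log(y_pred + eps)


def poisson_gradient(y_true, y_pred, eps=1e-7):
    """Derivative of poisson_loss with respect to y_pred."""
    return 1.0 - y_true / (y_pred + eps)


def squared_loss(y_true, y_pred):
    """Squared loss, included only as the baseline for comparison."""
    return (y_pred - y_true) ** 2


if __name__ == "__main__":
    # Both losses are smallest when the prediction matches the observed count.
    for y_pred in (2.0, 3.0, 4.0):
        print(y_pred, poisson_loss(3, y_pred), squared_loss(3, y_pred))
```

Both losses share the same minimiser, so any difference in practice would come from how they weight errors: the Poisson loss penalises under-prediction of large counts more heavily than over-prediction.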

There are also some issues related to docs and tests which need to be solved:

Naturally this doesn't mean other issues can't be worked on too!

@MaxHalford

Closing this because using milestones seems to be the right way to group issues.

MaxHalford pushed a commit that referenced this issue Jul 10, 2020
* Implements the naming convention defined in Issue #138 and updates legacy demo code accordingly

* Apply minor improvements to MultiOutputLearner
    - Update documentation to reflect the fact that this estimator is task-agnostic and supports both classification and regression.
    - Add estimator type check
    - Add regression task test

* Fixes #111

* Update documentation of the Hoeffding Tree regressor

* Update test for Very Fast Decision Rules

* Update docstring in learning node for regression to indicate the correct tree type

* Add directive to skip stubs for old names in coverage

* Fix bug in test_pipeline.py

* Add naming convention for new methods in CONTRIBUTING.md file