TPOTEnsemble idea #479
I made a hacky demo of the TPOTEnsemble idea in this commit. It seemed to work fine in my tests, although it gets much, much slower as the generations pass because, e.g., by generation 100 every pipeline is being evaluated in a VotingClassifier with 99 other pipelines. The only reasonable solution seems to be to store the predictions of each "best" pipeline from every generation and manually ensemble those predictions with the new predictions from the pipelines in the current generation. Of course, there will be no way around storing the entire pipeline list in a VotingClassifier for making new predictions with the fitted TPOT object.
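A minimal sketch of the caching idea described above, assuming a held-out validation split and soft voting by averaging predicted class probabilities. The names (`cached_probas`, `ensemble_fitness`, `commit_best`) are hypothetical, not TPOT internals:

```python
# Sketch (assumptions noted above): cache each past generation's best-pipeline
# probabilities and ensemble them with a candidate's probabilities by averaging,
# instead of refitting an ever-growing VotingClassifier every evaluation.
import numpy as np
from sklearn.metrics import accuracy_score

cached_probas = []  # one (n_samples, n_classes) array per past generation's best pipeline

def ensemble_fitness(candidate_pipeline, X_train, y_train, X_val, y_val):
    """Score a candidate as if it were soft-voted with all cached best pipelines."""
    candidate_pipeline.fit(X_train, y_train)
    candidate_proba = candidate_pipeline.predict_proba(X_val)
    avg_proba = np.mean(cached_probas + [candidate_proba], axis=0)  # soft voting
    return accuracy_score(y_val, avg_proba.argmax(axis=1))          # assumes integer labels 0..k-1

def commit_best(best_pipeline, X_val):
    """After a generation, cache the winning pipeline's validation probabilities."""
    cached_probas.append(best_pipeline.predict_proba(X_val))
```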
Check this out: scikit-learn/scikit-learn#8960. In the next release, scikit-learn is probably going to get an implementation of a stacking classifier, so TPOT might be able to search stacked ensembles the same way it searches pipelines.
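For reference, scikit-learn later shipped `StackingClassifier` (in version 0.22). A minimal sketch of stacking two illustrative pipelines under a logistic-regression meta-learner; the pipeline contents are assumptions, not anything TPOT would necessarily generate:

```python
# Illustrative only: two hand-written pipelines stacked with cross-validated
# out-of-fold predictions feeding a logistic-regression meta-learner.
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

stack = StackingClassifier(
    estimators=[
        ('pipe_a', make_pipeline(StandardScaler(), SVC(probability=True))),
        ('pipe_b', make_pipeline(RandomForestClassifier(n_estimators=100))),
    ],
    final_estimator=LogisticRegression(),
    cv=5,  # out-of-fold predictions train the meta-learner
)
# stack.fit(X_train, y_train); stack.predict(X_test)
```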
Awesome. I look forward to the next release, then!
An ensemble of pipelines would be a great improvement for TPOT!
@simonzcaiman, this is certainly something we should discuss now, before we move forward with an actual implementation of TPOTEnsemble. It seems like a good idea to allow different ensemble methods, but I only know of the ones in sklearn's VotingClassifier. Are there other ensemble methods (preferably with an sklearn-like interface) that we should be aware of?
Not sure if you should, but Sebastian has his own stacker here: https://rasbt.github.io/mlxtend/user_guide/regressor/StackingRegressor/
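A minimal usage sketch of the mlxtend stacker linked above; the base and meta estimators here are arbitrary choices for illustration:

```python
# Illustrative mlxtend StackingRegressor usage (estimator choices are arbitrary).
from mlxtend.regressor import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.svm import SVR

stack = StackingRegressor(
    regressors=[LinearRegression(), Ridge(alpha=1.0)],
    meta_regressor=SVR(kernel='rbf'),
)
# stack.fit(X_train, y_train); stack.predict(X_test)
```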
Dropping an idea here while it's on my mind: Maybe the original approach to TPOTEnsemble is not good because it requires too many expensive evaluations every generation. Perhaps a better approach would be similar to what @lacava does in FEW: at generation 0, evaluate every pipeline in the population, stack their outputs, fit a regularized linear model on the stacked outputs, and use each pipeline's coefficient as its fitness.
After the first generation, all pipelines with a coefficient of 0 will be removed from the TPOT ensemble. At generation 1 (and beyond), all pipelines in the new population will be added to the TPOT ensemble along with the surviving pipelines currently in the TPOT ensemble. Stack all of the outputs, fit a regularized linear model, and again use the coefficients as the fitness. Maybe something we can collaborate on, @lacava?
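A rough sketch of this FEW-style fitness idea, framed for regression for simplicity: stack each pipeline's validation predictions as columns, fit an L1-regularized linear model on top, and use coefficient magnitudes as per-pipeline fitness. The function name, the use of `Lasso`, and the train/validation split are assumptions:

```python
# Sketch (assumptions noted above): zero-coefficient pipelines would be culled
# from the ensemble after fitness assignment.
import numpy as np
from sklearn.linear_model import Lasso

def coefficient_fitness(pipelines, X_train, y_train, X_val, y_val, alpha=0.01):
    """Return one fitness value per pipeline: |coef| of a lasso fit on stacked outputs."""
    preds = []
    for pipe in pipelines:
        pipe.fit(X_train, y_train)
        preds.append(pipe.predict(X_val))
    Z = np.column_stack(preds)                # (n_samples, n_pipelines)
    meta = Lasso(alpha=alpha).fit(Z, y_val)   # regularized linear model on the stack
    return np.abs(meta.coef_)                 # fitness per pipeline; zeros get removed
```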
@rhiever Sounds like a good idea. You could use it with any method that admits some kind of feature score, e.g., lasso, random forests, etc., and perhaps even with stacking, if stacking can be made to score the models it uses in its ensemble.
Another strategy would be to use a random forest and use its importance weights as the fitness.
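A variant of the sketch above along those lines: score the stacked prediction matrix with a forest and use its feature importances as the per-pipeline fitness. Again, the framing and names are assumptions:

```python
# Illustrative variant: `Z` is the (n_samples, n_pipelines) stacked-prediction
# matrix built as in the previous sketch.
from sklearn.ensemble import RandomForestRegressor

def importance_fitness(Z, y_val, n_estimators=200):
    """Return one importance value per pipeline column of Z."""
    forest = RandomForestRegressor(n_estimators=n_estimators).fit(Z, y_val)
    return forest.feature_importances_
```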
Many people have been asking for a version of TPOT that creates ensembles of pipelines, as that's what often wins Kaggle competitions etc. We've created prototypes of TPOT that ensemble the Pareto front or final population, but those prototypes didn't work so well because TPOT pipelines are optimized to perform well on a dataset by themselves. In other words, there is no pressure from TPOT to create pipelines that work well with other pipelines.
Here's my proposal for allowing TPOT to create ensembles of pipelines: What if we treated the TPOT optimization procedure as a sort of boosting procedure? It could work as follows: in the first generation, TPOT evaluates pipelines as usual and the best pipeline is added to the ensemble. In every generation after that, each candidate pipeline is evaluated inside a VotingClassifier together with the best pipelines from the previous generations, and that ensemble's score becomes the candidate's fitness. At the end of each generation, the best candidate is added to the ensemble.
That way, TPOT is directly optimizing for pipelines that ensemble well with the previously-best pipelines, and the final ensemble is composed of one pipeline from each generation. Is this idea crazy enough to work?
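A hedged sketch of how this boosting-style loop could look, with candidates scored by soft-voting their validation probabilities together with the pipelines already committed to the ensemble; all names here are illustrative, not TPOT internals:

```python
# Sketch (assumptions noted above): one pipeline is committed per generation,
# and a candidate's fitness is the score of the committed ensemble plus itself.
import numpy as np
from sklearn.metrics import accuracy_score

def run_boosted_search(generations_of_candidates, X_train, y_train, X_val, y_val):
    committed = []         # one pipeline committed per generation
    committed_probas = []  # their cached validation probabilities

    for candidates in generations_of_candidates:
        scores = []
        for pipe in candidates:
            pipe.fit(X_train, y_train)
            proba = pipe.predict_proba(X_val)
            avg = np.mean(committed_probas + [proba], axis=0)  # soft vote with the ensemble
            scores.append(accuracy_score(y_val, avg.argmax(axis=1)))
        best = int(np.argmax(scores))
        committed.append(candidates[best])
        committed_probas.append(candidates[best].predict_proba(X_val))
    return committed  # the final ensemble: one pipeline from each generation
```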