Parallelize cross validation as a provisional optimization #302
Comments
Thank you for sharing this nice tip. Based on the User Guide of `cross_val_score` from scikit-learn:
Indeed, that would be more precise. I proposed making `n_jobs == num_cv_folds` since the default number of CV folds in TPOT is 3, and most machines used for machine learning have more than 3 cores. Just to make @minimumnz feel better about not having idle cores [1] ;-)

[1] #177
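(For reference, a minimal standalone sketch of the idea, not TPOT code: the classifier and dataset below are placeholders, and it assumes scikit-learn's current `model_selection` API.)

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

num_cv_folds = 3  # TPOT's default number of CV folds
# One worker per fold: each fold is fit and scored in its own process,
# so no core sits idle while the other folds finish.
scores = cross_val_score(DecisionTreeClassifier(), X, y,
                         cv=num_cv_folds, n_jobs=num_cv_folds)
print(scores)
```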
We've been talking about adding an `n_jobs` parameter to TPOT.
Would it not be better to use the multiprocessing capabilities of DEAP? That is, each combination (preprocessor, algorithm, postprocessor, etc.) is an individual in TPOT's population of candidate pipelines, so exploiting DEAP's multiprocessing feature would let TPOT parallelize by evaluating different individuals on different cores.
We looked into using the multiprocessing capabilities of DEAP, but ran into issues with pickling lambda functions and a few other tricks we use in TPOT. Maybe @weixuanfu2016 can provide full details. In the meantime, I've merged the PR into the development branch.
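(For context, a sketch of the DEAP pattern under discussion, following the multiprocessing example in the DEAP docs, plus a small demonstration of the pickling problem; the fitness setup here is a generic placeholder, not TPOT's.)

```python
import multiprocessing
import pickle

from deap import base, creator

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()

if __name__ == "__main__":
    # DEAP parallelizes by swapping its internal map for a Pool's map,
    # so each individual's evaluation can run on a separate core.
    pool = multiprocessing.Pool()
    toolbox.register("map", pool.map)

    # The catch: multiprocessing pickles the evaluation function and the
    # individuals to ship them to worker processes, and lambdas (which
    # TPOT uses internally) cannot be pickled.
    try:
        pickle.dumps(lambda ind: (sum(ind),))
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        print("can't pickle a lambda:", exc)
```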
I propose setting the parameter `n_jobs` to `num_cv_folds` to get a quick form of parallelism. When better solutions with Dask are implemented, we could set it back to 1.
In base.py, in the `_evaluate_individual` method (line 575), change

```python
cv_scores = cross_val_score(sklearn_pipeline, features, classes,
                            cv=self.num_cv_folds,
                            scoring=self.scoring_function)
```

to

```python
cv_scores = cross_val_score(sklearn_pipeline, features, classes,
                            cv=self.num_cv_folds,
                            scoring=self.scoring_function,
                            n_jobs=self.num_cv_folds)
```
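(As a rough sanity check of the proposal, a hedged sketch comparing serial and per-fold-parallel scoring; the model, data sizes, and timings below are placeholders and will vary by machine.)

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=5000, n_features=50, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0)
num_cv_folds = 3

# Compare wall-clock time for serial vs. one-worker-per-fold scoring.
for n_jobs in (1, num_cv_folds):
    start = time.time()
    cross_val_score(model, X, y, cv=num_cv_folds, n_jobs=n_jobs)
    print(f"n_jobs={n_jobs}: {time.time() - start:.1f}s")
```

One caveat with this stopgap: the speedup is capped at `num_cv_folds` (3 by default), and each pipeline evaluation can spin up its own worker pool, which is why it only makes sense as a provisional optimization.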