tpot freezing jupyter notebook #645
Please check the alternative solution for this issue.
Hello. After inserting (in the first cell of the notebook)
I then get quite a big error that ends with:
Could it be a notebook-related error?
I should add that I executed the same code in a

```python
import multiprocessing

import numpy as np
import sklearn.metrics
import tpot

if __name__ == '__main__':
    X_train = np.random.random((1000, 10))
    y_train = np.random.random(1000) + 10

    # Custom root-mean-squared-log-error metric
    def RMSLE(p, a):
        return np.sqrt(np.mean((np.log(p + 1) - np.log(a + 1)) ** 2))

    rmsle_score = sklearn.metrics.make_scorer(RMSLE, greater_is_better=False)

    reg1 = tpot.TPOTRegressor(verbosity=2,
                              n_jobs=-1,
                              scoring=rmsle_score,
                              cv=10,
                              max_time_mins=2)
    reg1.fit(X_train, y_train)
```
I rechecked the issue. I think there is a bug in the new scoring API. Check PR #626. Try reinstalling TPOT with this fix via the command below.
I think your original code works without resetting the start mode in
Hmm, now I think it is a notebook-related issue, and it is also related to the scoring API for customized scoring functions. I will look into it and refine the API. Thank you for reporting this issue here.
I had another look at this issue. I think it is related to whether the customized scorer is picklable for parallel computing with joblib. I can reproduce this issue using GridSearchCV from sklearn instead of tpot (examples below). It seems that the scorer is somehow not picklable.
Maybe it is an issue for sklearn's repo.
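For context, a minimal sketch (not from the thread; the helper names are made up for illustration) of the pickling failure described above: a scorer built from a module-level function pickles fine, while one built from a locally defined function does not, which is exactly what process-based parallel backends trip over.

```python
import pickle

import numpy as np
from sklearn.metrics import make_scorer


def rmsle(p, a):
    # Module-level metric: picklable by the stdlib pickle module
    return np.sqrt(np.mean((np.log(p + 1) - np.log(a + 1)) ** 2))


def build_local_scorer():
    # Locally defined metric: NOT picklable by the stdlib pickle module,
    # similar to a function defined inside a notebook cell's local scope
    def local_rmsle(p, a):
        return np.sqrt(np.mean((np.log(p + 1) - np.log(a + 1)) ** 2))
    return make_scorer(local_rmsle, greater_is_better=False)


global_scorer = make_scorer(rmsle, greater_is_better=False)
pickle.dumps(global_scorer)  # succeeds

try:
    pickle.dumps(build_local_scorer())
except Exception as exc:
    print("cannot pickle local scorer:", exc)
```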
I see it freezing after ~3 generations, with and without forkserver, for different scorers. A workaround seems to be setting backend='threading' as the default kwarg for Parallel in sklearn/externals/joblib/parallel.py.
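Rather than patching the vendored sklearn/externals/joblib/parallel.py, a less invasive sketch of the same workaround (assuming a standalone joblib installation is available) is to force the thread-based backend through joblib's context manager:

```python
from joblib import Parallel, delayed, parallel_backend


def square(x):
    return x * x


# Force the thread-based backend so nothing has to be pickled across
# processes. Note: threads share the GIL, so CPU-bound scoring may be
# slower than with a process backend.
with parallel_backend('threading', n_jobs=4):
    results = Parallel()(delayed(square)(i) for i in range(8))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```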
I found that this happens when setting
I think this issue is notebook-only. I will try to find a workaround for it.
@HamedMP, have you tried running TPOT with |
Not actually. Looking into the other comments, though, importing `multiprocessing` helped to run the training, although the time to update the progress bar is very long (maybe it waits for all the processes to finish benchmarking or something). For example, for 5 minutes it shows 0 models, then it goes to 1%, 10%, …
> On Jun 12, 2018, 5:21 PM +0200, Randy Olson wrote:
> @HamedMP, have you tried running it with n_jobs!=1 on the command line?
This still happens when I try to add the fix. This

```python
import multiprocessing
from tpot import TPOTRegressor
multiprocessing.set_start_method('forkserver')

if __name__ == '__main__':
    # my code
```

returns

```
Traceback (most recent call last):
  File "test_tpot_santander.py", line 3, in <module>
    multiprocessing.set_start_method('forkserver')
  File "/Users/davidbuchaca1/anaconda3/lib/python3.6/multiprocessing/context.py", line 242, in set_start_method
    raise RuntimeError('context has already been set')
RuntimeError: context has already been set
```

Nevertheless

```python
import multiprocessing
multiprocessing.set_start_method('forkserver')
from tpot import TPOTRegressor

if __name__ == '__main__':
    # my code
```

does not return any error, but the same behaviour occurs. Nothing happens (even though the CPU goes to 100% on all threads for a long time). Probably there is something weird with my multiprocessing setup on OSX, because on Ubuntu it works fine.
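As a side note (not from the thread): the RuntimeError above is raised whenever the start method is set after something else has already fixed the multiprocessing context, which is why moving `set_start_method` before the tpot import avoids it. A small standard-library-only sketch of that behaviour, including the `force=True` escape hatch:

```python
import multiprocessing

# The first call fixes the global context; 'spawn' stands in for
# whatever an earlier import may have chosen.
multiprocessing.set_start_method('spawn')

# A second plain call reproduces the traceback above.
try:
    multiprocessing.set_start_method('forkserver')
except RuntimeError as exc:
    print(exc)  # context has already been set

# force=True overrides the already-set context.
multiprocessing.set_start_method('forkserver', force=True)
print(multiprocessing.get_start_method())  # forkserver
```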
Setting the
@jaksmid did it work with |
OSX machine
Someone, please commit some example code. What's the point if this cannot be scaled? I have tried every combination of fork settings and nothing works. I have 32 processors, and there is no progress after 30 minutes at verbosity 3. Garbage. I wasted 4 hours trying to get this to do anything with more than 1 CPU.
It is a documented open issue. We are trying to use the Dask backend to solve it. Related to #730.
Hi,
When training a regressor with n_jobs=-1, the process freezes after a minute. All resources go to 0 and the algorithm never stops.
Process to reproduce the issue:
I get the following:
And the process freezes even though `max_time_mins=3`.