
TPOT stuck at 0% #542

Closed
Nirvana2211 opened this issue Aug 5, 2017 · 11 comments

@Nirvana2211

I am using TPOTClassifier on a smallish dataset with 20,000 rows and 68 features. I ran the following code:

import numpy as np
from tpot import TPOTClassifier

pipeline_optimizer = TPOTClassifier(generations=5, population_size=20, cv=5,
                                    random_state=0, verbosity=2, n_jobs=10)
X_train = np.nan_to_num(X_train)
pipeline_optimizer.fit(X_train, dataY_train)
Warning: Although parallelization is currently supported in TPOT for Windows, pressing Ctrl+C will freeze the optimization process without saving the best pipeline! Thus, Please DO NOT press Ctrl+C during the optimization procss if n_jobs is not equal to 1. For quick test in Windows, please set n_jobs to 1 for saving the best pipeline in the middle of the optimization process via Ctrl+C.
Optimization Progress: 0%| | 0/120 [00:00<?, ?pipeline/s]

The optimization process has been stuck at 0% for the last 14 hours. Is this normal? Any help would be appreciated. Thank you!

@weixuanfu
Contributor

I think this is related to #508. Please try running TPOT like the demo below:

import multiprocessing

if __name__ == '__main__':
    multiprocessing.set_start_method('forkserver')
    # Note: the sklearn/tpot imports must be moved inside __main__;
    # otherwise a RuntimeError ("context has already been set") is raised.
    import numpy as np
    from sklearn.datasets import make_classification
    from tpot import TPOTClassifier
    # your TPOT code
    pipeline_optimizer = TPOTClassifier(generations=5, population_size=20, cv=5,
                                        random_state=0, verbosity=2, n_jobs=10)
    X_train = np.nan_to_num(X_train)
    pipeline_optimizer.fit(X_train, dataY_train)

Please let me know if this solves the issue.
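For context, a minimal stdlib-only sketch of the start-method behaviour the demo relies on (this is generic Python, not TPOT code): 'forkserver' is only available on POSIX systems, and calling set_start_method a second time raises the "context has already been set" RuntimeError unless force=True is passed.

```python
import multiprocessing

# 'forkserver' only appears on POSIX systems; Windows offers 'spawn' alone.
available = multiprocessing.get_all_start_methods()

if __name__ == '__main__':
    method = 'forkserver' if 'forkserver' in available else 'spawn'
    # Calling set_start_method twice normally raises
    # "RuntimeError: context has already been set"; force=True avoids that.
    multiprocessing.set_start_method(method, force=True)
    print(multiprocessing.get_start_method())
```

Checking `multiprocessing.get_all_start_methods()` first is a portable way to fall back to 'spawn' on platforms where 'forkserver' is unavailable.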

@Nirvana2211
Author

Nirvana2211 commented Aug 5, 2017

@weixuanfu Thank you for the prompt reply. I should have mentioned that I am using Windows; sorry about that. 'forkserver' doesn't work on Windows. How can I make it work there? I have also set n_jobs=1, and even that doesn't seem to work. Thanks again!

@weixuanfu
Contributor

Oh, I just saw that. Decreasing n_jobs to 1 may help on Windows, or you could try the latest dev branch, which has better timeout control. If neither solution works, please let me know and include more environment info (the versions of TPOT and its dependencies); I will need to double-check it.

@Nirvana2211
Author

Nirvana2211 commented Aug 6, 2017

@weixuanfu n_jobs=1 worked for Windows. I am also running it on a Linux box with n_jobs=20, and it seems to be working there.

@deo1

deo1 commented Sep 17, 2017

I have the exact same issue, running on Windows. Even with the params below on the tiny Titanic dataset (hundreds of rows), the optimizer simply never makes progress.

model = tp.TPOTClassifier(generations=1, population_size=1, cv=5, verbosity=2, n_jobs=8, config_dict=config_dict)

Optimization Progress: 0%| | 0/2 [00:00<?, ?pipeline/s]

That said, CPU usage is around 100% and python processes are constantly getting spun up and torn down, but no progress is made. n_jobs=1 works as expected (< 15 sec).

Have any of the devs tried multiprocessing on a Windows machine? I suspect it just doesn't work.

multiprocessing.cpu_count() == 12

            platform : win-64
       conda version : 4.3.25
    conda is private : False
   conda-env version : 4.3.25
 conda-build version : 3.0.14
      python version : 3.5.4.final.0
    requests version : 2.13.0
                TPOT : 0.8.3
               numpy : 1.12.1
               scipy : 0.19.1
        scikit-learn : 0.19.0
                deap : 1.0.2

@rhiever
Contributor

rhiever commented Sep 18, 2017

Multiprocessing simply doesn't work in Windows with Python, so we had to drop support for it.
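To illustrate the constraint (a generic stdlib sketch, not TPOT code): Windows only supports the 'spawn' start method, which re-imports the main module in every worker process, so any multiprocessing code not placed behind an `if __name__ == '__main__':` guard re-executes in each child instead of making progress.

```python
import multiprocessing

def evaluate(n):
    # Stand-in for one pipeline evaluation running in a worker process.
    return n * n

if __name__ == '__main__':
    # Without this guard, each spawned child re-imports the module, tries to
    # create its own pool, and the program never advances past 0%.
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(processes=2) as pool:
        print(pool.map(evaluate, [1, 2, 3, 4]))  # prints [1, 4, 9, 16]
```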

@rhiever
Contributor

rhiever commented Oct 10, 2017

Closing this issue. Please feel free to re-open or file a new issue if you have any further questions or comments.

@rhiever rhiever closed this as completed Oct 10, 2017
@OhMyGodness

> I think this is related to #508. Please try to run TPOT like this demo below: […]

I have just rewritten my code like this demo, but it is still stuck at 0% after two days. My data has 1,000,000 rows and 64 columns; is it too large, and could that be causing the problem? I am running the code on an AWS Linux instance with 16 cores and 120 GB of RAM, with n_jobs=10.


@weixuanfu
Contributor

@OhMyGodness could you please try Parallel Training with Dask for this big dataset?

@OhMyGodness

> @OhMyGodness could you please try Parallel Training with Dask for this big dataset?

Yes, I have tried using Dask just like that demo, but I ran into some other errors when running it as a script on my AWS Linux instance. Could you give me details on using Dask from a script?
