Feature request: put the fit time in evaluated_individuals_ #780

louisabraham · 2018-10-07T21:16:31Z

It would be handy. GridSearchCV does it for example.
I also think I encountered some strange pipelines that did not stop after max_eval_time_mins, and this would help me to reproduce the issue.
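For illustration, here is a minimal sketch of recording a fit time next to each evaluated pipeline, similar in spirit to the `mean_fit_time` entry GridSearchCV exposes in `cv_results_`. This is not TPOT's code: `timed_evaluation`, `fake_cross_val_score`, the plain dict standing in for `evaluated_individuals_`, and the `score` and `fit_time` keys are all hypothetical names chosen for the example.

```python
import time


def timed_evaluation(evaluate, *args, **kwargs):
    """Call evaluate(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = evaluate(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_cross_val_score(delay):
    """Hypothetical stand-in for a pipeline evaluation that takes `delay` seconds."""
    time.sleep(delay)
    return 0.9


# Hypothetical stand-in for TPOT's evaluated_individuals_ dictionary.
evaluated_individuals = {}

score, fit_time = timed_evaluation(fake_cross_val_score, 0.05)
evaluated_individuals["SomePipeline(...)"] = {
    "score": score,        # hypothetical key
    "fit_time": fit_time,  # hypothetical key: wall-clock seconds
}
```

With the wall-clock time stored per pipeline, any evaluation that ran past max_eval_time_mins would show up directly in the statistics.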

weixuanfu · 2018-10-08T13:47:43Z

This is an old, known issue with Python.

The way CPython supports threading and asynchronous features affects the accuracy of the timeout. For more background on this issue, which cannot be fixed, please read what Python experts have written about Python threading, the GIL, and context switching, for example:

http://pymotw.com/2/threading/
https://wiki.python.org/moin/GlobalInterpreterLock

But I think it is a good idea to add this fit time into pipeline statistics.
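On the timeout accuracy point, the following sketch shows the general limitation with standard-library tools only. It is not stopit's mechanism and not TPOT's code: it simply shows that when a caller stops waiting on a thread after a timeout, the thread itself keeps running until its work finishes. `slow_fit` is a hypothetical stand-in for a long pipeline fit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

finished = threading.Event()


def slow_fit():
    """Hypothetical long-running fit; sleeps instead of training a model."""
    time.sleep(0.3)
    finished.set()
    return "done"


executor = ThreadPoolExecutor(max_workers=1)
start = time.perf_counter()
future = executor.submit(slow_fit)

try:
    # Wait at most 0.05 s for the result.
    future.result(timeout=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True

# The caller has given up, but the worker thread has not been stopped.
still_running = not finished.is_set()

# Shutting down waits for the worker, so the full 0.3 s is spent anyway.
executor.shutdown(wait=True)
total = time.perf_counter() - start
```

Here the timeout only bounds how long the caller waits, not how long the work runs, which is one reason a recorded fit time can exceed the configured limit.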

louisabraham · 2018-10-08T15:01:02Z

I think the relevant code is here:

tpot/tpot/base.py

Lines 1236 to 1239 in 507b45d

    
    parallel = Parallel(n_jobs=self._n_jobs, verbose=0, pre_dispatch='2*n_jobs')
    tmp_result_scores = parallel(
        delayed(partial_wrapped_cross_val_score)(sklearn_pipeline=sklearn_pipeline)
        for sklearn_pipeline in sklearn_pipeline_list[chunk_idx:chunk_idx + chunk_size])

It seems you used threading_timeoutable from stopit to handle the timeout. Why didn't you use the timeout parameter of joblib.Parallel instead?

louisabraham · 2018-10-08T15:05:31Z

Oh, the timeout parameter of joblib.Parallel raises a timeout if any task lasts too long.

Would joblib/joblib#366 allow for a more precise time control?

weixuanfu · 2018-10-08T15:15:09Z

Maybe, I will look into it. But two issues need attention when using timeout in joblib:

- TPOT uses the joblib bundled with scikit-learn to avoid adding one more dependency, so we need to watch whether scikit-learn updates its built-in joblib.
- This timeout in joblib only works when n_jobs != 1. We need a workaround for this.

louisabraham · 2018-10-08T15:24:20Z

Maybe we should just integrate some custom joblib code?
I'm not sure about what causes the issues with threads in the first place, but wouldn't multiprocessing.Process provide a timeout ability as well?
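As a rough sketch of the multiprocessing.Process idea: a child process, unlike a thread, can be terminated from the parent once a deadline passes. The helper name `run_with_timeout` and the `slow_fit` stand-in are hypothetical, and the example assumes a POSIX system where the "fork" start method is available. It does not return the child's result; passing a score back would need something like a Queue or Pipe.

```python
import multiprocessing as mp
import time


def slow_fit(seconds):
    """Hypothetical long-running fit; sleeps instead of training a model."""
    time.sleep(seconds)


def run_with_timeout(target, args, timeout):
    """Run target(*args) in a child process, killing it after `timeout` seconds.

    Returns (completed, elapsed_seconds).
    """
    # Assumption: POSIX platform with the "fork" start method.
    ctx = mp.get_context("fork")
    proc = ctx.Process(target=target, args=args)
    start = time.perf_counter()
    proc.start()
    proc.join(timeout)           # wait at most `timeout` seconds
    completed = not proc.is_alive()
    if not completed:
        proc.terminate()         # forcibly stop the overrunning child
        proc.join()              # reap the terminated process
    return completed, time.perf_counter() - start
```

Calling `run_with_timeout(slow_fit, (5,), 0.2)` should come back after roughly the timeout rather than after five seconds. The trade-off is process start-up cost and the need to serialize data between parent and child.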

weixuanfu added enhancement need contributor labels Oct 8, 2018

louisabraham mentioned this issue Oct 8, 2018

Random state is reset even when doing a warm_start #782

Open

perib mentioned this issue Sep 21, 2023

TPOT2 and the future of TPOT development -- From the Devs #1322

Open
