TPOT stuck at 75th generation with no errors #1214
Comments
You should use It might be valuable to test your system and environment with this example gist or confirm your configuration is similar.
I just got caught by this; there needs to be a better error message when using cuML and leaving n_jobs set to -1.
If the maintainers are open to it, perhaps we could open a PR that validates the
Yes, probably just >0, as I think you can use multiple GPUs.
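To make the validation idea above concrete, here is a minimal sketch of such a check. The helper name `check_n_jobs_for_cuml` and the error text are hypothetical and not part of TPOT's API. It assumes, as suggested in the comments above, that any positive `n_jobs` is acceptable with the 'TPOT cuML' configuration.

```python
def check_n_jobs_for_cuml(config_dict, n_jobs):
    """Hypothetical pre-flight check, not part of TPOT itself.

    Raises a clear error when the 'TPOT cuML' configuration is combined
    with a non-positive n_jobs (for example the -1 used in this issue),
    instead of letting the run stall silently.
    """
    if config_dict == "TPOT cuML" and n_jobs <= 0:
        raise ValueError(
            "config_dict='TPOT cuML' expects n_jobs > 0, got n_jobs={}. "
            "Set n_jobs to a positive integer.".format(n_jobs)
        )
    # Other configurations are passed through unchanged.
    return n_jobs


# Example: the settings reported in this issue would be rejected up front.
try:
    check_n_jobs_for_cuml("TPOT cuML", -1)
except ValueError as err:
    print(err)
```

Whether a check like this belongs in the constructor or in fit() would be up to the maintainers.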
I am running the GPU-accelerated (using dask) configuration of TPOT (TPOT version 0.11.7) on several different datasets, with the TPOT cuML configuration. I am using Python 3 with Anaconda.
For every dataset, TPOT gets stuck at generation 74 or 75, regardless of size (the datasets range from 480 rows by 10 columns up to 9000 rows by 83 columns). No error is output; the periodic checkpoint folder just stops updating and no new messages appear. I have left it running, but after 8 hours nothing new had come up.
I have changed the random seed of the TPOT regressor to see if it would be an issue with a specific model architecture, but changing the seed still results in it getting stuck at generation 75.
My tpot regressor looks as follows:
tpot = TPOTRegressor(
    verbosity=2,
    use_dask=True,
    n_jobs=-1,
    cv=5,
    random_state=42,  # this was changed, as mentioned above
    template='Regressor',
    config_dict='TPOT cuML',
    periodic_checkpoint_folder='../checkpoints/{}/'.format(target),
    max_time_mins=None,
)
Any idea how to solve this issue, or why it would happen every time across different datasets?
Thank you!
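For anyone landing here with the same symptom, the comments in this thread point at n_jobs=-1 combined with cuML. Below is a sketch of the same configuration with an explicit positive n_jobs. The value 1, the placeholder `target`, and the helper `build_regressor` are assumptions for illustration, and this has not been confirmed to resolve the hang on the reporter's setup.

```python
# Placeholder for the reporter's target name (hypothetical value).
target = "example_target"

# Same keyword arguments as in the report, except n_jobs is a positive
# integer instead of -1 (assumption based on the comments in this thread).
tpot_kwargs = dict(
    verbosity=2,
    use_dask=True,
    n_jobs=1,
    cv=5,
    random_state=42,
    template='Regressor',
    config_dict='TPOT cuML',
    periodic_checkpoint_folder='../checkpoints/{}/'.format(target),
    max_time_mins=None,
)


def build_regressor(kwargs):
    """Hypothetical helper: construct the regressor from the kwargs above.

    The import is deferred so that the kwargs can be inspected on a
    machine without TPOT and cuML installed.
    """
    from tpot import TPOTRegressor
    return TPOTRegressor(**kwargs)
```

Calling build_regressor(tpot_kwargs) and then fit() as before would exercise this configuration.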