Pipeline Produced Before Generations Completed #1308
Comments
I have a similar issue: an uncaught exception terminates the training loop. I am not sure why this exception isn't raised with a stack trace by default, but if I extend the "try" block at base.py:813 to also catch other exceptions, I get:
So ind.fitness.values is a larger tuple for the first individual in the population, and a later individual has a smaller tuple, which leads to the IndexError. Indeed, the reason is that some individuals in the population have ind.fitness.valid == False, with an empty tuple for ind.fitness.values. I am not sure why this is.
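To make the failure mode above concrete, here is a minimal sketch in plain Python. It does not use TPOT or DEAP code: the `Fitness` and `Individual` classes, the sample values, and both helper functions are hypothetical stand-ins that only mimic the behaviour described (an unevaluated individual has an empty `values` tuple and `valid == False`). It shows how indexing into every individual's fitness raises an IndexError, and how skipping invalid individuals avoids it.

```python
class Fitness:
    """Hypothetical stand-in: `values` stays empty until the individual is evaluated."""

    def __init__(self, values=()):
        self.values = tuple(values)

    @property
    def valid(self):
        # An empty tuple means the individual was never (successfully) evaluated.
        return len(self.values) != 0


class Individual:
    """Hypothetical individual carrying only a name and a fitness."""

    def __init__(self, name, values=()):
        self.name = name
        self.fitness = Fitness(values)


def second_objectives(population):
    """Naive access: assumes every individual has at least two fitness values."""
    return [ind.fitness.values[1] for ind in population]


def second_objectives_valid_only(population):
    """Same access, but individuals with an invalid fitness are skipped."""
    return [ind.fitness.values[1] for ind in population if ind.fitness.valid]


population = [
    Individual("a", (3.0, -0.12)),
    Individual("b"),                 # never evaluated: empty tuple, valid == False
    Individual("c", (1.0, -0.30)),
]

try:
    second_objectives(population)
    naive_error = None
except IndexError as exc:
    naive_error = exc                # raised when individual "b" is reached

safe = second_objectives_valid_only(population)
print(type(naive_error).__name__, safe)
```

Filtering on `valid` only hides the symptom in this toy example; it does not explain why some individuals end up unevaluated in the first place.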
I believe this may be the same thing happening in #1313
Hello, I've tried a couple of datasets that crashed early under 0.12.0; with version 0.12.1 they have run smoothly so far. Thanks so much for the quick response!
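For anyone wanting to confirm which release they are running before retrying, a small sketch using only the standard library follows. The helper names `parse_release` and `is_at_least` are made up for illustration, the parsing assumes a plain numeric version string such as "0.12.1", and the 0.12.1 threshold is taken from the report above rather than from any official changelog.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn a plain numeric version like '0.12.1' into a tuple of ints (hypothetical helper)."""
    return tuple(int(part) for part in version_string.split(".")[:3])


def is_at_least(version_string, minimum=(0, 12, 1)):
    """Return True if the version is at or above `minimum` (default: 0.12.1)."""
    return parse_release(version_string) >= minimum


try:
    installed = version("TPOT")
    print(f"TPOT {installed}: at least 0.12.1? {is_at_least(installed)}")
except PackageNotFoundError:
    print("TPOT is not installed in this environment")
except ValueError:
    # Non-numeric version components (e.g. pre-release tags) are not handled by this sketch.
    print("Could not parse the installed TPOT version")
```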
Hello,
I am running this code to get a pipeline using TPOT version 0.12.0:
from tpot import TPOTRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import pandas as pd
import numpy as np
df = pd.read_excel('C:/Users/OneDrive/Desktop/KodSystems/TPOT/abc.xlsx')
X = df.iloc[:, :-1]
y = df.iloc[:, -1]
print(X.shape, y.shape)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.80, test_size=0.20, random_state=42)
tpot = TPOTRegressor(generations=10, population_size=50, verbosity=2, random_state=42, n_jobs=-2, cv=10)
...
# perform the search
tpot.fit(X_train, y_train)
# export the best model
tpot.export('abc.py')
extracted_best_model = tpot.fitted_pipeline_.steps[-1][1]
extracted_best_model.fit(X_train, y_train)
print(extracted_best_model.feature_importances_)
However, it gives me a pipeline before the 10 generations are completed, as follows:
(7478, 5) (7478,)
Best pipeline: RandomForestRegressor(input_matrix, bootstrap=True, max_features=0.7500000000000001, min_samples_leaf=11, min_samples_split=9, n_estimators=100)
[0.06836239 0.08344129 0.18414733 0.25859585 0.40545313]
When I change random_state from 42 to 1, it does give me a pipeline after running all 10 generations, but the same thing happens with another dataset of shape (7478, 2061). I ran the same datasets with version 0.11.7 and had no problems. What could be the reason, and what is the solution?
Thanks in advance!