
Run Arbitrary Code Every Generation #667

Closed
phillip-martin opened this issue Feb 13, 2018 · 1 comment
phillip-martin commented Feb 13, 2018

I couldn't find this in the API or documentation, so please excuse me if this is trivial. At the end of training a model using TPOT, I score the generated model on mean absolute error and mean squared error. The following demonstrates an example.

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
from tpot import TPOTRegressor

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=test_size, random_state=seed
)

tpot = TPOTRegressor(generations=50, population_size=20, verbosity=2)
tpot.fit(x_train, y_train)

# score at the end of training (scaler_x is a scaler fitted earlier on x_train)
y_predicted = tpot.predict(scaler_x.transform(x_test))
print('mae: ', mean_absolute_error(y_test, y_predicted))
print('mse: ', mean_squared_error(y_test, y_predicted))

I would like to be able to run these scores every generation, ideally through a function passed to either TPOTRegressor or tpot.fit. This might look like the following.

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=test_size, random_state=seed
)

def score_tpot(_tpot):
    y_predicted = _tpot.predict(scaler_x.transform(x_test))
    print('mae: ', mean_absolute_error(y_test, y_predicted))
    print('mse: ', mean_squared_error(y_test, y_predicted))

tpot = TPOTRegressor(
    generations=50, population_size=20, verbosity=2, each_generation=score_tpot
)
tpot.fit(x_train, y_train)

Is there currently a way to do something like this that I could not find in the documentation?

Thank you for any time you put into reviewing this question.

rhiever (Contributor) commented Feb 13, 2018

We don't have direct support for this functionality, but it is technically feasible through the warm_start parameter. Some quick code:

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error
from tpot import TPOTRegressor

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=test_size, random_state=seed
)

# with warm_start=True, each call to fit() resumes evolution from the
# previous population, so this loop runs one generation per iteration
tpot = TPOTRegressor(generations=1, population_size=20, verbosity=2, warm_start=True)
for _ in range(50):
    tpot.fit(x_train, y_train)
    y_predicted = tpot.predict(scaler_x.transform(x_test))
    print('mae: ', mean_absolute_error(y_test, y_predicted))
    print('mse: ', mean_squared_error(y_test, y_predicted))

# score at the end of training
y_predicted = tpot.predict(scaler_x.transform(x_test))
print('mae: ', mean_absolute_error(y_test, y_predicted))
print('mse: ', mean_squared_error(y_test, y_predicted))
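The warm-start loop can also be wrapped into a small helper, which amounts to the each_generation callback the issue asks for. A minimal sketch of the pattern, assuming only that the estimator's fit() resumes where the previous call left off (TinyEstimator below is a hypothetical stand-in for TPOTRegressor, so the example runs without TPOT installed):

```python
def fit_with_callback(estimator, x_train, y_train, n_rounds, callback):
    """Fit a warm-startable estimator one round at a time,
    calling callback(estimator, round_index) after each round."""
    for i in range(n_rounds):
        estimator.fit(x_train, y_train)  # resumes from the previous round
        callback(estimator, i)
    return estimator

# Hypothetical stand-in: each fit() round moves a constant prediction
# halfway toward the mean of y, imitating warm-start behaviour.
class TinyEstimator:
    def __init__(self):
        self.value = 0.0

    def fit(self, x, y):
        target = sum(y) / len(y)
        self.value += 0.5 * (target - self.value)

    def predict(self, x):
        return [self.value] * len(x)

x_train, y_train = [0, 0, 0], [2.0, 2.0, 2.0]
scores = []

def score_round(est, i):
    # mean absolute error against y_train, recorded per round
    preds = est.predict(x_train)
    mae = sum(abs(p - t) for p, t in zip(preds, y_train)) / len(y_train)
    scores.append(mae)

fit_with_callback(TinyEstimator(), x_train, y_train, 5, score_round)
print(len(scores))             # → 5, one score per round
print(scores[0] > scores[-1])  # → True, error shrinks across rounds
```

With TPOT, the same helper would take a TPOTRegressor built with generations=1 and warm_start=True in place of TinyEstimator, and score_tpot from the original question in place of score_round.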
