
Feature request: Validation data metrics for model selection #19

Open
romanovzky opened this issue Jun 6, 2024 · 1 comment
romanovzky commented Jun 6, 2024
Currently, SymbolicRegressor returns the model that best complies with a certain criterion. This, however, is computed on the training set. Machine-learning best practice dictates that model selection be done on a validation set. At present this can be "hacked" by selecting the best Pareto-front individual against a validation metric after SymbolicRegressor completes its run. With callbacks (see #18), however, this feature could also enable an early-stopping criterion based on the validation set, as is common in machine-learning packages with iterative training (see Keras, Lightning, XGBoost, etc. for examples).
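The post-hoc selection described above can be sketched as follows. This is a minimal sketch, not SymbolicRegressor's actual API: it assumes the Pareto front is available as a list of fitted models that each expose a `predict(X)` method (a hypothetical interface):

```python
import numpy as np

def select_by_validation(pareto_front, X_val, y_val):
    """Return the Pareto-front model with the lowest validation MSE.

    `pareto_front` is assumed to be a list of fitted models that each
    expose a `predict(X)` method -- a hypothetical interface, not
    necessarily the one SymbolicRegressor provides.
    """
    def val_mse(model):
        residuals = y_val - model.predict(X_val)
        return float(np.mean(residuals ** 2))

    # Model selection happens on held-out data, not the training set.
    return min(pareto_front, key=val_mse)
```

With the callback mechanism in place, the same metric could instead be evaluated during the run to trigger early stopping once the validation score stops improving.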

gkronber (Member) commented Jun 6, 2024

I like the idea of using the callback mechanism for this, so that users have different options for model selection. Selecting based on a validation set could be a good default. Other options are selection based on criteria such as Bayesian evidence, AIC, BIC, or description length, but these could easily be added by users once the callback mechanism is in place.
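As one illustration of an alternative criterion, AIC under a Gaussian error model can be computed from a model's residuals and its parameter count; a user-supplied callback could score each generation's best individual this way and keep the lowest-scoring one. This is a generic sketch of the formula only, not part of any existing API:

```python
import numpy as np

def aic(y_true, y_pred, n_params):
    """Akaike Information Criterion under a Gaussian error model:

        AIC = n * ln(SSE / n) + 2 * k

    where n is the number of samples, SSE the sum of squared
    residuals, and k the number of model parameters. Lower is better,
    so AIC trades accuracy off against model complexity.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = len(y_true)
    sse = float(np.sum((y_true - y_pred) ** 2))
    return n * np.log(sse / n) + 2 * n_params
```

BIC is the same idea with the complexity penalty `k * ln(n)` instead of `2 * k`, so it penalizes large models more heavily as the data set grows.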

foolnotion self-assigned this Jun 6, 2024