
Feature request: Validation data metrics for model selection #19

Open
romanovzky opened this issue Jun 6, 2024 · 1 comment
romanovzky commented Jun 6, 2024
Currently, SymbolicRegressor returns the model that best complies with a certain criterion. This, however, is computed on the training set. Machine-learning best practice dictates that model selection be done on a validation set. At present this can be "hacked" by selecting the best Pareto-front individual against a validation metric after SymbolicRegressor completes its run. With callbacks (see #18), however, this feature could also enable an early-stopping criterion based on the validation set, as is common in machine-learning packages with iterative training (see Keras, Lightning, XGBoost, etc. for examples).
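The post-hoc selection described above can be sketched as follows. This is a minimal sketch, not SymbolicRegressor's actual API: it assumes the Pareto front is available as a list of fitted models that each expose a `predict(X)` method (a hypothetical interface):

```python
import numpy as np

def select_by_validation(pareto_front, X_val, y_val):
    """Return the Pareto-front model with the lowest validation MSE.

    `pareto_front` is assumed to be a list of fitted models that each
    expose a `predict(X)` method -- a hypothetical interface, not
    necessarily the one SymbolicRegressor provides.
    """
    def val_mse(model):
        residuals = y_val - model.predict(X_val)
        return float(np.mean(residuals ** 2))

    # Model selection happens on held-out data, not the training set.
    return min(pareto_front, key=val_mse)
```

With the callback mechanism in place, the same metric could instead be evaluated during the run to trigger early stopping once the validation score stops improving.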

gkronber (Member) commented Jun 6, 2024

I like the idea of using the callback mechanism for this, so that users have different options for model selection. Selecting based on a validation set could be a good default. Other options are selection based on criteria such as Bayesian evidence, AIC, BIC, or description length, but these could easily be added by users once the callback mechanism is in place.
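As one illustration of an alternative criterion, AIC under a Gaussian error model can be computed from a model's residuals and its parameter count; a user-supplied callback could score each generation's best individual this way and keep the lowest-scoring one. This is a generic sketch of the formula only, not part of any existing API:

```python
import numpy as np

def aic(y_true, y_pred, n_params):
    """Akaike Information Criterion under a Gaussian error model:

        AIC = n * ln(SSE / n) + 2 * k

    where n is the number of samples, SSE the sum of squared
    residuals, and k the number of model parameters. Lower is better,
    so AIC trades accuracy off against model complexity.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = len(y_true)
    sse = float(np.sum((y_true - y_pred) ** 2))
    return n * np.log(sse / n) + 2 * n_params
```

BIC is the same idea with the complexity penalty `k * ln(n)` instead of `2 * k`, so it penalizes large models more heavily as the data set grows.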

foolnotion self-assigned this Jun 6, 2024