Description
Related to #760
The current implementation of the validation parameter in boost_tree only sets the proportion of the training data to use as the validation set. It would be great to be able to pass a data frame as the validation argument as well. This would be really helpful when the data has a grouping structure and you want to test whether the model generalizes to different groups, and it would bring parsnip's capabilities in line with xgb.train().
Not a great example, but just for demonstration:
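library(dplyr)
library(parsnip)
library(modeldata)
data("penguins")

# Train on two species and hold out the remaining one for validation
train <-
  penguins |>
  filter(species %in% c("Gentoo", "Adelie"))
valid <-
  penguins |>
  filter(!(species %in% c("Gentoo", "Adelie")))

boost_tree(mode = "regression",
           mtry = 3,
           tree_depth = 2,
           stop_iter = 5) |>
  # Proposed usage (not currently supported): a data frame as the validation set
  set_engine("xgboost", validation = valid)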
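For comparison, here is a minimal sketch of what is supported today, assuming the xgboost engine: validation takes a proportion, and that share of the training data is held out to monitor early stopping. The 0.2 value and the name spec_current are only illustrative.

```r
library(parsnip)

# Currently supported: validation is a proportion in [0, 1).
# Here 20% of the training rows would be held out at fit time
# and used to decide when to stop (stop_iter = 5).
spec_current <-
  boost_tree(mode = "regression",
             mtry = 3,
             tree_depth = 2,
             stop_iter = 5) |>
  set_engine("xgboost", validation = 0.2)

spec_current
```

With this form there is no control over which rows end up in the validation set, which is the limitation this request is about.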