-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grf qr engine #360
Grf qr engine #360
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing make_quantile_reg
and make_grf_quantiles
, it looks like they're both using the same model/engine/mode settings. I remember some discussion about maybe switching so that quantile_regression
was a separate mode? I guess if that's happening it's happening some later version?
Also, I'm assuming all parameters not here are just set to their defaults, so sample.fraction=0.5
; is there some reason for this particular trio of parameters, and not including say sample.fraction
, honesty
, honesty.fraction
, honesty.prune.leaves
, etc?
Overall, mostly just some clarification stuff and some bike-shedding w.r.t which parameters we include. I think it's good to go.
set.seed(12345) | ||
library(grf) | ||
tib <- tibble( | ||
y = rnorm(100), x = rnorm(100), z = rnorm(100), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
y = rnorm(100), x = rnorm(100), z = rnorm(100), | |
+ y = rnorm(100), x = rnorm(100), z = y-x+ .01 * rnorm(100)+1, |
if we make it so that there's something other than noise, the fit models are the same regardless of the state of the seed.
So I tried this, and got different results still, unfortunately.
colnames(p_manual) <- c("0.1", "0.5", "0.9") | ||
p_manual <- tibble::as_tibble(p_manual) | ||
# not equal despite the seed, etc | ||
# expect_equal(p_pars, p_manual) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My guess is that this is because fseed
specifies the c++ seed, whereas set_engine(...seed=fseed)
is specifying an R
seed, which isn't getting passed down. We may just want to set the tolerances large on expect_equal
or something along those lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran the manual
twice and got something roughly at the same size. Given that this is just a wrapper I'm not sure we need to do too extensive of testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that these two should result in the same thing. So, I'm a bit miffed it doesn't.
library(epipredict)
library(grf)
tib <- tibble::tibble(
y = rnorm(100), x = rnorm(100), z = rnorm(100),
f = factor(sample(letters[1:3], 100, replace = TRUE))
)
fseed <- 12345
spec_seed <- rand_forest(mode = "regression", mtry = 2L, min_n = 10) %>%
set_engine("grf_quantiles", seed = fseed)
manual <- quantile_forest(
as.matrix(tib[, 2:3]), tib$y,
quantiles = c(0.1, 0.5, 0.9), seed = fseed,
mtry = 2L, min.node.size = 10
)
translate(spec_seed)
#> Random Forest Model Specification (regression)
#>
#> Main Arguments:
#> mtry = 2
#> min_n = 10
#>
#> Engine-Specific Arguments:
#> seed = fseed
#>
#> Computational engine: grf_quantiles
#>
#> Model fit template:
#> grf::quantile_forest(X = missing_arg(), Y = missing_arg(), mtry = min_cols(~2L,
#> x), min.node.size = min_rows(~10, x), seed = fseed, quantiles = c(0.1,
#> 0.5, 0.9), num.threads = 1L)
Created on 2024-08-13 with reprex v2.1.1
The function is being called with the same args, and passing the seed. I'm not concerned, but I wish the were the same.
As for the overall comment, yes, the plan is for |
Checklist
Please:
dajmcdon.
DESCRIPTION
andNEWS.md
.Always increment the patch version number (the third number), unless you are
making a release PR from dev to main, in which case increment the minor
version number (the second number).
(backwards-incompatible changes to the documented interface) are noted.
Collect the changes under the next release number (e.g. if you are on
0.7.2, then write your changes under the 0.8 heading).