read_stan_csv is mandatory in brm() #1331
Comments
I agree it's not ideal. The problem is that I need to translate whatever output we are getting to a |
Hi @paul-buerkner I've been working with Simon on this, and while we think the best solution is to merge this into Rstan, we aren't sure how Ben feels about a The rewrite changes only the first several lines of
Everything is identical except for some floating point issues on the 16th decimal place that apparently are handled inconsistently by |
It would be amazing to have it in brms! Would you mind preparing a PR? Can we also do selective loading of columns with this? I still need a mechanism in the cmdstanr backend to exclude variables I don't want to save. |
Perfect! I cannot wait for a PR :-) |
We're waiting to hear back from |
For some of my current projects, I would love to have this feature available in brms. I could copy the code you linked above and edit it to put it into brms, but the problem is that this would make brms GPL >= 3, which is something I don't want (brms is GPL2). So basically, do I have @bgoodri's approval (and yours) to put this into brms under GPL2? |
Hi Paul, you are welcome to use any of the code I have written (i.e. the modified version of |
Hi, just to follow up on this: the above approach involves tearing out fairly large quantities of code from Better, I think, is to just use I think the licensing situation is resolved by this, as nothing is taken out of There's a little bit more work involved in finishing off a couple of smaller details, checking speed/memory on some larger files, and writing some code to more comprehensively check for consistency between the |
The approach you suggested makes total sense to me! Thank you for working on this solution! |
just one thought... as a - very dirty - hack... could one possibly write a function which temporarily redefines the Even more brutal would be to take the loaded (this assumes that swapping out |
I just tried this approach and I got amazingly far. The culprit is that it's not just about swapping out
library(rstan)
rf <- as.character(body(read_stan_csv))
read_stan_csv
rf2 <- gsub("readLines", "data.table::fread", rf)
pseudoFun <- function(csvfiles, col_major = TRUE) {}
pseudoFun
rstanNamespace <- asNamespace("rstan")
body(pseudoFun) <- parse(text=c(rf2, "}"))
environment(pseudoFun) <- rstanNamespace
pseudoFun
csvfiles <- dir(system.file('misc', package = 'rstan'),
pattern = 'rstan_doc_ex_[0-9].csv', full.names = TRUE)
## almost works, but it's more complex than that...
pseudoFun(csvfiles)
Maybe this is inspiring to someone. The point is that R is amazing in what it lets you do to the language itself. Essentially, one could write one's own import function and rely on the internals of rstan by assigning the rstan namespace as the environment of that function. This would save having to rewrite all the internals. Not sure how CRAN-friendly this would be... maybe this is more of a fun thing... |
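The mechanism behind this hack can be shown without rstan. The following is a minimal base-R sketch, not the code from the comment: `fake_ns`, `parse_header`, and `my_reader` are hypothetical names, and a plain environment stands in for the package namespace. It shows that a function defined at top level can call a helper it would otherwise not see once its enclosing environment is replaced.

```r
# Stand-in for a package namespace: an environment holding a "private" helper.
# (fake_ns and parse_header are hypothetical, not rstan internals.)
fake_ns <- new.env()
assign("parse_header", function(line) sub("^# *", "", line), envir = fake_ns)

# Defined at top level, this function cannot find parse_header() ...
my_reader <- function(lines) {
  vapply(lines, parse_header, character(1), USE.NAMES = FALSE)
}

# ... until its enclosing environment is switched. This is the same step as
# environment(pseudoFun) <- asNamespace("rstan") in the comment above.
environment(my_reader) <- fake_ns

# Strips the leading "# " from each comment-style header line.
my_reader(c("# model = demo", "# chains = 4"))
```

With `asNamespace("rstan")` in place of `fake_ns`, the function would resolve unexported rstan helpers the same way, which is why the rewritten body can reuse them without copying their code.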
A bit slower than promised, but submitted a PR that follows this approach of reading in via cmdstanr and then repackaging into a stanfit object. PR here. The bit that took a bit longer was to include functionality to allow selecting a subset of variables at the read stage, which it now does. The stanfit print method however does not like it if you select a subset of a variable at the read stage (e.g. |
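Selecting variables at the read stage might look roughly like the sketch below. This is not the code from the PR: `read_selected_draws` is a hypothetical helper, and the example only relies on `cmdstanr::read_cmdstan_csv()` with its `files` and `variables` arguments. Repackaging the result into a stanfit object is left out.

```r
# Hypothetical helper (not part of brms or the PR): read CmdStan CSV output
# with cmdstanr and keep only the requested variables.
read_selected_draws <- function(csv_files, variables = NULL) {
  if (!requireNamespace("cmdstanr", quietly = TRUE)) {
    stop("Package 'cmdstanr' is required for this sketch.")
  }
  # variables = NULL reads every variable; a character vector restricts
  # which variables are read from the CSV files.
  out <- cmdstanr::read_cmdstan_csv(files = csv_files, variables = variables)
  # Post-warmup draws only; warmup draws and metadata are also in `out`.
  out$post_warmup_draws
}

# Usage, assuming csv_files holds paths written by CmdStan and the model has
# a parameter named b_Intercept (both are assumptions):
# draws <- read_selected_draws(csv_files, variables = "b_Intercept")
```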
Implemented via #1400 |
Hello,
By default, the brm() function uses the read_stan_csv() function:
- In the brm.R script, the brm() function calls fit_model() at the end.
- With backend = cmdstanr, fit_model() calls another function, .fit_model_cmdstanr().
- .fit_model_cmdstanr() runs rstan::read_stan_csv at the end.
However, the saved CSV file that brm() tries to read with rstan::read_stan_csv can reach several gigabytes in size for a model with a large number of parameters. In that case, read_stan_csv can take several hours, even days, to load the CSV file, because rstan::read_stan_csv uses slow data-reading packages. Moreover, it seems that the rstan::read_stan_csv step in brm() is mandatory. In my case it took >90% of the running time. For models with a lot of parameters, I have to kill the process while rstan::read_stan_csv is running and read the CSV file with the much faster read_cmdstan_csv() function.
My questions/offers:
- Could the brm() function get an additional parameter that disables the CSV reading step (rstan::read_stan_csv)?
- Could read_cmdstan_csv() be used instead of rstan::read_stan_csv?