file_refit = "on_change" will work weirdly when renaming factor levels #1128

martinmodrak · 2021-03-22T09:13:10Z

This is a minor issue I noticed with the current implementation of file_refit = "on_change". I'm not sure what to do about it, but I'm posting it here for discussion and future reference. If the only change I make to my data is to rename factor levels (while keeping their order the same), the data as passed to Stan will not change, and neither will the model code. This means that an old fit will be loaded. However, this old fit could break subsequent code, because the parameter names it exposes are based on the old level names rather than the new ones.
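To make the failure mode concrete, here is a minimal base-R sketch. It does not call brms; the toy data frame and the level names are made up for illustration, and model.matrix() only stands in for whatever design matrix ends up in the Stan data. Renaming the levels leaves the numbers untouched, while the column names (from which coefficient names are typically derived) change.

```r
# Toy data with a two-level factor (names are illustrative only)
d_old <- data.frame(g = factor(c("a", "b", "a", "b")))
d_new <- d_old
levels(d_new$g) <- c("ctrl", "trt")  # rename levels, keep their order

X_old <- model.matrix(~ g, data = d_old)
X_new <- model.matrix(~ g, data = d_new)

# The numeric content a sampler would see is identical ...
same_values <- isTRUE(all.equal(X_old, X_new, check.attributes = FALSE))

# ... but the column names differ
names_old <- colnames(X_old)  # "(Intercept)" "gb"
names_new <- colnames(X_new)  # "(Intercept)" "gtrt"
```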

paul-buerkner · 2021-03-24T09:01:12Z

I see. Could it make sense to also compare the processed data with each other, that is, fit$data with the new output of validate_data?

paul-buerkner · 2021-06-05T12:55:12Z

Currently, we check the Stan data via all.equal(sdata, cached_sdata, check.attributes = FALSE). I think setting check.attributes = TRUE should fix your problem. Is there, in your opinion, a downside of also checking attributes?
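The effect of the two settings can be tried on a small stand-in for the Stan data. In the sketch below, make_sdata() is a hypothetical helper and the two lists are hand-built, not produced by brms; only all.equal() and its check.attributes argument come from the comment above.

```r
# Hypothetical helper: build a tiny Stan-like data list from a factor
make_sdata <- function(g) {
  list(N = length(g), X = model.matrix(~ g))
}

g_old <- factor(c("a", "b", "a"))
g_new <- factor(c("ctrl", "trt", "ctrl"))  # same pattern, renamed levels

cached_sdata <- make_sdata(g_old)
sdata <- make_sdata(g_new)

# Ignoring attributes: only the values are compared, so no change is seen
ignore_attrs <- all.equal(sdata, cached_sdata, check.attributes = FALSE)

# Checking attributes: the differing column names of X are reported
with_attrs <- all.equal(sdata, cached_sdata, check.attributes = TRUE)

isTRUE(ignore_attrs)  # TRUE
isTRUE(with_attrs)    # FALSE
```

Note that all.equal() returns a character vector describing the differences rather than FALSE, which is why the result is wrapped in isTRUE().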

hsbadr · 2021-06-13T21:38:33Z

> Is there, in your opinion, a downside of also checking attributes?

Depending on the data type and how it's created (and even the environment), checking the attributes can unnecessarily trigger a refit. For example, reading data from a CSV file as a data frame, data table, or tibble could change the attributes; the same goes for how the factors are created (if using a different function that stores extra attributes). It's probably safer to add checks for the factor variables (levels and reference level).
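Both points can be sketched in base R. In the example below, the "source" attribute is made up to mimic metadata that an import function might attach, and factor_levels() is a hypothetical helper, not a brms function. A full attribute comparison raises a false alarm on the extra attribute, whereas comparing only the factor levels flags just the rename.

```r
# Hypothetical helper: collect the levels of every factor column
factor_levels <- function(data) {
  lapply(Filter(is.factor, data), levels)
}

d_a <- data.frame(y = c(1.5, 2.5, 3.5), g = factor(c("a", "b", "a")))

# Same data carrying an extra, irrelevant attribute
d_b <- d_a
attr(d_b, "source") <- "import.csv"

# Same data with renamed factor levels
d_c <- d_a
levels(d_c$g) <- c("ctrl", "trt")

# Full attribute check: reports a difference although nothing relevant changed
attr_false_alarm <- !isTRUE(all.equal(d_a, d_b))

# Targeted check: only the renamed levels count as a change
levels_b_changed <- !identical(factor_levels(d_a), factor_levels(d_b))
levels_c_changed <- !identical(factor_levels(d_a), factor_levels(d_c))
```

Because levels() returns the levels in order, this comparison also catches a changed reference level, which is the first element.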

paul-buerkner · 2021-06-14T10:30:26Z

Good points, thank you!

martinmodrak · 2021-06-15T15:00:30Z

@hsbadr is right. Actually, I added the check.attributes = FALSE right after I saw some unnecessary refits; I don't remember exactly what the cause was. I currently think the problem is that maybe we are caching at the wrong level of abstraction (the underlying Stan fit is the same, but we store all sorts of stuff that could change). But we probably don't want to change the format in which we store the fits... Maybe the correct solution is to back up the original parameter names when doing rename_pars, so that one could load the fit, map the names back using the stored version and then map them again....
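The backup idea could look roughly like the sketch below. It is purely illustrative: rename_with_backup() and restore_names() are hypothetical helpers, a named numeric vector stands in for the fit, and none of this reflects how brms actually stores or renames parameters.

```r
# Hypothetical helper: rename parameters but remember the original names
rename_with_backup <- function(draws, new_names) {
  stopifnot(length(new_names) == length(draws))
  attr(draws, "original_names") <- names(draws)
  names(draws) <- new_names
  draws
}

# Hypothetical helper: restore the names stored at rename time
restore_names <- function(draws) {
  names(draws) <- attr(draws, "original_names")
  attr(draws, "original_names") <- NULL
  draws
}

# Stand-in for raw sampler output with generic parameter names
raw <- c("b[1]" = 0.2, "b[2]" = -1.1)

# Fit cached under the old level names
cached <- rename_with_backup(raw, c("b_Intercept", "b_gb"))

# After the levels were renamed: map back, then apply the new names
refreshed <- rename_with_backup(restore_names(cached), c("b_Intercept", "b_gtrt"))
```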

hsbadr · 2021-06-18T21:30:21Z

Take a look at waldo::compare(), which is inspired by all.equal() and generates actionable insights by:

Ordering the differences from most important to least important.
Displaying the values of atomic vectors that are actually different.
Carefully using colour to emphasise changes (while still being readable when colour isn’t available).
Using R code (not a text description) to show where differences arise.
Where possible, comparing elements by name, rather than by position.
Erring on the side of producing too much output, rather than too little.

paul-buerkner · 2021-08-11T12:08:04Z

Factor levels should now be checked as well.

paul-buerkner added the bug label Mar 24, 2021

paul-buerkner added this to the brms 2.15.0++ milestone Mar 24, 2021

paul-buerkner added a commit that referenced this issue Aug 11, 2021

brmsfit_needs_refit: check factor level names #1128

d868c55

paul-buerkner closed this as completed Aug 11, 2021
