Structured covariates for non-linear functions #1488

wds15 · 2023-04-27T08:49:36Z

I am working with Pharmacometrics models which require non-linear functions. So the dosing history of each patient is needed to calculate with a non-linear function the drug concentration in blood at any time-point whenever I have an observation of a patient. A solution is a wide data format, but this is inconvenient as the number of dosing time-points which I need to take into account can get large. Below is an example of what I have in mind. So I would like to use the covariates dose_time and dose_amt as argument to a non-linear function.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)

set.seed(346356)

Information on drug dosing is most naturally specified in a long
format. Here is an example for two regimens where we administer 8
units of drug to a patient at two occasions. The time-points are 24
hours apart, but we use a different amount at each time-point.

dosing_long <- tribble(~regimen, ~dose_id, ~dose_time, ~dose_amt,
                       1L, 1L, 0,  4,
                       1L, 2L, 24, 4,
                       2L, 1L, 0,  2,
                       2L, 2L, 24, 6)

To use such information in brms the dosing history must be cast
into a wide format.

dosing_wide <- pivot_wider(dosing_long,
                           id_cols="regimen",
                           names_from="dose_id",
                           values_from=c("dose_time", "dose_amt"))

dosing_wide
#> # A tibble: 2 × 5
#>   regimen dose_time_1 dose_time_2 dose_amt_1 dose_amt_2
#>     <int>       <dbl>       <dbl>      <dbl>      <dbl>
#> 1       1           0          24          4          4
#> 2       2           0          24          2          6

Only in wide format one can merge the dosing history with
observations from a patient. Here is a pseudo data set as example:

fake_wide <- expand_grid(subject=1:4, time=24*c(2, 3, 4)) %>%
    left_join(data.frame(subject=1:4, regimen=sample(1:2, 4, TRUE))) %>%
    left_join(dosing_wide) %>%
    mutate(y=rnorm(n())) %>%
    relocate(subject, y)
#> Joining, by = "subject"
#> Joining, by = "regimen"

modeling this data-set works in brms as we have all dosing
information available in a given row, which is needed to model the
measured data y…

knitr::kable(fake_wide)

subject	y	time	regimen	dose_time_2	dose_amt_1	dose_amt_2
1	-1.4335139	48	2	24	2	6
1	-0.2447368	72	2	24	2	6
1	0.5337662	96	2	24	2	6
2	0.8153970	48	1	24	4	4
2	-0.3382850	72	1	24	4	4
2	0.0922915	96	1	24	4	4
3	0.8734750	48	1	24	4	4
3	-2.4113110	72	1	24	4	4
3	2.4293361	96	1	24	4	4
4	0.6915627	48	1	24	4	4
4	-1.6689784	72	1	24	4	4
4	1.1158514	96	1	24	4	4

… but it is super in-convenient, since we need to deal with an
arbitrary wide format here! More doses mean more columns, etc.
Thus a nested format would be much easier to deal with.

So let’s cast data into a nested format such that the dosing
columns themselve contain data vetors in each entry:

dosing_nested <- dosing_long %>%
    group_by(regimen) %>%
    nest(data=c("dose_id", "dose_time", "dose_amt")) %>%
    unnest_wider("data")

tibble

dosing_nested
#> # A tibble: 2 × 4
#> # Groups:   regimen [2]
#>   regimen     dose_id   dose_time    dose_amt
#>     <int> <list<int>> <list<dbl>> <list<dbl>>
#> 1       1         [2]         [2]         [2]
#> 2       2         [2]         [2]         [2]

having nested data

knitr::kable(dosing_nested)

regimen	dose_id	dose_time	dose_amt
1	1, 2	0, 24	4, 4
2	1, 2	0, 24	2, 6

now the model data set becomes a lot easier

fake_nested <- select(fake_wide, 1:4) %>%
    left_join(dosing_nested)
#> Joining, by = "regimen"

tibble

fake_nested
#> # A tibble: 12 × 7
#>    subject       y  time regimen     dose_id   dose_time    dose_amt
#>      <int>   <dbl> <dbl>   <int> <list<int>> <list<dbl>> <list<dbl>>
#>  1       1 -1.43      48       2         [2]         [2]         [2]
#>  2       1 -0.245     72       2         [2]         [2]         [2]
#>  3       1  0.534     96       2         [2]         [2]         [2]
#>  4       2  0.815     48       1         [2]         [2]         [2]
#>  5       2 -0.338     72       1         [2]         [2]         [2]
#>  6       2  0.0923    96       1         [2]         [2]         [2]
#>  7       3  0.873     48       1         [2]         [2]         [2]
#>  8       3 -2.41      72       1         [2]         [2]         [2]
#>  9       3  2.43      96       1         [2]         [2]         [2]
#> 10       4  0.692     48       1         [2]         [2]         [2]
#> 11       4 -1.67      72       1         [2]         [2]         [2]
#> 12       4  1.12      96       1         [2]         [2]         [2]

having nested data

knitr::kable(fake_nested)

subject	y	time	regimen	dose_id	dose_time	dose_amt
1	-1.4335139	48	2	1, 2	0, 24	2, 6
1	-0.2447368	72	2	1, 2	0, 24	2, 6
1	0.5337662	96	2	1, 2	0, 24	2, 6
2	0.8153970	48	1	1, 2	0, 24	4, 4
2	-0.3382850	72	1	1, 2	0, 24	4, 4
2	0.0922915	96	1	1, 2	0, 24	4, 4
3	0.8734750	48	1	1, 2	0, 24	4, 4
3	-2.4113110	72	1	1, 2	0, 24	4, 4
3	2.4293361	96	1	1, 2	0, 24	4, 4
4	0.6915627	48	1	1, 2	0, 24	4, 4
4	-1.6689784	72	1	1, 2	0, 24	4, 4
4	1.1158514	96	1	1, 2	0, 24	4, 4

What would now be needed in brms is to allow for strucutred columns to be passed as covariates to
non-linear functions in brms.

^{Created on 2023-04-27 with reprex v2.0.2}

... and as a fancy add-on... I think it would actually be nice to also allow the response y in a brms model to come from such a nested thing.... but that's not directly related...

The text was updated successfully, but these errors were encountered:

paul-buerkner · 2023-05-17T12:16:47Z

This feature is now implemented. Below you can find an example:

set.seed(2134)
N <- 100
dat <- data.frame(y=rnorm(N))
dat$X <- matrix(rnorm(N*2), N, 2)

nlfun_stan <- "
  real nlfun(real a, real b, real c, row_vector X) {
     return a + b * X[1] + c * X[2];
  }
"
nlstanvar <- stanvar(scode = nlfun_stan, block = "functions")

# version for R post processing
nlfun <- function(a, b, c, X) {
  a + b * X[, , 1] + c * X[, , 2]
}

# fit the model
bform <- bf(y~nlfun(a, b, c, X), a~1, b~1, c~1, nl = TRUE)
fit <- brm(bform, dat, stanvars = nlstanvar)
summary(fit)

# fit benchmark model that should yield the same results up to MCMC error
fit2 <- brm(y~X, dat)
summary(fit2)

# post processing works too
str(posterior_epred(fit))
loo(fit, fit2)

wds15 · 2023-05-17T12:59:21Z

WOW... it works with integers too:

set.seed(2134)
N <- 100
dat <- data.frame(y=rnorm(N))
dat$X <- matrix(as.integer(rnorm(N*2, 0, 10)), N, 2)

nlfun_stan <- "
  real nlfun(real a, real b, real c, int[] X) {
     return a + b * X[1] + c * X[2];
  }
"
nlstanvar <- stanvar(scode = nlfun_stan, block = "functions")

# version for R post processing
nlfun <- function(a, b, c, X) {
  a + b * X[, , 1] + c * X[, , 2]
}

# fit the model
bform <- bf(y~nlfun(a, b, c, X), a~1, b~1, c~1, nl = TRUE)
fit <- brm(bform, dat, stanvars = nlstanvar)
summary(fit)


stancode(fit)

Awesome!

wds15 · 2023-05-26T12:01:12Z

FYI... the expose_functions(model, vectorize=TRUE) does not work any longer with these structured covariates...new issue?

paul-buerkner · 2023-05-26T12:55:03Z

yes, new issue. thank you wds15 ***@***.***> schrieb am Fr., 26. Mai 2023, 14:01:

…

FYI... the expose_functions(model, vectorize=TRUE) does not work any longer with these structured covariates...new issue? — Reply to this email directly, view it on GitHub <#1488 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADCW2AHV7OXY26TIGAJCLI3XICLRFANCNFSM6AAAAAAXNREJTE> . You are receiving this because you modified the open/close state.Message ID: ***@***.***>

paul-buerkner added the feature label Apr 27, 2023

paul-buerkner added this to the 2.20.0 milestone Apr 27, 2023

paul-buerkner added a commit that referenced this issue May 17, 2023

feature issue #1488

bfff51d

paul-buerkner closed this as completed May 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Structured covariates for non-linear functions #1488

Structured covariates for non-linear functions #1488

wds15 commented Apr 27, 2023

paul-buerkner commented May 17, 2023

wds15 commented May 17, 2023

wds15 commented May 26, 2023

paul-buerkner commented May 26, 2023 via email

Structured covariates for non-linear functions #1488

Structured covariates for non-linear functions #1488

Comments

wds15 commented Apr 27, 2023

paul-buerkner commented May 17, 2023

wds15 commented May 17, 2023

wds15 commented May 26, 2023

paul-buerkner commented May 26, 2023 via email