These codes are the online material for the article:
Proportional odds assumption for modeling longitudinal ordinal multiple toxicity outcomes in dose finding studies of targeted agents: a pooled analysis of 53 studies.
By Damien Drubay, Laurence Collette, and Xavier Paoletti (Submitted to Biometrical journal)
Data generated by phase I trials is richer than the classical binary DLT measured at the first cycle used as primary endpoints. Several works developed designs based on more informative endpoints, e.g. ordinal toxicity grades and/or longitudinal data.
The codes proposed here allow to fit multivariate (several types of ordinal tocicities) continuation ratio logit model (CRLM) (Agresti 2007) with multivariate gaussian random effect to take into account intra-patient correlation. We also evaluated the proportional odds model, which is a special case of the CRLM, and compared the 2 models using WAIC (Watanabe 2010). LKJ prior (Lewandowski 2009) was used for random effect correlation matrix and horseshoe (Carvalho 2009) for the fixed effect parameters.
The R setup is available on the CRAN website (https://cran.r-project.org/).
The following extra-packages are required:
rstan
parallel
ordinal
knitr
xtable
venn
gridExtra
ggplot2
Stan (https://mc-stan.org/) (Carpenter 2017) is a suite of statistical procedure for statistical modeling. We used it for bayesian inference, especially to perform Hamiltonian Monte Carlo sampling.
Installing rstan
R package should be sufficient. If not, you may refer to the Stan website (https://mc-stan.org/) for the installation according to your operating system.
The data should have at least 8 columns: Patient ID, patient dose, cycle of the observation and the 5 columns of ordinal grade corresponding to the 5 types of toxicity.
Ex :
patId | dose | cycle | cutaneous | digestive | genDisorder | hemato | other |
---|---|---|---|---|---|---|---|
1 | 0.75 | 1 | 0 | 0 | 2 | 0 | 1 |
1 | 0.75 | 2 | 2 | 0 | 0 | 1 | 0 |
2 | 0.38 | 1 | 0 | 0 | 1 | 0 | 2 |
2 | 0.38 | 2 | 0 | 0 | 0 | 0 | 0 |
2 | 0.38 | 3 | 0 | 0 | 1 | 3 | 0 |
3 | 0.53 | 1 | 3 | 1 | 3 | 0 | 0 |
... |
Before to run the code, you have to indicate the location of your data and customize the columns labels. It can be realized inside the first part of the file initialization.R
(the code was pre-filled according to the previous example).
You can run directly the bash code run.sh
or sequentially in your console:
initialization.R
POmodel.R
CRModel.R
The results.Rmd
is a Rmarkdown code allowing to produce table and figures similar to those of the article.
PO model is a special case of CRLM. However several intermediate models may be of interest. Stan codes for custom models can be easily writen using the function writeStanCRModel
(in the src/writeStanCRModel.R
file).
The following code allows to write the stan code for an univariate mixed effect CRLM:
code<-writeStanCRModel(Y="Y",nY=1,nlvlY=4,X="X",T="T",lvlSlpX=list(1:3),lvlSlpT=list(1:3),sub=TRUE)
#write file in the stanCodes folder with the name subModel.stan (you can change, e.g. univ.stan)
outFile<-file("stanCodes/subModel.stan")
writeLines(code$stanFile,outFile,sep="")
close(outFile)
The following arguments are required to customize your model:
Y
: The name of the response variable in the data provided to Stan (should be a nObs vector of one type of ordinal toxicity for univariate, a matrix nObs x nTox for multivariate)nY
: The number of type of toxicity (1 for univariate)nlvlY
: The number of toxicity grade (maximum grade + 1 for no grade (grade 0))X
: Names corresponding to the column of the doses in the data provided to StanT
: Names corresponding to the column of the the cycles in the data provided to StanlvlSlpX
: List of toxicity grades for which a dose parameter is associated (if no parameter, indicate "0")lvlSlpT
: List of toxicity grades for which a cycle parameter is associatedsub
: Univariate model was a "submodel" relatively to our final multivariate model.sub
argument should passed toTRUE
(default =FALSE
)FEpriors
: Choice of the fixed effect prior: horseshoe (HS
, default) (Carvalho 2009) or a weakly informative Cauchy prior with location 0 and scale 2.5 (Cauchy
) (Gelman 2008)
For the example of an univariate model with a cycle parameter associated to the grades 1 and 3 and without dose effect:
code<-writeStanCRModel(Y="Y",nY=1,nlvlY=4,X="X",T="T",lvlSlpX=list(0),lvlSlpT=list(c(1,3)),sub=TRUE)
The same arguments hold for the multivariate model, but lvlSlpX
and lvlSlpT
should be list of nY
values/vectors, example:
code<-writeStanCRModel(Y="Y",nY=5,nlvlY=4,X="X",T="T",lvlSlpX=list(c(1:3),3,0,c(2:3),c(1:3)),lvlSlpT=list(0,c(1,2),1,c(1,3),c(1:3)))
Note that sub
argument is no longer required (because default = FALSE
).
Agresti A, and Wiley InterScience (Online service). An introduction to categorical data analysis. Wiley-Interscience, 2007.
Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, Brubaker M, Guo J, Li P, and Riddell A. Stan : A Probabilistic Programming Language. Journal of Statistical Software, 76(1):1–32, jan 2017.
Carvalho CM, Polson NG, and Scott JG. Handling Sparsity via the Horseshoe. In David van Dyk and Max Welling, editors, Proceedings of the Twelth International Conference on Artificial Intelligence and Statistics, volume 5 of Proceedings of Machine Learning Research, pages 73–80, Hilton Clearwater Beach Resort, Clearwater Beach, Florida USA, 2009. PMLR.
Gelman A, Jakulin A, Grazia Pittau M, and Su YS. A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4):1360–1383, 2008.
Lewandowski D, Kurowicka D, and Joe H. Generating random correlation matrices based on vines and extended onion method. Journal of Multivariate Analysis, 100(9):1989–2001, 2009.
Watanabe S. Asymptotic Equivalence of Bayes Cross Validation and Widely Applicable Information Criterion in Singular Learning Theory. Journal of Machine Learning Research, 11(Dec):3571–3594, 2010.