This repo accompanies our preprint Bayesian Gaussian Process Latent Variable Models for pseudotime inference in single-cell RNA-seq data. To reproduce the figures in the paper:
- The Jupyter notebooks `gpseudotime_all.ipynb` and `monocle_analysis_inversegamma.ipynb` in `julia_notebooks` will reproduce the synthetic and 'monocle' workflows respectively
- The R scripts `synthetic_plots.R` and `monocle_plots.R` in `plotting` will then reproduce the figures

Note that to construct the monocle representation you first need to run `R_notebooks/vignette.Rmd` to get the Laplacian Eigenmaps representation. All of this relies heavily on HDF5 (through the rhdf5 and HDF5.jl libraries).
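
Because data moves between R and Julia through HDF5 files, a small round trip on the Julia side may help when checking what a file contains. The following is a minimal sketch using HDF5.jl; the file name and the dataset name `X` are made up for this example and are not the names used in this repo.

```julia
using HDF5  # HDF5.jl; install with `import Pkg; Pkg.add("HDF5")`

# Hypothetical file and dataset names, chosen only for this example.
path = joinpath(mktempdir(), "embedding_example.h5")
X = [0.1 0.2; 0.3 0.4; 0.5 0.6]   # 3 cells x 2 embedding dimensions

# Write a matrix to a named dataset, then read it back.
h5write(path, "X", X)
X_back = h5read(path, "X")

# List the datasets in a file before reading from it.
dataset_names = h5open(path, "r") do file
    collect(keys(file))
end

println(dataset_names)
println(size(X_back))
```

On the R side, rhdf5 provides `h5write` and `h5read` functions that play the same role.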
The main MH algorithm is in `bgplvm.jl`. Briefly, it is invoked via

    B_GPLVM_MH(X, n_iter, burn, thin,
               t, tvar, lambda, lvar, sigma, svar,
               r = 1, return_burn = false, cell_swap_probability = 0,
               gamma = 1.0)

where
- `X` - cell-by-feature matrix
- `t`, `lambda`, `sigma` - initial values for the Markov chain (note all are vectors)
- `tvar`, `lvar`, `svar` - variances for the proposal distributions of `t`, `lambda` and `sigma` respectively
- `r` - repulsion parameter for Corp prior
- `return_burn` - should the burn period of the traces be returned?
- `cell_swap_probability` - leave as 0
- `gamma` - rate parameter for exponential prior on `lambda`

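
As an illustration of how the arguments fit together, here is a hedged sketch that builds toy inputs and passes them to `B_GPLVM_MH`. Several things are assumptions rather than documented behaviour: that `n_iter`, `burn` and `thin` are the total number of iterations, the burn-in length and the thinning interval; that `t` has one entry per cell while `lambda` and `sigma` have one entry per feature; and that the proposal variances are scalars. The numeric values are arbitrary, and the structure of the returned object is not assumed.

```julia
using Random

Random.seed!(1)

# Toy stand-in for a real embedding: 50 cells, 2 features (sizes are arbitrary).
n_cells, n_features = 50, 2
X = randn(n_cells, n_features)

# Sampler settings. n_iter, burn and thin are not described above; they are
# assumed here to be the total iterations, burn-in length and thinning interval.
n_iter, burn, thin = 10_000, 5_000, 10

# Initial values (all vectors). Lengths are assumptions: one pseudotime per
# cell, one lambda and one sigma per feature.
t      = rand(n_cells)
lambda = ones(n_features)
sigma  = ones(n_features)

# Proposal variances; treating them as scalars is an assumption, and the
# values are arbitrary rather than tuned.
tvar, lvar, svar = 0.01, 0.01, 0.01

# Only attempt the call when bgplvm.jl is present in the working directory.
have_sampler = isfile("bgplvm.jl")
have_sampler && include("bgplvm.jl")

if have_sampler
    # Optional arguments are passed positionally, following the signature above:
    # r, return_burn, cell_swap_probability, gamma.
    result = B_GPLVM_MH(X, n_iter, burn, thin,
                        t, tvar, lambda, lvar, sigma, svar,
                        1, false, 0, 1.0)
end
```

Run from the repository root, this would include `bgplvm.jl` and call the sampler; elsewhere it only constructs the inputs. Whether `bgplvm.jl` loads unchanged depends on the Julia version in use.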
Repository contents:
- `bgplvm.jl` contains the MH algorithms written in Julia for inference
- `R_notebooks` has R markdown notebooks to convert the embeddings to HDF5
- `julia_notebooks` contains Jupyter notebooks for all the analysis (synthetic + monocle)
- `plotting` contains R scripts to create the plots for the paper
- `data` contains MCMC trace data in the form of HDF5 & CSV
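
For readers who want to summarise traces such as those in `data`, the sketch below shows one common approach: drop the burn-in, thin the remainder, and report a posterior mean and central 95% interval per cell. It runs on a synthetic trace so that it is self-contained. The layout (rows are iterations, columns are cells) and the values of `burn` and `thin` are assumptions for this example, not the format of the stored files.

```julia
using Random, Statistics

Random.seed!(42)

# Synthetic stand-in for a stored trace: rows are MCMC iterations, columns are
# cells. This layout is an assumption, not the format of the files in data.
n_samples, n_cells = 2_000, 5
true_t = collect(range(0.1, 0.9, length = n_cells))
trace = true_t' .+ 0.05 .* randn(n_samples, n_cells)

# Discard the burn-in rows (hypothetical length), then keep every `thin`-th row.
burn, thin = 500, 5
kept = trace[(burn + 1):thin:end, :]

# Posterior mean and central 95% interval of pseudotime for each cell.
post_mean = vec(mean(kept, dims = 1))
lower = [quantile(kept[:, j], 0.025) for j in 1:n_cells]
upper = [quantile(kept[:, j], 0.975) for j in 1:n_cells]

for j in 1:n_cells
    println("cell ", j, ": mean = ", round(post_mean[j], digits = 3),
            ", 95% interval = (", round(lower[j], digits = 3), ", ",
            round(upper[j], digits = 3), ")")
end
```

If a stored trace has already been thinned or had its burn-in removed, the corresponding step should be skipped.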