EM Efficiency vs. INLA #27
Hi Steven,
Thank you for bringing this to our attention. The EM version should not
generally take longer than INLA. Dan has been working on the EM
implementation and knows the most about its efficiency under different
conditions, but he is currently out on vacation. Is this something that can
wait a week or so until he gets back? We will definitely address it, as
one of the motivations for the EM implementation is efficiency!
18 hours is a very long time compared with what we generally expect. This
is a plot based on a simulation study from Dan's draft manuscript, and you
can see that with 10000 vertices and 8 tasks, we start to see the EM
computation time suffer, but nothing like 18 hours...
[image: image.png]
Could this subject be an anomaly, or are you seeing similar runtimes for
other subjects? There are three factors that might be at play:
the relatively high number of tasks, the relatively high spatial
resolution, and possibly a lack of data (few time points) associated with
each task. There are a few things you can try and other things that we can
implement to improve on this. On your end, you could pool the tasks so you
only have 2 conditions, as you suggested. If you'd rather keep the 8
conditions separate, you could also reduce resolution down to 5000 vertices
– though I don't expect that will help quite as much as reducing the number
of tasks. On our end, we could allow the spatial hyperparameters to be shared across tasks. That should greatly improve computational efficiency, because computation time is strongly affected by the number of hyperparameters in the model. And, perhaps more importantly, convergence can be very slow if there is too little data to estimate the hyperparameters for individual tasks. We might be able to build in a flag for cases like this, where pooling hyperparameters across tasks would be advisable.
Mandy
*Mandy Mejia, PhD*
Assistant Professor
Department of Statistics
Indiana University
https://www.statmindlab.com/
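The pooling suggestion above amounts to merging the event timings of the four 2-back and four 0-back conditions into two combined regressors before building the design matrix. A minimal sketch of that idea follows; the condition names, onsets, and durations are invented for illustration, and this does not use BayesfMRI's actual API.

```python
# Sketch: pool 8 N-back condition regressors into 2 ("2bk" and "0bk")
# by merging their event (onset, duration) lists. All timings are made up.

def pool_conditions(events, groups):
    """Merge the event lists of several conditions into pooled conditions.

    events: dict mapping condition name -> list of (onset, duration) pairs
    groups: dict mapping pooled name -> list of original condition names
    Returns a dict with one onset-sorted event list per pooled condition.
    """
    pooled = {}
    for new_name, members in groups.items():
        merged = []
        for cond in members:
            merged.extend(events[cond])
        pooled[new_name] = sorted(merged)  # sort merged events by onset time
    return pooled

# Toy example with invented onsets (seconds) and block durations:
events = {
    "2bk_body":   [(8.0, 27.5)],  "2bk_faces": [(80.0, 27.5)],
    "2bk_places": [(150.0, 27.5)], "2bk_tools": [(220.0, 27.5)],
    "0bk_body":   [(45.0, 27.5)], "0bk_faces": [(115.0, 27.5)],
    "0bk_places": [(185.0, 27.5)], "0bk_tools": [(255.0, 27.5)],
}
groups = {
    "2bk": ["2bk_body", "2bk_faces", "2bk_places", "2bk_tools"],
    "0bk": ["0bk_body", "0bk_faces", "0bk_places", "0bk_tools"],
}
pooled = pool_conditions(events, groups)
print(len(pooled), len(pooled["2bk"]), len(pooled["0bk"]))  # prints: 2 4 4
```

The model then sees 2 conditions instead of 8, which should reduce both the number of regression coefficients and the number of spatial hyperparameters to estimate.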
Just a quick follow-up – I misspoke about EM not taking longer than INLA.
Under some conditions it clearly does as you can see in the plot I sent.
But we wouldn't expect it to take THAT much longer.
Thanks for the suggestions! Definitely no rush to address this, and I will try pooling across conditions and testing a different subject as well.
Hey @smeisler , just checking in to see if you saw similar runtimes with other subjects. Did you try pooling across conditions? Happy to help now that I'm back!
Hi, I have not tried this yet, but I will update when I do.
Getting back to this, I retried INLA after grouping together the 0- and 2-back conditions and changing variable names (see #28), and the program crashes with the following error message (on both HCP subjects I tested):
Restarting with the EM implementation and will update when that finishes.
The EM implementation was notably quicker: the run times were 58 and 67 minutes for the two subjects.
Hi @smeisler , do you have a reproducible example that I can use to try to figure this out? I am unable to reproduce this error. Glad to hear that the EM is working for you, though!
Yes, this is using HCP1200 data, very similar to the vignette, but using the working memory task and grouping 0- and 2-back conditions together. Just make sure to change the two
Hi,
I know the benchmarking in the BayesfMRI paper used the INLA implementation, but I have been testing the EM version (2.0 latest commit) and have been getting very long run times compared to the published benchmarks.
I am running on a CentOS 7 HPC cluster, using 32 CPUs / 100 GB RAM. A single hemisphere took ~18 hours to process for a single HCP subject resampled to 10000 vertices. Max memory usage was around 28 GB.
My run is pretty much identical to the vignette, with the exception that I am running on the HCP working memory N-back task and thus have 8 conditions of interest (4 conditions each for 2-back and 0-back).
Is this to be expected, and are there any ways I can speed up the run? For example, if I only ran two conditions (pool all the 2-back and 0-back conditions together), would I expect to see speed increases? I would prefer not to resample further, and I imagine I will get diminishing returns on adding CPUs.
The results look great, by the way! Plotted below is a single subject 2-back vs 0-back, plotted against the HCP group level analysis. Thanks again for all of your work in developing this!


Best,
Steven