Building model for cell2location #2

ccruizm · 2022-07-07T09:57:55Z

Good day!

I am trying out this nice tool to analyze data generated using Visium. I have employed cell2location for cell-type deconvolution and have successfully prepared the data according to the tutorials you shared in this repository. However, I want to continue with the receiver/sender analysis but did not find a tutorial explaining how to create the model in detail.

I assume it has to follow this tutorial (https://github.com/theislab/ncem_tutorials/blob/main/tutorials/model_tutorial_interactions.ipynb), but I am not sure whether the parameters suggested there are applicable for Visium as well and not sure how to input the h5ad file (from the data preparation section) for the model training. Could you please share how to generate the model?

Thanks in advance!

ccruizm · 2022-07-07T11:53:11Z

Forgot to mention that I did the analysis on a group of Visium samples together and do not know if that can be used as input for NCEM or can only take one sample at a time.

ccruizm · 2022-07-08T12:32:07Z

I am using the scripts to train the model in SLURM (deconvolution_hvg.sh and deconvolution_baseline_hvg.sh). So far it seems to be working. However, I am not sure I am doing it right. The scripts for the cell2location dataset were designed to run only one sample from the lymph node dataset, whereas I have run cell2location to estimate cell abundances on 10 different samples. I have looked at the code in data.py and it assumes there is only one sample, right? I am using the DataLoaderCell2locationLymphnode to read my dataset, but it does not account for multiple samples.

I prepared the data as suggested in the tutorial, but there is no metadata or additional info identifying each individual sample, so all estimated cells have now been placed in a single matrix. Will that influence the outcome? How would you recommend I proceed in this case?

It is currently running like this:

Loading data from raw files
registering celldata
AnnData object with n_obs × n_vars = 671874 × 2000
    obs: 'index', 'target_cell'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p', 'node_type_names', 'metadata', 'img_to_patient_dict'
    obsm: 'node_types', 'proportions', 'spatial'
collecting image-wise celldata
adding graph-level covariates
Loaded 1 images with complete data from 1 patients over 671874 cells with 2000 cell features and 21 distinct celltypes.
Mean of mean node degree per images across images: 20.010502
Using split method: node. 
 Train-test-validation split is based on total number of nodes per patients over all images.

Excluded 0 cells with the following unannotated cell type: [None] 

Whole dataset: 671874 cells out of 1 images from 1 patients.
Test dataset: 67187 cells out of 1 images from 1 patients.
Training dataset: 547009 cells out of 1 images from 1 patients.
Validation dataset: 60469 cells out of 1 images from 1 patients.

Hope I am being clear about my question. Thanks in advance!
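
Editorial illustration of the concern above: a minimal sketch, not ncem code, of why pooling several slides into one "image" can matter when a neighborhood graph is built from spatial coordinates. The toy spots, the `sample_id` labels and the `radius_edges` helper are all hypothetical; whether this applies to a given dataset depends on how the coordinates of each sample were stored.

```python
from itertools import combinations
from math import dist

# Hypothetical toy data: (sample_id, x, y). Each slide has its own coordinate
# frame, so spots from different slides can land close together when pooled.
spots = [
    ("slide_A", 0.0, 0.0),
    ("slide_A", 1.0, 0.0),
    ("slide_B", 0.0, 0.5),
    ("slide_B", 1.0, 0.5),
]


def radius_edges(spots, radius, respect_samples):
    """Return index pairs of spots that lie within `radius` of each other.

    If `respect_samples` is True, spots are only paired within one sample.
    """
    edges = []
    for (i, a), (j, b) in combinations(enumerate(spots), 2):
        if respect_samples and a[0] != b[0]:
            continue
        if dist(a[1:], b[1:]) <= radius:
            edges.append((i, j))
    return edges


pooled = radius_edges(spots, radius=1.0, respect_samples=False)
per_sample = radius_edges(spots, radius=1.0, respect_samples=True)
# Edges that only exist because the samples were treated as one image.
cross_sample = [edge for edge in pooled if edge not in per_sample]

print("pooled edges:      ", pooled)
print("per-sample edges:  ", per_sample)
print("cross-sample edges:", cross_sample)
```

In this toy case half of the pooled edges connect spots from different slides, which is why keeping a per-sample identifier alongside the coordinates is worth considering before building the graph.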

AnnaChristina · 2022-07-08T13:47:05Z

Hi @ccruizm,

Thanks for raising an issue and for your interest in using ncem. I will push a tutorial on how to use ncem for deconvoluted Visium data at the beginning of next week to showcase the usage.

One additional question: are you planning to identify potential type couplings in your data, or do you want to inspect convergence of the model?

The model itself only needs to be run once to determine whether ncem can infer communication; afterwards you can simply use the trained model to obtain insights from your data.

ccruizm · 2022-07-08T15:13:23Z

Thanks for your reply! I want to use it mainly to check type couplings. I am particularly seeing specific cell associations based on the estimated cell-type abundance and would like to check how the transcriptome changes in the different cellular compartments we have identified.

I am not sure which of the models is most suitable for my data. I am now running the linear one but do not know whether the others would fit better, particularly for Visium data.

Looking forward to the upcoming tutorial!

vkedlian · 2022-07-12T13:30:57Z

Hi @AnnaChristina !
Just wanted to add that I am also looking into running NCEM on my Visium samples. I am looking forward to your new tutorial and would appreciate any advice! :) I also have a question on how to adapt the grid search script to an LSF cluster, in case you or anyone from your team has experience with that.

Many thanks and with best wishes,
Veronika
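
Editorial illustration for the LSF question above: a rough sketch of how a SLURM batch header could be rewritten for LSF. It assumes the commonly documented option correspondences (`--job-name` to `-J`, `--output` to `-o`, `--error` to `-e`, `--partition` to `-q`, `--ntasks` to `-n`) and the `%j` to `%J` job-ID placeholder. The example header and `train_script.py` are hypothetical and are not taken from the ncem grid search scripts; anything outside the mapping is flagged for manual translation rather than guessed.

```python
import re

# Assumed SLURM -> LSF option correspondences; check them against the
# documentation of your own cluster before relying on them.
SLURM_TO_LSF = {
    "--job-name": "-J",
    "--output": "-o",
    "--error": "-e",
    "--partition": "-q",
    "--ntasks": "-n",
}


def translate_header(lines):
    """Rewrite '#SBATCH' lines as '#BSUB' lines where a mapping is known."""
    translated = []
    for line in lines:
        match = re.match(r"#SBATCH\s+(--[\w-]+)[=\s]+(\S+)", line)
        if not match:
            # Not a SLURM directive: keep the line unchanged.
            translated.append(line)
            continue
        option, value = match.groups()
        if option in SLURM_TO_LSF:
            # SLURM writes the job ID as %j in file names, LSF as %J.
            value = value.replace("%j", "%J")
            translated.append(f"#BSUB {SLURM_TO_LSF[option]} {value}")
        else:
            # Unknown option (memory, GPUs, time limits, ...): do not guess.
            translated.append(f"# TODO translate manually: {line}")
    return translated


# Hypothetical SLURM script used only to demonstrate the helper.
example = [
    "#!/bin/bash",
    "#SBATCH --job-name=ncem_grid",
    "#SBATCH --output=logs/%j.out",
    "#SBATCH --partition=gpu_p",
    "#SBATCH --gres=gpu:1",
    "python train_script.py",
]

result = translate_header(example)
print("\n".join(result))
```

Resource requests such as memory, GPUs and wall time are deliberately left untranslated here because their syntax and units vary between LSF installations. Note also that LSF usually reads `#BSUB` directives only when the script is passed on standard input, e.g. `bsub < script.sh`.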

ccruizm · 2022-07-25T07:16:59Z

Hello @AnnaChristina!

Is there any news on where we can find the tutorial on how to use ncem for deconvoluted Visium data?

Thanks!

AnnaChristina · 2022-08-09T07:27:45Z

Hi @ccruizm,

Sorry for the delay. I added a tutorial for using ncem on deconvoluted Visium data here: https://github.com/theislab/ncem_tutorials/blob/main/tutorials/type_coupling_visium.ipynb

We added this to a new release and it is now available in ncem==0.1.4

Please let me know if you have additional questions.

ccruizm · 2022-08-09T08:28:55Z

No problem! Thank you very much for this.

josephineyates · 2024-05-15T12:31:51Z

Dear @AnnaChristina ,

First of all, thank you so much for a great tool!
I am reopening this thread because I have successfully used your tutorial on deconvolved Visium data but wanted to follow up on the comment you made earlier, which distinguished the type coupling analysis from model convergence.

I first prepared the data as described in this tutorial (https://github.com/theislab/ncem_benchmarks/blob/main/notebooks/data_preparation/deconvolution/cell2location_human_lymphnode.ipynb), then used this tutorial to find type coupling (https://github.com/theislab/ncem_tutorials/blob/main/tutorials/type_coupling_visium.ipynb). In the latter, the type coupling step is actually run before training a model. I am not sure I understand how these functions work. Am I supposed to have trained the model somewhere on my data in between these two tutorials? Or can I just obtain the type coupling from the InterpreterDeconvolution, which I understand fits a standard OLS?

Sorry if this has been explained elsewhere, but I couldn't find the information.

Thank you!
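
Editorial illustration of the question above: a minimal sketch, assuming (as the question suggests, and not confirmed in this thread) that type coupling coefficients come from an ordinary least squares fit. It is not ncem's implementation; the toy data, variable names and design matrix are hypothetical. The point is only that OLS coefficients are obtained in one closed-form solve, with no separate iterative training step.

```python
import numpy as np

rng = np.random.default_rng(0)
n_spots = 200

# Hypothetical toy data: each observation has a target cell type (0 or 1) and
# a flag saying whether a sender type "S" is present in its neighborhood.
target = rng.integers(0, 2, size=n_spots)
neighbor_s = rng.integers(0, 2, size=n_spots)

# Design matrix: one baseline column per target type plus one interaction
# column (target type 1 with sender S in the neighborhood).
X = np.column_stack(
    [
        target == 0,
        target == 1,
        (target == 1) & (neighbor_s == 1),
    ]
).astype(float)

# Simulated expression of a single gene with a known interaction effect.
true_beta = np.array([1.0, 2.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.01, size=n_spots)

# Ordinary least squares: a single linear solve, no epochs or optimizer.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

print("estimated coefficients:", np.round(beta_hat, 3))
print("estimated interaction effect of S on type 1:", round(float(beta_hat[2]), 3))
```

In this simplified picture, the third coefficient plays the role of a sender-to-receiver coupling for one gene; repeating the fit across genes and type pairs would give a full set of interaction terms.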

AnnaChristina self-assigned this Jul 7, 2022
