Building model for cell2location #2

Open
ccruizm opened this issue Jul 7, 2022 · 9 comments
ccruizm commented Jul 7, 2022

Good day!

I am trying out this nice tool to analyze data generated using Visium. I have employed cell2location for cell-type deconvolution and have successfully prepared the data according to the tutorials you shared in this repository. However, I want to continue with the receiver/sender analysis but did not find a tutorial explaining how to create the model in detail.

I assume it has to follow this tutorial (https://github.com/theislab/ncem_tutorials/blob/main/tutorials/model_tutorial_interactions.ipynb), but I am not sure whether the parameters suggested there also apply to Visium, or how to pass the h5ad file (from the data preparation section) to model training. Could you please share how to generate the model?

Thanks in advance!


ccruizm commented Jul 7, 2022

I forgot to mention that I ran the analysis on a group of Visium samples together, and I do not know whether that can be used as input for NCEM or whether it can only take one sample at a time.

@AnnaChristina AnnaChristina self-assigned this Jul 7, 2022

ccruizm commented Jul 8, 2022

I am using the scripts to train the model on SLURM (deconvolution_hvg.sh and deconvolution_baseline_hvg.sh). So far it seems to be working. However, I am not sure I am doing it right. The scripts for the cell2location dataset were designed to run only one sample from the lymph node dataset, whereas I have run cell2location to estimate cell abundances on 10 different samples. I have looked at the code in data.py and it assumes there is only one sample, right? I am using the DataLoaderCell2locationLymphnode to read my dataset, but it does not account for multiple samples.

I prepared the data as suggested in the tutorial, but there is no metadata or additional info identifying each individual sample: all estimated cells have been placed in one single matrix. Will that influence the outcome? How would you recommend I proceed in this case?

It is now running like this:

Loading data from raw files
registering celldata
AnnData object with n_obs × n_vars = 671874 × 2000
    obs: 'index', 'target_cell'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'hvg', 'log1p', 'node_type_names', 'metadata', 'img_to_patient_dict'
    obsm: 'node_types', 'proportions', 'spatial'
collecting image-wise celldata
adding graph-level covariates
Loaded 1 images with complete data from 1 patients over 671874 cells with 2000 cell features and 21 distinct celltypes.
Mean of mean node degree per images across images: 20.010502
Using split method: node. 
 Train-test-validation split is based on total number of nodes per patients over all images.

Excluded 0 cells with the following unannotated cell type: [None] 

Whole dataset: 671874 cells out of 1 images from 1 patients.
Test dataset: 67187 cells out of 1 images from 1 patients.
Training dataset: 547009 cells out of 1 images from 1 patients.
Validation dataset: 60469 cells out of 1 images from 1 patients.

Hope I am being clear about my question. Thanks in advance!
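
On the multi-sample question: the log above reports 1 image from 1 patient even though the data come from 10 samples, which suggests all spots are being treated as one slide. If spot coordinates from different slides overlap, a neighborhood graph built on the pooled coordinates may connect spots that were never physically adjacent. The sketch below is not ncem code. It is a minimal illustration, using the hypothetical function `within_sample_neighbors` and made-up sample labels, of why a per-sample identifier matters when neighbor pairs are built.

```python
from collections import defaultdict
from math import dist


def within_sample_neighbors(spots, radius):
    """Return index pairs (i, j), i < j, at most `radius` apart within one sample.

    `spots` is a list of (sample_id, x, y) tuples. Spots from different
    samples are never paired, even when their coordinates coincide.
    """
    # Group spot indices and positions by their sample label.
    by_sample = defaultdict(list)
    for index, (sample_id, x, y) in enumerate(spots):
        by_sample[sample_id].append((index, (x, y)))

    # Compare spots only against other spots from the same sample.
    pairs = []
    for members in by_sample.values():
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                (i, pos_i), (j, pos_j) = members[a], members[b]
                if dist(pos_i, pos_j) <= radius:
                    pairs.append((i, j))
    return sorted(pairs)


# Hypothetical toy data: two slides whose coordinate systems overlap.
spots = [
    ("sample_A", 0.0, 0.0),
    ("sample_A", 1.0, 0.0),
    ("sample_B", 0.0, 0.0),  # same coordinates as spot 0, but a different slide
    ("sample_B", 5.0, 5.0),
]
neighbor_pairs = within_sample_neighbors(spots, radius=1.5)
print(neighbor_pairs)
```

Only the two sample_A spots are paired here; the sample_B spot that shares coordinates with the first spot is left out. How ncem itself expects a sample label to be stored is not stated in this thread.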

AnnaChristina (Member) commented

Hi @ccruizm,

Thanks for raising an issue and for your interest in using ncem. I will push a tutorial on how to use ncem for deconvoluted Visium data at the beginning of next week to showcase the usage.

One additional question: are you planning to identify potential type couplings in your data, or do you want to inspect convergence of the model?

Running the model itself is only required once, to check whether ncem can infer communication. Afterwards you can simply use the trained model to obtain insights from your data.


ccruizm commented Jul 8, 2022

Thanks for your reply! I want to use it mainly to check type couplings. I am particularly seeing specific cell associations based on the estimated cell-type abundance and would like to check how the transcriptome changes in the different cellular compartments we have identified.

I am not sure which of the models is most suitable for my data. I am now running the linear one, but I do not know whether the others could fit better, particularly for Visium data.

Looking forward to the upcoming tutorial!

vkedlian commented

Hi @AnnaChristina !
Just wanted to add that I am also looking into running NCEM on my Visium samples. I am looking forward to your new tutorial and would really appreciate any advice! :) I also have a question on how to adapt the grid search script to an LSF cluster, in case you or anyone from your team has experience with that.

Many thanks and with best wishes,
Veronika
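
On adapting a SLURM script to LSF: the contents of the ncem grid search script are not shown in this thread, so the following is only a rough sketch of how common `#SBATCH` header lines could be rewritten as `#BSUB` lines. The helper names `translate_line` and `slurm_time_to_lsf` and the example directives are hypothetical. Only short-form options in the form `#SBATCH -X value` and time limits written as `HH:MM:SS` are handled.

```python
import re

# Partial mapping of short SLURM options to LSF options; extend as needed.
SLURM_TO_LSF = {
    "-J": "-J",  # job name
    "-o": "-o",  # stdout file
    "-e": "-e",  # stderr file
    "-p": "-q",  # SLURM partition -> LSF queue
    "-c": "-n",  # CPUs per task -> number of job slots
}


def slurm_time_to_lsf(value):
    """Convert a SLURM HH:MM:SS time limit to LSF's HH:MM (seconds are dropped)."""
    hours, minutes, _seconds = value.split(":")
    return f"{int(hours)}:{minutes}"


def translate_line(line):
    """Translate one short-form '#SBATCH -X value' line; return other lines unchanged."""
    match = re.match(r"#SBATCH\s+(-\w)\s+(\S+)\s*$", line)
    if match is None:
        # Not a short-form directive (e.g. a command or '#SBATCH --mem=32G').
        return line
    option, value = match.groups()
    if option == "-t":
        return f"#BSUB -W {slurm_time_to_lsf(value)}"
    if option in SLURM_TO_LSF:
        return f"#BSUB {SLURM_TO_LSF[option]} {value}"
    # Flag anything without a known equivalent instead of guessing.
    return f"# TODO no LSF mapping for: {line}"


# Hypothetical header lines, not taken from the actual ncem scripts.
example = [
    "#SBATCH -J grid_search",
    "#SBATCH -p gpu_p",
    "#SBATCH -t 24:00:00",
    "python train_script.py",
]
for line in example:
    print(translate_line(line))
```

On LSF the translated script is submitted with `bsub < job.sh` rather than `sbatch job.sh`. Memory and GPU requests are left out on purpose, because their LSF syntax and units vary between clusters.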


ccruizm commented Jul 25, 2022

Hello @AnnaChristina!

Is there any news on where we can find the tutorial on how to use ncem for deconvoluted Visium data, please?

Thanks!

AnnaChristina (Member) commented

Hi @ccruizm,

Sorry for the delay. I added a tutorial for using ncem on deconvoluted Visium data here: https://github.com/theislab/ncem_tutorials/blob/main/tutorials/type_coupling_visium.ipynb

We added this to a new release, and it is now available in ncem==0.1.4.

Please let me know if you have additional questions.


ccruizm commented Aug 9, 2022

No problem! Thank you very much for this.

josephineyates commented

Dear @AnnaChristina ,

First of all, thank you so much for a great tool!
I am reopening this thread because I have successfully used your tutorial on deconvolved Visium data, but I wanted to come back to the comment you made earlier distinguishing the type coupling analysis from convergence.

I first prepared the data as described in this tutorial (https://github.com/theislab/ncem_benchmarks/blob/main/notebooks/data_preparation/deconvolution/cell2location_human_lymphnode.ipynb), then used this tutorial to find type coupling (https://github.com/theislab/ncem_tutorials/blob/main/tutorials/type_coupling_visium.ipynb). In the latter, the type coupling step is actually run before training a model. I am not sure I understand how these functions work. Am I supposed to have trained the model somewhere on my data in between these two tutorials? Or can I just obtain the type coupling from the InterpreterDeconvolution, which I understand fits a standard OLS?

Sorry if this has been explained elsewhere, but I couldn't find the information.

Thank you!
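
For context on the OLS point: this thread does not confirm how InterpreterDeconvolution is implemented. If it does fit a standard ordinary least squares model, as the comment above suggests, then the coefficients come from a closed-form calculation on the data rather than from an iterative training loop. The sketch below illustrates that idea with a single regressor. The function `ols_fit`, the variable names, and the toy numbers are all hypothetical and are not taken from ncem.

```python
from statistics import mean


def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x (one regressor)."""
    x_bar, y_bar = mean(x), mean(y)
    # Closed-form estimates: slope = cov(x, y) / var(x).
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: abundance of a sender cell type per spot, and the
# expression of one gene attributed to a receiver cell type in the same spot.
sender_abundance = [0.0, 0.5, 1.0, 1.5, 2.0]
receiver_expression = [1.0 + 2.0 * a for a in sender_abundance]

intercept, slope = ols_fit(sender_abundance, receiver_expression)
print(intercept, slope)
```

In this toy setting the slope plays the role of a coupling effect: how much the receiver's expression changes per unit of sender abundance. No epochs, optimizer, or train/validation split are involved in obtaining it.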
