Use merged object for Ewing sarcoma samples to assign cell types #1017

Closed
allyhawkins opened this issue Jan 31, 2025 · 0 comments · Fixed by #1027

If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.

#696

Describe the goals of the changes to the analysis module.

Before we spend a lot of time manually validating annotations in each sample (via #1003), I think it would be useful to explore using the merged object to finalize annotations. Some samples are heterogeneous and contain both tumor cells and a set of normal cells, while most samples are almost exclusively tumor. Working with the homogeneous samples is difficult because nearly all of their cells show high expression of tumor markers. If we look at all cells together, we would expect to see a more obvious distinction between tumor and normal cell types. In particular, I would like to run AUCell on the merged object with the gene sets that we have identified from MSigDB and look at that output alongside both the cell types obtained by running SingleR with the tumor cell reference and the consensus cell types.

One caveat is that we don't want to integrate the samples; we will work only with the merged, uncorrected data. Because no batch correction is applied, we might see some technical differences between samples. We should pay close attention to the normal cells in this case, since I expect those to be more similar to each other across samples than tumor cells are.

What will your pull request contain?

This is going to be broken into a few PRs.

  • The first PR will add running AUCell on the merged object as an additional step in the script that currently runs AUCell on each individual object. The AUCell vignette notes that rankings computed independently can be combined afterwards, but that it is best to make sure the same genes are being considered and to proceed with caution. Because of this, I think we should just re-run AUCell on the merged object rather than reuse existing results.
  • The second PR will be a script that reads in the merged object, all SingleR results, all consensus cell types, and all AUC results. The output will be a TSV file with UMAP embeddings, cell type annotations, and AUC scores for each cell across all datasets. This file can then be used as input to a notebook, rather than having to work with the full merged object.
  • The last PR will be an exploratory notebook that looks at the cell type assignments and AUC values in the merged object. I plan to base this notebook on the guide notebook that I've been developing as part of #993 (Write a template notebook to use for "finalizing" cell type annotations for Ewing samples). The goal is to adjust any assignments based on findings in this notebook and to validate the assignments.
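
To make the shape of the second PR's output more concrete, here is a minimal sketch of the per-cell join, written in Python with only the standard library. It is an illustration under stated assumptions, not the module's actual script: the `cell_id` key (a library ID plus barcode, so IDs stay unique across samples), the column names, and the helper functions are all hypothetical. Cells missing from a table (for example, a cell with no SingleR result) are kept and filled with `NA`.

```python
import csv
import io


def join_cell_tables(base_rows, *other_tables, key="cell_id"):
    """Left-join per-cell tables onto base_rows using a shared cell ID.

    Every table is a list of dicts. Cells missing from a table get "NA"
    for that table's columns. The `cell_id` key is a hypothetical name.
    """
    # Build one lookup per table so each join is a dictionary access.
    lookups = [{row[key]: row for row in table} for table in other_tables]

    # Collect output columns in first-seen order, starting with the base table.
    columns = list(base_rows[0].keys()) if base_rows else [key]
    for table in other_tables:
        for row in table:
            for column in row:
                if column not in columns:
                    columns.append(column)

    joined = []
    for row in base_rows:
        merged = dict(row)
        for lookup in lookups:
            merged.update(lookup.get(row[key], {}))
        joined.append({column: merged.get(column, "NA") for column in columns})
    return columns, joined


def to_tsv(columns, rows):
    """Serialize joined rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy inputs standing in for values pulled from the merged object and the
# per-sample result files. All IDs, labels, and scores here are made up.
umap = [
    {"cell_id": "lib1-AAAC", "UMAP1": "1.2", "UMAP2": "-0.5"},
    {"cell_id": "lib2-TTTG", "UMAP1": "-3.1", "UMAP2": "2.0"},
]
singler = [{"cell_id": "lib1-AAAC", "singler_annotation": "tumor"}]
consensus = [
    {"cell_id": "lib1-AAAC", "consensus_annotation": "Unknown"},
    {"cell_id": "lib2-TTTG", "consensus_annotation": "endothelial cell"},
]
auc = [
    {"cell_id": "lib1-AAAC", "auc_example_geneset": "0.31"},
    {"cell_id": "lib2-TTTG", "auc_example_geneset": "0.02"},
]

columns, rows = join_cell_tables(umap, singler, consensus, auc)
print(to_tsv(columns, rows))
```

Using the UMAP table as the base of a left join means every cell in the merged object appears exactly once in the output, which is what a downstream notebook would need in order to plot annotations and AUC scores together.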

Will you require additional software beyond what is already in the analysis module?

No, we should have everything, but we will need to use the merged object from OpenScPCA-nf.

Will you require different computational resources beyond what the analysis module already uses?

No

If known, when do you expect to file the pull request?

Hoping to have these items done by the middle of the next sprint (2/14).
