Add marker gene lists for tumor cell states #971
Overall this looks pretty much in place, but I have a couple questions -
- While looking (a bit) over the papers, I wondered whether we want to consider CD44 here, which is mentioned in Goodspeed and seems to be otherwise recognized as an EWS-FLI1 low marker?
- How do you envision the `gene_signatures` tsvs being used in the future, and is there any information that indicates high/low for these genes? My sense here is the marker genes file is the main file of interest to explore cell states, but it might be supplemented with the other signature files? This comment might be me asking for a smidge more docs, but that also depends on your answer here :)
- I'll also note that I did confirm their gene symbol id <-> ensembl mappings are correct as part of review
For fun (for science?!) I may have also asked ChatGPT what it thinks about marker genes here (with, happily, some overlap with what you found so at least ChatGPT has some degree of accuracy here...) - https://chatgpt.com/share/677bdf60-1a1c-8003-9708-a7e91873976a
Co-authored-by: Stephanie Spielman <stephanie.spielman@gmail.com>
I actually was thinking I should do this, so thank you! Although some of the genes overlap, their descriptions are somewhat off. For example, PRKCB is a direct target of EWS-FLI1 and is upregulated by the fusion, not repressed!
Don't trust the robots 🤖
Nobody is even a little shocked 🥴
I added this to the list! I think because they show transcriptional heterogeneity at the single-cell level in Goodspeed I'm good with adding it. I was trying to be careful about making the list too long since there are a lot of EWS-FLI1 targets and multiple lists that exist of those targets. I wanted to use markers that had been validated experimentally to be heterogeneous in samples and can actually serve as markers of the EWS-FLI1 low/high phenotype within a single sample.
I added a line to the documentation for each of these files with the expectation for expression in EWS-FLI1 high/low cells. My initial thought is to use the main marker gene list to try and define cell states and then use these published gene signatures to help validate the assignments. I also wonder if these lists will be more helpful than my custom list in assigning cell states. I basically wanted to include anything that could be helpful in making these assignments now, even if they don't get fully used. But you are correct that the marker genes file is the main file of interest. This should be ready for another look!
Changes look good to me, thanks!
Canceling the workflow run here since no changes were made to the code.
Purpose/implementation Section
Please link to the GitHub issue that this pull request addresses.
Closes #939
What is the goal of this pull request?
Here I am adding a marker gene list that can be used to define different tumor cell states that we expect to find in the Ewing sarcoma samples. Right now the goal is to label cell states that have been shown to be consistently present across Ewing sarcoma studies; we are not looking to find novel cell states. Generally it's been shown that EWS cells lie on a continuum of EWS-FLI1 expression, and publications have categorized cells into EWS-FLI1 high and EWS-FLI1 low cells. There's also a population of cells that has been labeled as "proliferative" across multiple publications (although there is some old argument that EWS-FLI1 high cells are also proliferative). Because of this, I made a table that has a column for the name of the cell state (either proliferative, EWS-low, or EWS-high), and each row is an individual marker gene. I also included any custom gene lists published by others that we may consider using to identify these cell states.
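As a rough sketch of how a table with this layout could be read, the snippet below groups marker genes by cell state. The column names `cell_state` and `gene_symbol` are assumptions for illustration and may not match the actual headers in the file; the inline rows are a small excerpt based on genes discussed in this pull request, not the full table.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of the marker table. The column names are assumptions;
# the state assignments follow the genes discussed in this pull request.
MARKERS_TSV = (
    "cell_state\tgene_symbol\n"
    "proliferative\tMKI67\n"
    "proliferative\tPCNA\n"
    "EWS-low\tNT5E\n"
    "EWS-low\tCD44\n"
)


def markers_by_state(tsv_text):
    """Group marker gene symbols by cell state (one table row per gene)."""
    grouped = defaultdict(list)
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        grouped[row["cell_state"]].append(row["gene_symbol"])
    return dict(grouped)


if __name__ == "__main__":
    for state, genes in markers_by_state(MARKERS_TSV).items():
        print(state, genes)
```

Keeping one gene per row means a new marker can be added without touching the other rows, and the per-state gene sets can be rebuilt with a simple group-by like the one above.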
Briefly describe the general approach you took to achieve this goal.
There is no clear consensus on exactly which set of genes describes EWS-FLI1 high or low cells, but there are some key marker genes that were identified and validated in some of the publications. These are the ones that I chose to include in the main marker gene table, `tumor-cell-state-markers.tsv`.
- Wrenn et al identified `NT5E` as a marker for EWS-FLI1 low cells along with a group of ECM and EMT related genes. I chose to include the subset of genes that they validated as having high expression in EWS-FLI1 low cells in this table. See Figure S2E.
- Goodspeed et al used `MKI67` and `PCNA` to identify the "proliferative" population.
- `MKI67` was also used to differentiate proliferating cells from mesenchymal like cells (EWS-FLI1 low cells) in spatial profiling performed by Wrenn et al.
I also added two custom marker gene lists that I think we may want to use:
- `aynaud-ews-targets.tsv` represents the targets for EWS-FLI1 identified in Aynaud et al Fig. 4. These genes were shown to be on a continuum with EWS-FLI1 expression at the single-cell level. Note that there were a few genes in this list that did not return any mappings for Ensembl IDs, so I left them as NA.
- `wrenn-nt5e-genes.tsv` represents the intersection between the top genes correlated with NT5E expression in patient tumors and the top genes that were markers of NT5E+ Ewing sarcoma cells. These genes were listed in Fig 5D and 5E.
The last thing I did here was update the README in the `references` folder to document all of these gene lists. I also included a section where I linked to potentially useful marker gene lists in MsigDB that were mentioned in the various publications I saw. I think these may be helpful in identifying EWS-FLI1 high/low cells or at least helping to validate our assignments, so I added them for future reference just in case.
If known, do you anticipate filing additional pull requests to complete this analysis module?
The next thing I plan on doing is creating an exploratory notebook where I look at 2-3 samples and see if I can identify EWS-FLI1 high/low clusters of cells. To do this I am going to start with the genes in `tumor-cell-state-markers.tsv`.
Author checklists
Analysis module and review
- `README.md` has been updated to reflect code changes in this pull request.
Reproducibility checklist
- `Dockerfile`.
- `environment.yml` file.
- `renv.lock` file.