Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hematopoiesis velocity directionality #173

Closed
AMA111 opened this issue Mar 20, 2020 · 20 comments
Closed

Hematopoiesis velocity directionality #173

AMA111 opened this issue Mar 20, 2020 · 20 comments
Labels
question Further information is requested

Comments

@AMA111
Copy link

AMA111 commented Mar 20, 2020

Hey Volker,

I have a question regarding an analysis of Bone Marrow stem and progenitors compartment from heathy human samples.

image

The Figure her shows that there is a developmental process directed from the Erythroid prog (green color) to the Hematopietic stem cells (HSCs in Red color). We already know from previous studies of the Hematopietic system that the development is rather the other way around (From the stem to the prog).

Code: Compute velocity and velocity graph

scv.tl.recover_dynamics(sub_data)
scv.tl.velocity(sub_data, mode='dynamical')
scv.tl.velocity_graph(sub_data)
scv.tl.recover_latent_time(sub_data)

I tried also the steady-state model and gave the same conclusions. Also, I I tried the same to other samples from disease states and gave the same result!

Do you have an idea about the reverse directionality here? Possibly, there is dedifferentiation process is going on, but before we go to this conclusion, I would like to know your opinion.

Best,
Abdelrahman Mahmoud
DKFZ

@AMA111 AMA111 added the question Further information is requested label Mar 20, 2020
@VolkerBergen
Copy link
Contributor

Hi Abdelrahman,

great, that you examine this carefully before drawing any conclusions.

Looking into the phase potraits of top likelihood genes would certainly help.

top_genes = adata.var.sort_values('fit_likelihood', ascending=False).index[:10]
scv.pl.scatter(adata, basis=top_genes)

If you send me the output, I can help you with interpretations.

@AMA111
Copy link
Author

AMA111 commented Mar 20, 2020

Hey Volker, Sure!

image

@VolkerBergen
Copy link
Contributor

Looks like the velocities are mostly driven by the Hemoglobin Beta genes.

Can you also color it with pseudotime / latent time coord. to see the diffs. in the erythroid compartment? What do you expect for HBB towards maturation?

@AMA111
Copy link
Author

AMA111 commented Mar 20, 2020

Latent time :

image

Pseudotime :
image

UMAP:
image

yes, in the hematopiotic system, we expect maturation steps from: HSCs >> Erythroid prog >> mature Erythroid cells (mostly express hemoglobin related genes)

@VolkerBergen
Copy link
Contributor

We analysed one of our datasets today, which also includes erythroid cells. Interestingly, with RNA velocity we obtain directional flows in accordance with biol. expectations in all cell types except for erythroids. It turned out in our example that splicing efficiency seems to change dramatically in erythroids, decreasing the fractions of unspliced counts by half. That could be a reason for 'false' signal.

You could get an indication of such with the following code.

counts_s = scv.utils.sum_var(adata.layers['spliced'])
counts_u = scv.utils.sum_var(adata.layers['unspliced'])
fractions_u = counts_u / (counts_s + counts_u)
scv.pl.scatter(adata, color=fractions_u, smooth=True)

Would be very interesting to see whether the fractions are gradually decreasing in your erythroid progenitors.

@AMA111
Copy link
Author

AMA111 commented Mar 23, 2020

Interesting!! I calculated the fractions from my data and here is how it looks like:

Screenshot 2020-03-23 at 18 59 03

Do you have possible explanation for such velocities specially in the Erthroid prog? would be great if you could send me the velocities figures of your dataset? I am interested to see how the overall velocities behaviour in the stem and progenitors other than my data; since I couldn't find any published example from the BM hematopiotic system with velocities analysis.

@VolkerBergen
Copy link
Contributor

Indeed, there is now hundreds of papers making use of RNA velocity, yet (nearly) none of them on hematopoiesis. And I think the reason is because it usually does not comply with the biology. Conspicuously always observed in blood development, while it works just fine in almost all other systems (under some assumptions). Your example perfectly resembles what I also see in our data. The significant change in splicing efficiency eventually yields false signal, as illustrated by the dynamics of Hbb gene: It is up-regulated. However, due to a gradual increase in the splicing rate, the curvature in phase space may be turned such that it indicates down-regulation. That violates the fundamental assumption of any RNA velocity framework potentially making it inapplicable. I'll have some thoughts, on whether it can be adapted to changing splicing efficiency.

@AMA111
Copy link
Author

AMA111 commented Mar 23, 2020

Great! I found that it's known in the HSCs biology that they retain more of the unspliced mRNA and get more spliced form under activation: https://stemcellsjournals.onlinelibrary.wiley.com/doi/full/10.1634/stemcells.2005-0552.
Possibly, the Stem cells has undefined state of uncommitted cells for certain state/fate (reflected by high unspliced fraction), then further through the development state, start to have more spliced forms which would have a critical role for fate commitment. I would be happy to discuss this further and please let me know by any further updates regarding this issue. Thanks Volker!

Best wishes,
Abdelrahman

@ftheis
Copy link

ftheis commented Mar 24, 2020

This is really interesting - so far my assumption always was that hematopoietic kinetics in particular in HSCs and further up are so much slower than the usual kinetics (more weeks than days) that we do not find much. Hence, as Volker said, no real RNA velocity appl in blood. Do you guys see this effect only in erythroids (where transcription could go down etc as well) or also in other downstream cell types?

@AMA111
Copy link
Author

AMA111 commented Mar 25, 2020

I think specially in the erythroid prog, the system needs to be very fast to fit the Erythroid cells (RBCs) production rate (~2.4 Million cell/sec) which is very high demand per day. It make sense that the system will increase the splicing rate dramatically at this stage to commit to certain fate > RBCs. Also, I looked at the cell cycle and it seems that the they are high cycling cells (S and G2M) which interestingly fits this paper: https://www.ncbi.nlm.nih.gov/pubmed/30463007 .
From my point of view, the velocities accurately detected that there is a development process (False direction as we think) going on from Eryth prog to MK prog. The issue here is the "direction", but from an optimistic point of view it capture the developmental process correctly :))

Screenshot 2020-03-25 at 12 24 42

Can you please explain for me from a method perspective how it capture the directionality? I see that HBB is possibly in a repression phase but this doesn't help me to understand the directionality here?

@VolkerBergen
Copy link
Contributor

SCB_scVelo
The directionality is already depicted in gene-wise dynamics by assignment of positive or negative velocities (induction/repression). The projected velocity vector summarizes all gene-wise velocities, whereas in your example we can presume that it is mainly driven by HBB. HBB is picked up as repression (even though it may be induction, thus potentially false signal), which implies that cells should move from the highly expressed to the lowly expressed, i.e. from yellow to blue (colored by velocity pseudotime).

Good that you mention the cycle effects - same in our data (for Erythroid prog.)! That could also be a cause for false signal (if the cycle cannot be resolved by embedding coord. due to compositional effects of cycle and dev. progression) and should be investigated. Perhaps, you could try regressing out cycle effects.

Can you also show the phase potraits of HBB on raw data with scv.pl.scatter(adata, basis='HBB', use_raw=True)?

@AMA111
Copy link
Author

AMA111 commented Mar 31, 2020

I don't recommend regressing out cell cycle, since it's reflecting important underlaying process of the hematopoiesis differentiation. Have you tried to regress out the cell cycle in your data? and do you see change in directionality?

sure!
Screenshot 2020-03-31 at 14 39 07

@ogilliam
Copy link

Hi Volker,

I am facing the same issues as @AMA111. We are using a mouse model of the intestinal crypt in order to investigate cell differentiation.

image

From our biological understanding the arrows in the paneth cells should be pointing the other way around. Differentiation starts in the stem cells and terminates in the different celltypes (Enterocytes, Gobblet cells and Paneth cells). There is some knowledge about dedifferentiation of Paneth cells, but we didn’t expect to observe the majority of arrows pointing in the inverse direction of differentiation.

Top 10 putative driver genes
image

Fraction of unspliced RNA
image

This left me with the assumption, that a change in splicing efficiency could be the reason for my observation. Does this makes sense or am I missing something here?
And if this is the reason, would you have any suggestion on how to solve this issue?

Also, we tested for the number of cells and velocity genes, which looked fine to me:
print(adata.var['velocity_genes'].sum(), adata.n_vars, adata.n_obs)
401 3000 5412

Thanks a lot for your help!

Best wishes,
Oliver

@VolkerBergen
Copy link
Contributor

Your genes are undoubtedly supporting directionality from stem to enterocytes.
However, for paneth cells you got some interesting effects. In Gm0104, the red-highlighted part imho does not really fit to the remaining kinetic, thus induces some signal for down-regulation towards stem, which seems to be artificial imho given that the change in unspliced counts is too sudden/rapid..

image

@VolkerBergen
Copy link
Contributor

You seem to get this consistently for 'Gm*' genes. Any clue, whether there's anything specific to those genes?

@ogilliam
Copy link

Thank you for the quick response!

Indeed, all the Gm* genes show similar results.
In intestine, these genes are exclusively expressed in paneth cells.

image

It seems, that with further differentiation of paneth cells, the expression of GM10104 is also increasing. I am just a bit confused why the scatter plot is indicating the opposite?
Could the splicing efficiency have any effect in this?

(Gm10104 (DEFA25) belongs to the family of alpha defensins.
The respective proteins play an important role in the regulation of the microbial flora and the innate immune response against pathogens)

@ogilliam
Copy link

ogilliam commented Jul 1, 2020

Hi Volker,

I reanalyzed the dataset with the updated version of scVelo (0.2.1) and could observe a very low fraction of unspliced counts over the whole dataset, especially low for Paneth cells (Cluster 7, 8, 10, 17).

image

image

image

According to the scVelo website, between 25 - 10% unspliced counts is expected. Are the 10% a lower threshold or is it possible to analyze datasets with less than 10% unspliced RNA, like in my case?

Also, I was wondering what the reason for the low fraction of unspliced RNA might be. In this context I found this preprint:

image
https://www.biorxiv.org/content/10.1101/2020.03.13.990069v1.full

According to the paper, different preprocessing methods create different overall results. Is this something you observed as well?
For my analysis, I used velocyto.py with the run10X command.

Thanks a lot!

@WeilerP
Copy link
Member

WeilerP commented May 25, 2021

@ogilliam

According to the scVelo website, between 25 - 10% unspliced counts is expected. Are the 10% a lower threshold or is it possible to analyze datasets with less than 10% unspliced RNA, like in my case?

It seems rather low. This should result in a lot of genes being filtered out (based on shared counts). You would have to look at your phase portraits to see if enough information is given to infer dynamics.

Also, I was wondering what the reason for the low fraction of unspliced RNA might be. In this context I found this preprint:

image
https://www.biorxiv.org/content/10.1101/2020.03.13.990069v1.full

According to the paper, different preprocessing methods create different overall results. Is this something you observed as well?

I did not read the paper but yes, preprocessing does matter!

@cimbusch
Copy link

In case somebody is interested or bumps into this:

Coordinated changes in gene expression kinetics underlie both mouse and human erythroid maturation
https://pubmed.ncbi.nlm.nih.gov/34225769/

@elifdogandar
Copy link

Hi,

I wanted to check splicing rates too. But apparently 'scv.utils.sum_var' is not there anymore. Do you have an alternative way to check this? I would be grateful...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

7 participants