how to interpret 5mC and 5hmC call for same position in the genome #258

Open
dasn588 opened this issue Sep 9, 2024 · 1 comment
Labels
question Looking for clarification on inputs and/or outputs

dasn588 commented Sep 9, 2024

Hi,
I generated methylation calls for a human sample, aligned to the human genome, using the modkit pileup command.
The output bedMethyl file contains the following rows, where both 5hmC and 5mC are reported for position 10566:
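
| chrom | start position | end position | modified base code | score | strand | Nvalid_cov | fraction modified (%) | Nmod | Ncanonical | Nother_mod | Ndelete | Nfail | Ndiff | Nnocall |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 10566 | 10567 | h | 8 | - | 8 | 12.50 | 1 | 0 | 7 | 0 | 0 | 0 | 0 |
| 1 | 10566 | 10567 | m | 8 | - | 8 | 87.50 | 7 | 0 | 1 | 0 | 0 | 0 | 0 |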

Please let me know how to interpret this:

  1. Can a genomic locus have both a 5mC and a 5hmC call?
  2. Based on maximum read depth and modified-base-specific read depth, should this position be considered a 5mC event?

ArtRand (Contributor) commented Sep 9, 2024

Hello @dasn588,

  1. Can a genomic locus have both a 5mC and a 5hmC call?

From what you've shown, 8 reads have passing modification calls at this position: 1 reports 5hmC and 7 report 5mC. A given position may be 5mC in some cells and 5hmC in others, depending on the kind of sample you have. You can interpret these rows as: this position is called 5mC in 87.5% of the reads with valid calls and 5hmC in the remaining 12.5%.
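
To make the arithmetic explicit, here is a minimal Python sketch, assuming the 15-column layout posted in the question (a subset of the full bedMethyl columns). The names `COLUMNS`, `parse_rows`, and `state_percentages` are made up for illustration and are not part of modkit.

```python
# Column order as posted in this issue (assumed; not the full bedMethyl layout).
COLUMNS = [
    "chrom", "start", "end", "code", "score", "strand", "n_valid_cov",
    "percent_modified", "n_mod", "n_canonical", "n_other_mod",
    "n_delete", "n_fail", "n_diff", "n_nocall",
]
INT_FIELDS = [c for c in COLUMNS if c.startswith("n_")] + ["start", "end", "score"]

# The two rows from the question.
ROWS = """\
1 10566 10567 h 8 - 8 12.50 1 0 7 0 0 0 0
1 10566 10567 m 8 - 8 87.50 7 0 1 0 0 0 0
"""


def parse_rows(text):
    """Parse whitespace-separated rows into dicts keyed by COLUMNS."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue
        rec = dict(zip(COLUMNS, line.split()))
        for key in INT_FIELDS:
            rec[key] = int(rec[key])
        rec["percent_modified"] = float(rec["percent_modified"])
        records.append(rec)
    return records


def state_percentages(records):
    """Percent of valid calls per state for rows sharing one position and strand."""
    n_valid = records[0]["n_valid_cov"]
    percents = {rec["code"]: 100.0 * rec["n_mod"] / n_valid for rec in records}
    percents["canonical"] = 100.0 * records[0]["n_canonical"] / n_valid
    return percents


records = parse_rows(ROWS)
percents = state_percentages(records)
print(percents)  # {'h': 12.5, 'm': 87.5, 'canonical': 0.0}
```

Note that in each row the other modification shows up under Nother_mod, so Nmod + Nother_mod + Ncanonical adds up to Nvalid_cov (1 + 7 + 0 and 7 + 1 + 0, both 8).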

  2. Based on maximum read depth and modified-base-specific read depth, should this position be considered a 5mC event?

I'm not sure what you mean by "based on maximum read depth". In the data you've shown, N_fail, N_diff, and N_nocall are all zero. That makes me think this is probably a high-confidence position with high-quality base modification calls. However, the decision depends somewhat on your biological question and the sample you've sequenced. What you can say is that this position is most likely 5mC.
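
If you want to apply the same reasoning across many positions, a rough Python sketch could look like the one below. The counts are copied from the two rows in the question. The `summarize` helper and its `min_valid_cov` threshold are hypothetical, not a modkit feature or a recommended cutoff, so pick values that suit your sample and question.

```python
# Counts for position 1:10566 on the '-' strand, copied from the question.
counts = {
    "n_valid_cov": 8,
    "m": 7,          # reads called 5mC
    "h": 1,          # reads called 5hmC
    "canonical": 0,  # reads called unmodified C
    "n_fail": 0,
    "n_diff": 0,
    "n_nocall": 0,
}


def summarize(counts, min_valid_cov=5):
    """Report the most frequent state and how many reads lacked a valid call.

    min_valid_cov is an arbitrary illustrative threshold (an assumption).
    """
    states = {key: counts[key] for key in ("m", "h", "canonical")}
    top_state = max(states, key=states.get)
    return {
        "top_state": top_state,
        "top_fraction": states[top_state] / counts["n_valid_cov"],
        # Reads that did not contribute a valid call at this position.
        "excluded_reads": counts["n_fail"] + counts["n_diff"] + counts["n_nocall"],
        "enough_coverage": counts["n_valid_cov"] >= min_valid_cov,
    }


summary = summarize(counts)
print(summary)
# {'top_state': 'm', 'top_fraction': 0.875, 'excluded_reads': 0, 'enough_coverage': True}
```

Here no reads were excluded, and 5mC is the most frequent state at 7 of 8 valid calls.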

ArtRand added the "question" label on Sep 9, 2024