Handle no-ops more intelligently when creating MD tags #392

Closed
kristalcurtis opened this issue Sep 22, 2014 · 5 comments

Comments

@kristalcurtis

Currently, ADAM creates very verbose MD tags when it encounters no-ops, e.g., 2G0A0T0G0A2G2A0T0G0T1G0T0C3T0G0A0G0T1G0T0C0A1G0T0C0T0G0A0T0G1A3G0A0C0A0T0C0A0T1G2C0A0G0A0T0G0C0T0G0A0G4C0A0G1C0A2C0A0T0A0T1G0T0G9.

@fnothaft
Member

D'oh! That doesn't look correct... Which piece of code is emitting this?

@fnothaft
Member

Er, actually, if those all are mismatches, that could be correct. Are you able to share the read/alignment publicly?
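For anyone reading along, here is a minimal sketch (plain Python, not ADAM code) of how an MD tag breaks down. It assumes the usual SAM convention: a number is a run of matching bases, a single letter is a mismatched reference base, `^` plus letters is a deletion, and a `0` appears between adjacent mismatches. The helper name `summarize_md` is made up for illustration.

```python
import re

# One MD token: a run of matching bases, a deletion (^ plus reference bases),
# or a single mismatched reference base.
MD_TOKEN = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def summarize_md(md):
    """Count matched, mismatched, and deleted reference bases in an MD tag."""
    counts = {"matches": 0, "mismatches": 0, "deleted": 0}
    pos = 0
    for token in MD_TOKEN.finditer(md):
        # finditer silently skips unmatched characters, so check for gaps.
        if token.start() != pos:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = token.end()
        run, deletion, mismatch = token.groups()
        if run is not None:
            counts["matches"] += int(run)
        elif deletion is not None:
            counts["deleted"] += len(deletion)
        else:
            counts["mismatches"] += 1
    if pos != len(md):
        raise ValueError(f"unexpected character at offset {pos}")
    return counts


if __name__ == "__main__":
    print(summarize_md("10A5^AC6"))
```

Under that reading, a tag made mostly of `0`-separated letters is long but still well formed: each `0` just says that two mismatches are adjacent.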

@kristalcurtis
Author

Sure, I can share; it's just from the SMaSH Venter reads.

Here's the original read:
@chr22_42898209_42898675_?:?:?_?:?:?_MATERNAL_42933514_42933981_649d432/2
GGGGGGGGGGGCACCATATGGGTGGCTGGGGGCTCAGCATCTGGGCCATGATGTCCCCTTCATCAGACCTGACCACTCAAAAGACCACATTTCCCTCATCC
+
CGFGGG?FGGCGFGGAGG4GFGGG6FFAGGGGDGEFFGFGGGDB@GFGGGGG?GGGGGFFEEFFGG@>EGGBGGGGG@G16?DGB4GDGFBEGFEGFFGF>

It aligns to chr22, 42898574 (if 0-indexed), in the reverse direction. Here's the CIGAR string and MD tag I get from ADAM:
cigar: 10M1I1M2I1M1I3M1D3M1D5M1D4M1I4M1I4M2I1M2I16M4D4M1D5M1D2M1I1M1I9M1I3M1D1M2D1M1D9M
mdTag: 3A0T1A3A0A0A0T0G0^T2T0^C1T0T0T0G0^A1T2T0C0A0G0G0T0C0T1A0T0G0A1G1G0G0A1A1C0A0T0^GGCC0C0A0G0A0^T0G1T0G0A0^G0C1C0C2G0C0C0A0C1C0A0T0^A1^GG1^G0C3C1C0C1
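A quick sanity check on a CIGAR string like the one above is to total the read and reference bases it consumes. The sketch below is plain Python rather than anything from ADAM, and `cigar_lengths` is a hypothetical helper name. It follows the SAM convention that M, I, S, = and X consume read bases while M, D, N, = and X consume reference bases.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_lengths(cigar):
    """Return (read bases consumed, reference bases consumed) for a CIGAR."""
    read_len = ref_len = 0
    pos = 0
    for op in CIGAR_OP.finditer(cigar):
        # Reject anything between elements that the pattern did not match.
        if op.start() != pos:
            raise ValueError(f"unexpected character at offset {pos}")
        pos = op.end()
        length, kind = int(op.group(1)), op.group(2)
        if kind in "MIS=X":
            read_len += length
        if kind in "MDN=X":
            ref_len += length
    if pos != len(cigar):
        raise ValueError(f"unexpected character at offset {pos}")
    return read_len, ref_len


if __name__ == "__main__":
    print(cigar_lengths("3M1I2M2D4M"))
```

The first number should equal the length of the read sequence. The total of the M lengths should also equal the matches plus mismatches in the MD tag, and the total of the D lengths should equal the bases listed after `^`.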

It looks like there's an indel at the beginning of the read:
scala> genome.substring(42898574, 42898574 + 101)
res6: String = GGGATGAGGGAAATGTGGTCTTTTGAGTGGTCAGGTCTGATGAAGGGGACATCATGGCCCAGATGCTGAGCCCCCAGCCACCCATATGGTGCCCCCCCCCC

Maybe that's what is causing the hiccup? I realize this MD tag is different from the one I posted... maybe the code changed in between me getting the above result (about a week ago) and now?
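Since the read aligns in the reverse direction, it has to be reverse-complemented before comparing it with the reference window. A minimal sketch of that comparison in plain Python follows; the helper names are made up for illustration, and the positional comparison deliberately ignores indels.

```python
# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatch_positions(a, b):
    """Return 0-based offsets where two sequences differ, position by position.

    Only the overlapping prefix is compared; no gaps are introduced.
    """
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print(reverse_complement("AACG"))
    print(mismatch_positions("ACGT", "ACCT"))
```

Because no gaps are allowed here, a single indel near the start shifts every later base by one, so an ungapped comparison can report a long run of mismatches even when the two sequences are otherwise nearly identical.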

@fnothaft
Member

Hmmm, that's a messy alignment. I'll look into this...

@fnothaft
Member

Closing as won't fix.
