-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bismark: fixed bug for ambiguous alignments #58
Conversation
@@ -3050,6 +3049,9 @@ sub check_bowtie_results_single_end_bowtie2{ | |||
# warn "$first_ambig_alignment\n"; sleep(1); | |||
} | |||
} | |||
elsif ($alignment_score == $best_AS_so_far) { | |||
$amb_same_thread = 1; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the right way to go about this because if this alignment has the same alignment score as the best AS so far then this is not ambiguous in the same thread. To be ambiguous within the same thread the AS and the second best hit have to have the same score.
If overwrite is specified the alignment will be added to an alignment data structure further down, and in a later step all alignments in that data structure will be sorted by their lowest AS. If two or more different locations have the same lowest AS the read is booted altogether.
Have you tried the latest version and did ambiguous reads get booted that previously stayed in? I have tried it with the one read you supplied and there it seems to have worked just fine. Best, Felix |
Felix, I did try your latest version. It did better with some reads, but it failed for all reads that had AS1 == AS2 and (AS1 > XS1 or AS2 > XS2). This is because now when $alignment_score == $best_AS_so_far (i.e. AS1 == AS2), $amb_same_thread is reset to 0. The alignments are not booted altogether later; the second alignment is reported as unambiguous. It is not clear to me why if AS1 == AS2, you wouldn't want to abort (line 3188 -- only entered if $amb_same_thread == 1), the same as if AS1 == XS1 == best or AS2 == XS2 == best. For clarity/consistency, I/you could define a new $amb_diff_thread variable that is separate from $amb_same_thread but performs the same function in cases where AS1 == AS2. John Gaspar |
Would you be able to supply one of the reads that fail your test (AS1 == AS2 and (AS1 > XS1 or AS2 > XS2)? |
Felix, Here are the first 3 reads that fail: // indiv. alignments: // indiv. alignments: // indiv. alignments: With my update to Bismark, all such reads are unaligned, except in the *ambig.bam file. John Gaspar |
Hi John, Many thanks for these sequences, they do indeed post a new corner case which the last 'fix' introduced... I have now changed it again so that the |
Felix, The short answer is no, it doesn't fix it. John Gaspar |
and they are all present in the |
Felix, Before I close the pull request, please go back and re-read my previous comments. I'd rather not continue this cycle of testing and diagnosing your "fixes", especially since I already spent a great deal of time debugging the code, producing an actual fix, and testing and verifying it on 83.5 million reads (not just 1 or 3 reads). John Gaspar |
Hi John, First of all I am very grateful for your help and willingness to improve the Bismark code. I have re-read your comments and have tried to implement a solution that is correct and does the right thing.
This is happening a few lines further down in the code already: If there is only a single reported alignment it will be used as Unique alignment. If there are two or more alignments they are sorted by AS and if they have the same AS the read will be booted. This scenario would be the I have tried to faithfully address the bug you have outlined to me which seems to work just fine on the offending reads you supplied, but "the short answer is no" isn't exactly helpful if there isn't a slightly longer one as well. Best wishes, Felix |
Felix, Sorry to say I can't assist further. This has been very time consuming, and it seems we have profoundly different notions about software debugging and testing. Good luck, John Gaspar |
Hi Felix,
Your changes didn't quite do the trick. This commit, as far as I can tell, handles the ambiguous alignments correctly. It also produces a complete ambig.bam file -- previously (even with v0.15.0) some alignments were left out of this file. Tested on single-end reads.
Cheers,
John Gaspar