CIGAR strings in "SA" tags differ from the CIGAR strings in the corresponding supplementary alignment records #724
Comments
In minimap2, the SA tag mainly tells you the start and end coordinates of other alignments. It is not intended to keep detailed CIGARs.
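To illustrate reading an SA tag for coordinates only, here is a minimal Python sketch. It assumes the field layout given in the SAM specification (`rname,pos,strand,CIGAR,mapQ,NM;` with a 1-based `pos`). The names `parse_sa_tag` and `reference_span` are hypothetical helpers, not part of minimap2, and the position, mapQ, and NM values in the usage example are invented for illustration. The computed end coordinate is only as reliable as the reference-consuming lengths in the SA CIGAR.

```python
import re
from typing import List, NamedTuple, Tuple

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


class SAEntry(NamedTuple):
    rname: str
    pos: int      # 1-based leftmost reference position
    strand: str
    cigar: str
    mapq: int
    nm: int


def parse_sa_tag(value: str) -> List[SAEntry]:
    """Split an SA tag value (without the 'SA:Z:' prefix) into entries."""
    entries = []
    for chunk in value.split(";"):
        if not chunk:
            continue  # the value ends with ';', leaving an empty last chunk
        rname, pos, strand, cigar, mapq, nm = chunk.split(",")
        entries.append(SAEntry(rname, int(pos), strand, cigar, int(mapq), int(nm)))
    return entries


def reference_span(pos: int, cigar: str) -> Tuple[int, int]:
    """Return the 1-based inclusive (start, end) covered on the reference."""
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos, pos + ref_len - 1


# Hypothetical SA value: only the reference name and CIGAR come from this issue.
entry = parse_sa_tag("edge_34620,100,+,1136M3I5180S,60,3;")[0]
print(reference_span(entry.pos, entry.cigar))  # (100, 1235)
```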
Hi @lh3, I am also very grateful for your work in developing Minimap2! I think that Marcus makes some good points and I'd suggest updating Minimap2 so that it reports identical CIGAR strings in the SA tag and the SA record. This would be helpful for a few reasons:
At a minimum, it would be helpful if the documentation was updated to clarify this point.
Thank you both! I agree with Charles' points; if keeping the SA tag in the slightly inaccurate "shortened" format is necessary, then I propose that we update minimap2's man page to explain this. Maybe the following line: Line 699 in bc588c0
could be changed to something like
This should make the situation much clearer. (If you'd prefer, we could also just add a link to this issue or to #287.) If these changes seem reasonable, I would be happy to file a PR that updates the man page -- so that other users don't get stuck on this issue like we did.
) * Mention approx CIGAR strings in man page #724 * Fix FAQ typo
Hi, and thank you for developing minimap2! I have two questions about an issue with supplementary alignments.
Problem description
For records that are part of a supplementary alignment in SAM files generated by minimap2, the CIGAR strings listed in the `SA` tags are different from the CIGAR strings listed for these supplementary alignments' records in the same file. It is not immediately clear which of the two CIGAR strings should be interpreted as the "canonical" CIGAR string for this alignment, or why the CIGAR strings are different.
It looks like this was brought up previously in #524 (comment) and in #287, which imply that the `SA` CIGAR string should not be relied on in these cases.
Questions
As a general rule, does it make sense to ignore the CIGAR string listed in the `SA` tag -- and to always use the CIGAR string located on that alignment's own line in the SAM file instead?
If so, do you think it would make sense to update the documentation to clarify this? From reading the description of the `SA` tag, it was not clear to me that CIGAR strings listed for these alignments were expected to be incorrect. I am happy to submit a PR that updates the README or the FAQ accordingly, in order to help future minimap2 users.
My apologies if I am misunderstanding anything!
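One way to use the CIGAR from an alignment's own line is to look up each SA entry's record in the same file. The sketch below is only an illustration under stated assumptions: it assumes that an SA entry's reference name, position, and strand match columns 3 and 4 and the 0x10 flag bit of the corresponding record, as the SAM specification describes. The function names are hypothetical, and the two demo records use made-up positions, flags, and a made-up primary CIGAR. Only the reference names and the two `edge_34620` CIGAR strings come from this issue.

```python
def strand_of(flag: int) -> str:
    """Return '-' if the reverse-strand bit (0x10) is set, else '+'."""
    return "-" if flag & 0x10 else "+"


def record_cigars(sam_lines):
    """Map (qname, rname, pos, strand) to the CIGAR in each record's column 6."""
    index = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        f = line.rstrip("\n").split("\t")
        index[(f[0], f[2], int(f[3]), strand_of(int(f[1])))] = f[5]
    return index


def compare_sa_to_records(sam_lines):
    """Return (key, SA-tag CIGAR, record CIGAR or None) for every SA entry."""
    sam_lines = list(sam_lines)
    index = record_cigars(sam_lines)
    results = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        f = line.rstrip("\n").split("\t")
        for tag in f[11:]:
            if not tag.startswith("SA:Z:"):
                continue
            for chunk in tag[len("SA:Z:"):].split(";"):
                if not chunk:
                    continue
                rname, pos, strand, sa_cigar = chunk.split(",")[:4]
                key = (f[0], rname, int(pos), strand)
                results.append((key, sa_cigar, index.get(key)))
    return results


# Hypothetical two-record SAM excerpt (SEQ and QUAL given as '*').
demo_sam = [
    "\t".join(["read1", "0", "edge_25034", "50", "60", "1139S5180M", "*", "0", "0",
               "*", "*", "SA:Z:edge_34620,100,+,1136M3I5180S,60,3;"]),
    "\t".join(["read1", "2048", "edge_34620", "100", "60",
               "23M1I537M1I64M1I512M5180H", "*", "0", "0",
               "*", "*", "SA:Z:edge_25034,50,+,1139S5180M,60,0;"]),
]

for key, sa_cigar, rec_cigar in compare_sa_to_records(demo_sam):
    print(key, "SA:", sa_cigar, "record:", rec_cigar)
```

If the lookup returns `None`, the referenced record is not in the file (for example, after filtering), and only the SA tag's version of the CIGAR is available.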
Example showing the problem
This file, `supplemental_alignment.sam.txt`, is a subset of a SAM file generated by minimap2. This SAM file has been filtered to just the two lines originating from a read named `m54033_180919_161442/4194410/ccs`.
Line 1 describes the primary alignment of this read to a reference sequence named `edge_25034`, and includes a reference to a supplementary alignment of this read in a different reference sequence named `edge_34620`. The CIGAR string listed on Line 1 in the `SA:` tag for the `edge_34620` supplementary alignment is `1136M 3I 5180S` (spaces added for clarity).
Line 2 describes the `edge_34620` supplementary alignment in detail: however, the CIGAR string listed on this line for this supplementary alignment is `23M 1I 537M 1I 64M 1I 512M 5180H`, instead.
The counts of each operation generally match up (e.g. in both CIGAR strings there are 1,136 M operations) but the alignments represented by these strings are still slightly different.
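The per-operation totals can be checked with a short Python sketch. The helper `op_totals` is a hypothetical name introduced here. The two CIGAR strings are the ones quoted above.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def op_totals(cigar: str) -> Counter:
    """Sum the lengths of each CIGAR operation, ignoring any spaces."""
    totals = Counter()
    for length, op in CIGAR_OP.findall(cigar.replace(" ", "")):
        totals[op] += int(length)
    return totals


sa_tag_cigar = "1136M 3I 5180S"                    # from the SA tag on Line 1
record_cigar = "23M 1I 537M 1I 64M 1I 512M 5180H"  # from Line 2 itself

sa_totals = op_totals(sa_tag_cigar)
rec_totals = op_totals(record_cigar)

print(sa_totals["M"], rec_totals["M"])  # 1136 1136
print(sa_totals["I"], rec_totals["I"])  # 3 3
print(sa_totals["S"], rec_totals["H"])  # 5180 5180
```

The totals agree, but the SA tag's version places all three inserted bases after the matches, while the record interleaves them, so the two strings imply different base-to-base pairings. The SA tag also reports the clipped bases as soft-clipped (`S`), whereas the record uses hard clipping (`H`).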
The exact command to minimap2 used to generate the full SAM file was `minimap2 -ax asm20 [reference FASTA file] [reads FASTQ file] > alignment.sam`; the data is derived from this BioProject.
Software versions
minimap2 version: `2.17-r941` (installed using linuxbrew)
Running on Ubuntu version `16.04.7`