Issue with 16 bit N CIGAR OP field with Long Reads #708
Comments
See #227, which added a CG tag to work around this. Internally in htslib we spot this tag and convert it back to the CIGAR field, so the API works as if the full CIGAR had always been there. Actually changing the BAM format is something we'd like to be able to do, but realistically there is too much legacy to make it worthwhile, and it would cause even more confusion, with many tools failing to upgrade. (You wouldn't believe how popular samtools 0.1.18 still is! People just refuse to update their tools!)
Edit: also, which tools are failing to work? This spec change was merged in 2017. Six years ought to be enough, so if tools are still choking then filing bug reports against them is justified.
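For anyone unfamiliar with the workaround, here is a rough sketch of the idea as described in the SAM spec: when a record has more than 65535 CIGAR operations, the real CIGAR goes into a CG:B,I tag and the CIGAR field holds a two-operation placeholder of the form kSmN (k = read length, m = reference span). The function names below are made up for illustration and are not htslib's API; htslib's real handling has more checks than this.

```python
CIGAR_OPS = "MIDNSHP=X"      # BAM op codes 0..8, in this order
MAX_N_CIGAR_OP = 0xFFFF      # n_cigar_op is a uint16 in the BAM record

def pack_op(length, op):
    """Encode one operation as BAM does: op_len << 4 | op_code."""
    return (length << 4) | CIGAR_OPS.index(op)

def unpack_op(value):
    """Inverse of pack_op: split a uint32 into (length, op)."""
    return value >> 4, CIGAR_OPS[value & 0xF]

def query_len(cigar):
    # Operations that consume read bases.
    return sum(n for n, op in cigar if op in "MIS=X")

def ref_len(cigar):
    # Operations that consume reference bases.
    return sum(n for n, op in cigar if op in "MDN=X")

def to_bam_fields(cigar):
    """Hypothetical helper: return (cigar_field, cg_tag).

    cg_tag is None when the CIGAR fits in the 16-bit count.
    """
    if len(cigar) <= MAX_N_CIGAR_OP:
        return cigar, None
    placeholder = [(query_len(cigar), "S"), (ref_len(cigar), "N")]
    cg_tag = [pack_op(n, op) for n, op in cigar]
    return placeholder, cg_tag

def from_bam_fields(cigar_field, cg_tag):
    """Hypothetical helper: recover the real CIGAR, as a reader would."""
    if cg_tag is None:
        return cigar_field
    return [unpack_op(v) for v in cg_tag]
```

A tool that predates the CG tag still sees a valid record with the right read length and reference span, it just sees a soft clip and a skip instead of the real alignment.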
Ah! I apologize - I was searching for previous PRs and Issues and I missed it. @cmnbroad just pointed me to that as well. This is the first time I've experienced the 16 bit issue, so I had missed it.
I'm a proponent of updating the BAM format / SAM spec and pushing the change - most people won't be affected by this, and since the format is versioned I think people should expect that at some point there will be updates. :) That said, I have also seen groups that are VERY hesitant to update things, but they shouldn't hold everyone else back.
Closing as it's a question and has been answered. Whether to update BAM with a format-breaking change or a backwards-compatible aux tag was hotly debated, with valid views on both sides, but ultimately the decision was made not to do a major version number upgrade.
I am doing some alignments of long read data and I am getting alignments with more than 65535 CIGAR operations, which overflows the 16 bit N CIGAR OP field. Because the BAM spec stores this count in 16 bits, all the tools I'm using are falling over when trying to convert and work with my data in BAM format (SAM files work fine because they're ASCII files).
These are genuine data, not an artifact or a mistake.
I know there has been talk in the past about expanding this field to at least 32 bits. Can we revisit this and update the spec to handle longer alignments?
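To make the limit concrete, here is a small sketch of the two sizes involved, going by the BAM layout in the SAM spec: the operation count (n_cigar_op) is a uint16, while each individual operation is a uint32 with the length in the upper 28 bits. The helper name is only for illustration.

```python
import struct

# Each CIGAR operation is a uint32: length in the upper 28 bits, op code in the lower 4.
MAX_OP_LEN = (1 << 28) - 1

def fits_n_cigar_op(n_ops):
    """Return True if the operation count fits the uint16 n_cigar_op field."""
    try:
        struct.pack("<H", n_ops)   # little-endian unsigned 16-bit
        return True
    except struct.error:
        return False
```

So a single very long operation is not the problem; it is the number of operations in one alignment that runs out of room.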