Fix for rendering of softclipping when there are insertions in the sequence #4402
Fixes #4401
The code now processes the entire CIGAR string to compute the seqOffset and refOffset. This is canonical, fairly idiomatic CIGAR parsing code. The alternative would be to add an extra seqOffset field to the mismatches data structure. That could be considered later on, but it would increase memory consumption, so it's a tradeoff.
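As a rough illustration of the CIGAR walk described above, here is a minimal sketch. It is not the code from this pull request: the names `parseCigar`, `softclipSegments`, and `SoftclipSegment` are hypothetical and chosen only for this example. It assumes the standard SAM rules for which operations consume query bases (M, I, S, =, X) and which consume reference bases (M, D, N, =, X).

```typescript
// Hypothetical helpers for illustration; names are not from the PR.
interface CigarOp {
  len: number
  op: string
}

interface SoftclipSegment {
  seqOffset: number // index into the read sequence where the clip starts
  refOffset: number // reference offset (relative to alignment start) at the clip
  len: number // number of soft-clipped bases
}

// Split a CIGAR string such as "5S10M2I3D20M4S" into (length, op) pairs.
function parseCigar(cigar: string): CigarOp[] {
  const ops: CigarOp[] = []
  const re = /(\d+)([MIDNSHP=X])/g
  let m: RegExpExecArray | null
  while ((m = re.exec(cigar)) !== null) {
    ops.push({ len: Number(m[1]), op: String(m[2]) })
  }
  return ops
}

// Walk the whole CIGAR, tracking both offsets, and record where each
// soft clip (S) begins. Insertions advance seqOffset only, deletions and
// skips advance refOffset only, so a clip that follows an insertion gets
// the right position in the read sequence.
function softclipSegments(cigar: string): SoftclipSegment[] {
  const segments: SoftclipSegment[] = []
  let seqOffset = 0
  let refOffset = 0
  for (const { len, op } of parseCigar(cigar)) {
    if (op === 'S') {
      segments.push({ seqOffset, refOffset, len })
    }
    // Operations that consume bases of the read (query) sequence
    if ('M=XIS'.indexOf(op) !== -1) {
      seqOffset += len
    }
    // Operations that consume bases of the reference
    if ('M=XDN'.indexOf(op) !== -1) {
      refOffset += len
    }
    // H and P consume neither the read nor the reference
  }
  return segments
}
```

For `5S10M2I3D20M4S`, the trailing clip starts at read index 37 (5 + 10 + 2 + 20) and reference offset 33 (10 + 3 + 20). Ignoring the 2-base insertion would place the clip at read index 35, which is the kind of shift that makes the clipped bases look wrong.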
This branch, showing both BAM and CRAM softclipping on long reads that have insertions and deletions. This looks as expected: the alignments are quite similar near the clip site and then diverge as they move away from it (compared to the main branch, where the clipped bases look almost random).
Main branch, showing both BAM and CRAM softclipping on long reads that have insertions and deletions.