Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using --max hasn't worked #12

Open
eoziolor opened this issue Jun 15, 2020 · 2 comments
Open

Using --max hasn't worked #12

eoziolor opened this issue Jun 15, 2020 · 2 comments

Comments

@eoziolor
Copy link

Hey Torsten,

Sorry that I cannot provide data (all of it is unpublished and confidential), but I've had a lot of issues with samclip not clipping reads. One example of where I've tried to use it is:

$my_stools view -h ${i}_merged.bam |
$my_clip
--max 250
--ref $my_gen |
$my_stools sort > ${i}_sorted.bam

Where $my_clip is the variable holding samclip and $my_stools is the poorly named variable for samtools.

These are long reads, so I'm trying to allow a maximum of 250 clipped reads. However when I look at the cigar string of the resulting file I see strings such as:
229S39M183S
143S30M215S

And many more. My understanding is that this --max 250 filter should be allowing for a max of 250 clipped reads?

Thank you!

-Elias

@eoziolor
Copy link
Author

Oh @tseemann I think I know where the problem is! I think what samclip is doing is allowing --max parameter on each end of a mapped read. So if the softclip is only on 1 end, it'll work as expected, but if there's clipping on both ends, it'll double the allotted max! Is there a way to reconcile this!

@eoziolor
Copy link
Author

eoziolor commented Jun 16, 2020

Ok my colleague helped me re-write a small portion to filter based on total clipped length:
my $LR = $L + $R;
my $info = $debug ? "CHROM=$sam[SAM_RNAME]:1..$contiglen POS=$start..$end CIGAR=$sam[SAM_CIGAR] L=$L R=$R | HL=$HL SL=$SL SR=$SR HR=$HR max=$max)" : "need --debug";
if ($LR > $max) {
msg("BAD! $info") if $debug;
$removed++;
next;
}
msg("GOOD $info") if $debug;
# otherwise pass through untouched
print $line if $invert;
$kept++;
}

I'm sure you'd be able to do this a lot better than I would if you want to add it as an option (something like --max-tot)? But yeah, just a thought. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant