Hey Torsten,
Sorry that I cannot provide data (all of it is unpublished and confidential), but I've had a lot of issues with samclip not removing clipped reads. One example of where I've tried to use it is:
$my_clip
--max 250
--ref $my_gen |
Where $my_clip is the variable holding samclip and $my_stools is the poorly named variable for samtools.
These are long reads, so I'm trying to allow a maximum of 250 clipped bases per read. However, when I look at the CIGAR strings in the resulting file I see strings such as:
229S39M183S
143S30M215S
And many more. My understanding is that the --max 250 filter should allow a maximum of 250 clipped bases per read?
Thank you!
-Elias
Oh @tseemann I think I know where the problem is! I think samclip applies the --max parameter to each end of a mapped read separately. So if the clipping is on only one end, it works as expected, but if there's clipping on both ends, a read can carry up to double the allotted max in total! Is there a way to reconcile this?
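To make the difference concrete, here is a minimal Python sketch (a standalone illustration, not samclip's own Perl code) that applies both rules to the two CIGAR strings from this thread. The function names `end_clips`, `passes_per_end`, and `passes_total` are made up for the example, and any special handling samclip applies near contig ends is ignored here.

```python
import re

# One CIGAR operation: a length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def end_clips(cigar):
    """Return (left, right) clipped lengths, counting soft (S) and hard (H) clips."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    left = right = 0
    i = 0
    # Walk inward from the left end while the operations are clips.
    while i < len(ops) and ops[i][1] in "SH":
        left += ops[i][0]
        i += 1
    j = len(ops)
    # Walk inward from the right end, never crossing what the left walk consumed.
    while j > i and ops[j - 1][1] in "SH":
        right += ops[j - 1][0]
        j -= 1
    return left, right


def passes_per_end(cigar, max_clip):
    """Per-end rule: each end may carry up to max_clip clipped bases."""
    left, right = end_clips(cigar)
    return left <= max_clip and right <= max_clip


def passes_total(cigar, max_clip):
    """Total rule: both ends together may carry up to max_clip clipped bases."""
    left, right = end_clips(cigar)
    return left + right <= max_clip


for cigar in ("229S39M183S", "143S30M215S"):
    print(cigar, end_clips(cigar), passes_per_end(cigar, 250), passes_total(cigar, 250))
# 229S39M183S (229, 183) True False
# 143S30M215S (143, 215) True False
```

Both reads pass a per-end limit of 250 but fail a limit of 250 on the total, which matches what was observed in the output file.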
Ok my colleague helped me re-write a small portion to filter based on total clipped length:
    # total clipped length across both ends of the read
    my $LR = $L + $R;
    my $info = $debug ? "CHROM=$sam[SAM_RNAME]:1..$contiglen POS=$start..$end CIGAR=$sam[SAM_CIGAR] L=$L R=$R | HL=$HL SL=$SL SR=$SR HR=$HR max=$max)" : "need --debug";
    # reject on the combined clip length instead of checking each end separately
    if ($LR > $max) {
      msg("BAD! $info") if $debug;
      $removed++;
      next;
    }
    msg("GOOD $info") if $debug;
    # otherwise pass through untouched
    print $line if $invert;
    $kept++;
  }
I'm sure you'd be able to do this a lot better than I would, if you want to add it as an option (something like --max-tot). But yeah, just a thought. Thanks!
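For anyone who needs the total-length behaviour before such an option exists, a rough stand-alone filter could look like the Python sketch below. It is not part of samclip, and `--max-tot` remains only a suggestion from this thread. The names `total_clip` and `filter_sam` are hypothetical, the input is assumed to be well-formed SAM text with the CIGAR in column 6, and contig ends get no special treatment.

```python
import re

# One CIGAR operation: a length followed by an operation code.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def total_clip(cigar):
    """Sum of all soft (S) and hard (H) clipped bases in a CIGAR string."""
    return sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in "SH")


def filter_sam(lines, max_total):
    """Yield header lines and alignments whose total clipping is <= max_total."""
    for line in lines:
        if line.startswith("@"):
            # Header lines always pass through unchanged.
            yield line
            continue
        cigar = line.rstrip("\n").split("\t")[5]  # column 6 of a SAM record
        if total_clip(cigar) <= max_total:
            yield line


# Made-up records for illustration; SEQ and QUAL are left as "*".
sam = [
    "@SQ\tSN:chr1\tLN:100000\n",
    "r1\t0\tchr1\t100\t60\t229S39M183S\t*\t0\t0\t*\t*\n",  # 412 clipped bases
    "r2\t0\tchr1\t500\t60\t100S400M50S\t*\t0\t0\t*\t*\n",  # 150 clipped bases
]
kept = list(filter_sam(sam, 250))
print(len(kept))  # 2 (the header and r2)
```

In a pipeline, the same generator can be fed from standard input with `sys.stdout.writelines(filter_sam(sys.stdin, 250))` after `import sys`.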