Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove clipped bases? #15

Open
apredeus opened this issue Jan 5, 2021 · 3 comments
Open

Remove clipped bases? #15

apredeus opened this issue Jan 5, 2021 · 3 comments

Comments

@apredeus
Copy link

apredeus commented Jan 5, 2021

I was wondering if it's worth adding the option to not remove the whole read with soft/hardclipping, but rather just trim the clipped reads? There's one obscure and no longer supported tool (ngsutils) that supports this option, but that's all I was able to find.

@steinbrl
Copy link

Hey, I would really appreciate if you could implement an option to remove only clipped bases. As well as apredeus, I crawled the web to find tools, which can do that. ngsutils dont work in my hands (HPC cluster). It only produces errors...

So, this function would be a real improvement!

@lskatz
Copy link

lskatz commented Jun 1, 2024

I thought this could be a fun mini project and so what do you think about this subroutine? I think it is working at least in theory locally for me.

Add in the qual column constant

# SAM file TSV columns
use constant {
  SAM_RNAME => 2,
  SAM_POS   => 3,
  SAM_CIGAR => 5,
  SAM_TLEN  => 8,
  SAM_SEQ   => 9,
  SAM_QUAL  => 10,
};

In the main body right before filtering:

    if($clip){
      my $samArr = clip(\@sam, $HL, $SL, $SR, $HR);
      @sam = @$samArr;
    }

Defined later in the code:

sub clip{
  my($samArr, $HL, $SL, $SR, $HR) = @_;

  # Avoid modifying the original array
  my @sam = @$samArr;

  # Trim the sequence and qual according to soft clips
  $sam[SAM_SEQ]  = substr($sam[SAM_SEQ],  $SL, length($sam[SAM_SEQ])-$SL-$SR);
  $sam[SAM_QUAL] = substr($sam[SAM_QUAL], $SL, length($sam[SAM_QUAL])-$SL-$SR);

  # What to do with hard clips? Just remove them from cigar string?

  return  \@sam;
}

And some getopts changes too.

@lskatz
Copy link

lskatz commented Jun 1, 2024

I'll put some more changes that I just made into a PR so that it's visible but it is not tested

@lskatz lskatz mentioned this issue Jun 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants