basequality filter for sambamba depth region/window? #154

Closed
melferink opened this issue Jul 14, 2015 · 5 comments

Comments

@melferink

Hi Artem,

Is it possible to filter on base quality for sambamba depth region/window?
I see the option -p for depth base, but not for region/window.
Also, I cannot find base quality in the code for the filter (-F) option.

Did I miss it, or is it not possible (yet)?

Btw. the tool is working great! Keep up the good work!

Greetz,
Martin

@lomereiter
Contributor

Hi Martin,

I've added (minimum) base quality filtering for all modes.
It's currently not possible to filter out whole reads based on their base qualities. I'm not sure how useful that would be, but maybe I should add something like avg_base_quality / min_base_quality / median_base_quality to the --filter option.
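
To make the three proposed per-read summaries concrete, here is a minimal sketch in Python. It assumes Phred+33 encoded quality strings; the function name `base_quality_summary` is hypothetical, and the dictionary keys only mirror the names floated above. None of this is an existing sambamba filter field.

```python
from statistics import mean, median


def base_quality_summary(quals: str, offset: int = 33) -> dict:
    """Summarise one read's base qualities (hypothetical helper).

    `quals` is the ASCII quality string of a read; `offset` is the
    Phred encoding offset (33 is assumed here).
    """
    # Decode each quality character into its numeric Phred score.
    scores = [ord(ch) - offset for ch in quals]
    return {
        "avg_base_quality": mean(scores),
        "min_base_quality": min(scores),
        "median_base_quality": median(scores),
    }


summary = base_quality_summary("JJJJJJAJJJ")
print(summary["min_base_quality"])     # 32, the single 'A' base
print(summary["median_base_quality"])  # 41, the 'J' bases dominate
```

A read-level filter built on these values would keep or drop the read as a whole, which is a different operation from dropping individual low-quality bases.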

@melferink
Author

Thanks!

@melferink
Author

Hi Artem,

I ran a small test, and I am afraid the -q option is not working properly (at least not the way I expected it to ;) ).

It appears that the -q filtering only looks at base 2 (irrespective of the orientation of the read), and when this base does not meet the threshold, the entire read (and thus all of its bases) is removed from the analysis.
The quality of the other bases (besides base 2) does not influence the filtering/calculations, even if they are below the threshold.

The way I see the filtering, each base in a read that is below the quality threshold should be removed, and the remaining bases should still be used.
So a read with qualities JJJJJJAJJJ should be counted as 10 bases with no filtering, and as 9 with -q 33.
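
The expected counts for that read can be checked with a small Python sketch of per-base filtering. It assumes Phred+33 encoding (so `J` is 41 and `A` is 32); `count_bases` is a hypothetical helper for illustration, not sambamba code.

```python
def count_bases(quals: str, min_qual: int = 0, offset: int = 33) -> int:
    """Count the bases whose Phred score is at least `min_qual`.

    Each base is judged on its own quality, so one bad base never
    removes the rest of the read.
    """
    return sum(1 for ch in quals if ord(ch) - offset >= min_qual)


quals = "JJJJJJAJJJ"
print(count_bases(quals))      # 10: no filtering
print(count_bases(quals, 33))  # 9: the 'A' base (Q32) is dropped
```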

Let me know if you need more input (or test files)!

Greetz

@lomereiter
Contributor

Oops, you are right. Thanks for the testing.

@lomereiter lomereiter reopened this Jul 17, 2015
lomereiter added a commit that referenced this issue Jul 17, 2015
@lomereiter
Contributor

OK, I've changed it so that bases are now counted according to your suggestion. A read is counted only if it has at least one good-quality base in the region. Indels are just ignored.
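
A rough Python sketch of that counting rule, for illustration only. It is not the sambamba implementation: it assumes ungapped reads given as `(start, quals)` pairs with Phred+33 qualities, a half-open region `[region_start, region_end)`, and the hypothetical function name `region_counts`.

```python
from typing import Iterable, Tuple


def region_counts(
    reads: Iterable[Tuple[int, str]],
    region_start: int,
    region_end: int,
    min_qual: int,
    offset: int = 33,
) -> Tuple[int, int]:
    """Return (reads counted, bases counted) for one region."""
    n_reads = 0
    n_bases = 0
    for start, quals in reads:
        # Count this read's bases that fall inside the region and
        # meet the base quality threshold.
        good = sum(
            1
            for i, ch in enumerate(quals)
            if region_start <= start + i < region_end
            and ord(ch) - offset >= min_qual
        )
        # A read contributes only if at least one such base exists.
        if good > 0:
            n_reads += 1
            n_bases += good
    return n_reads, n_bases


reads = [(100, "JJJJJJAJJJ"), (105, "AAAAA"), (200, "JJJ")]
print(region_counts(reads, 100, 110, min_qual=33))  # (1, 9)
print(region_counts(reads, 100, 110, min_qual=0))   # (2, 15)
```

In the first call the second read overlaps the region but has no base at Q33 or above, so it is not counted at all.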
