--contig-end-exclusion doesn't work with -m not set to mean #163

Open
alexcritschristoph opened this issue Apr 19, 2023 · 2 comments


@alexcritschristoph

Hi Ben - big fan / user of coverM here. Recently I uncovered this issue with v0.6.1:

When I run

coverm contig --contig-end-exclusion 1000 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m mean
vs
coverm contig --contig-end-exclusion 0 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m mean

I get different results, consistent with the --contig-end-exclusion parameter working.

But when I run:

coverm contig --contig-end-exclusion 0 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m count

vs

coverm contig --contig-end-exclusion 1000 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m count

I get the exact same results, indicating to me that the contig end exclusion parameter is not working. The same is true when -m is set to covered_bases or covered_fraction. I think this is a bug.
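For anyone wanting to reproduce the comparison, here is a minimal sketch of one way to check whether two runs produced the same table. It assumes each run was written to its own file (the commands above all write to test1.tsv, so each would need a distinct -o path); the file names `excl0.tsv` and `excl1000.tsv` and both function names are hypothetical. It only assumes the output is tab-separated text and makes no assumption about the column names.

```python
import csv


def load_rows(path):
    """Read a tab-separated file into a set of row tuples (row order is ignored)."""
    with open(path, newline="") as handle:
        return {tuple(row) for row in csv.reader(handle, delimiter="\t")}


def outputs_identical(path_a, path_b):
    """Return True if both files contain exactly the same set of rows."""
    return load_rows(path_a) == load_rows(path_b)


if __name__ == "__main__":
    # Hypothetical file names: one output per --contig-end-exclusion setting.
    same = outputs_identical("excl0.tsv", "excl1000.tsv")
    print("identical" if same else "different")
```

If this reports identical output for `-m count` but different output for `-m mean`, that matches the behaviour described above.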

@wwood
Owner

wwood commented Apr 19, 2023

Thanks Alex, much appreciated, both the bug report and the kind words.

I think this is really an issue with count, not with non-mean methods, agree?

What would be a good definition that accounts for reads that cross the boundary? Use a read's starting position at the start of the contig and its end position at the end of the contig?

@alexcritschristoph
Author

Hi Ben,
I see this issue with -m covered_bases and -m covered_fraction in my data as well. Do you see that too?

I'm actually not sure I follow your second question - my guess is that it would be best for the parameter to be a hard cutoff, so that any read that crosses the boundary at all (e.g. the 100 bp from the edge by default) is not counted.
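To make the hard-cutoff idea concrete, here is a rough sketch of how a read count could behave under that rule. This is not CoverM's implementation, only an illustration under stated assumptions: reads are given as 0-based, half-open `(start, end)` alignment intervals on a single contig, and the function name `count_reads_hard_cutoff` is hypothetical.

```python
def count_reads_hard_cutoff(reads, contig_length, end_exclusion):
    """Count reads that lie entirely outside the excluded contig ends.

    reads: iterable of (start, end) alignment intervals, 0-based, half-open.
    A read is dropped if it overlaps either excluded end region at all.
    """
    first_allowed = end_exclusion                 # first base outside the left margin
    last_allowed = contig_length - end_exclusion  # one past the last base before the right margin
    if first_allowed >= last_allowed:
        # The exclusion zones cover the whole contig, so nothing can be counted.
        return 0
    return sum(
        1
        for start, end in reads
        if start >= first_allowed and end <= last_allowed
    )


# Toy data on a 10,000 bp contig (made up for illustration).
reads = [
    (0, 150),        # inside the left margin
    (950, 1100),     # crosses the left boundary
    (1000, 1150),    # starts exactly at the boundary
    (5000, 5150),    # middle of the contig
    (8900, 9050),    # crosses the right boundary
    (9850, 10000),   # inside the right margin
]

print(count_reads_hard_cutoff(reads, 10_000, 0))     # 6
print(count_reads_hard_cutoff(reads, 10_000, 1000))  # 2
```

With this rule the count changes as soon as the exclusion is non-zero and any read touches a margin, which is the behaviour I would expect from `-m count`.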
