Non-alphabet characters aren't counted as gaps when computing sequence coverage #268

Open
nikkithadani opened this issue Nov 9, 2021 · 1 comment
Comments

@nikkithadani

I'm not sure whether this is something we want to fix or just something we want to note, but it can be a real issue for some viral sequences: sequences with large numbers of Xs or other non-residue characters pass the sequence coverage threshold, because only "-" is counted as a gap when the threshold is applied:

keep_seqs = (1 - ali.count("-", axis="seq")) >= min_cov
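To illustrate the effect, here is a minimal standalone sketch using plain NumPy rather than the library's alignment class. The helper name `coverage_mask`, the toy alignment, and the choice of "X" as the extra gap-like character are assumptions made for this example, not part of the existing code.

```python
import numpy as np


def coverage_mask(matrix, min_cov, gap_like=("-",)):
    """Return a boolean mask of sequences whose coverage is >= min_cov.

    matrix   : 2D array of single characters (sequences x columns)
    min_cov  : minimum fraction of non-gap positions, between 0 and 1
    gap_like : characters treated as "not covered" (hypothetical parameter)
    """
    # True wherever a position holds one of the gap-like characters
    non_residue = np.isin(matrix, list(gap_like))
    # Coverage is the fraction of positions per sequence that are NOT gap-like
    coverage = 1.0 - non_residue.mean(axis=1)
    return coverage >= min_cov


# Toy alignment: 4 sequences, 8 columns
toy = np.array([
    list("ACDEFGHI"),  # fully covered
    list("AC--FGHI"),  # 2 of 8 positions are gaps
    list("XXXXXXHI"),  # 6 of 8 positions are X
    list("------HI"),  # 6 of 8 positions are gaps
])

# Counting only "-" keeps the X-rich sequence
gaps_only = coverage_mask(toy, 0.5, gap_like=("-",))

# Counting "-" and "X" filters it out, like the gap-rich sequence
gaps_and_x = coverage_mask(toy, 0.5, gap_like=("-", "X"))

print(gaps_only.tolist())
print(gaps_and_x.tolist())
```

With a 50% threshold, the third sequence is kept when only "-" is counted and dropped once "X" is also treated as a gap. Which characters should count as gap-like would need to be decided per alphabet.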

@thomashopf thomashopf self-assigned this May 11, 2023
@thomashopf
Contributor

This will be updated in a future release, together with more flexible handling of sequence weight calculation.

2 participants