--haplotype rework and metadata loading flag #329

Merged
merged 9 commits into from
Feb 10, 2023

jmcbroome
Collaborator

This PR addresses two issues.

First, it addresses #303. Typically, only metadata for samples in the user's query set is loaded into memory. This was originally implemented to reduce memory footprint. However, with -N, -K, and similar options, users may want full metadata for every sample in their output, including non-query context samples. Accordingly, I have added a flag, --load-all-metadata (with no single-letter short form), to matUtils extract, indicating that all available metadata should be loaded and made available for output.
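The trade-off described above can be illustrated with a minimal Python sketch. This is not the matUtils implementation; the function `load_metadata`, its `load_all` parameter, and the assumption that the first TSV column holds the sample name are all hypothetical and are used only to show the difference between query-only and full loading.

```python
import csv
import io


def load_metadata(handle, query_samples, load_all=False):
    """Read a tab-separated metadata table into a dict keyed by sample name.

    Hypothetical sketch: assumes the first column holds the sample name.
    With load_all=False only rows for query samples are kept, which keeps
    memory use low; with load_all=True every row is kept, so context
    samples pulled into the output also have metadata.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    key = reader.fieldnames[0]
    kept = {}
    for row in reader:
        if load_all or row[key] in query_samples:
            kept[row[key]] = row
    return kept


# Toy table: s1 is a query sample, s2 and s3 are context samples.
TABLE = "strain\tcountry\ns1\tUSA\ns2\tUK\ns3\tIndia\n"

query_only = load_metadata(io.StringIO(TABLE), {"s1"})
everything = load_metadata(io.StringIO(TABLE), {"s1"}, load_all=True)
```

In this toy case `query_only` holds a single entry while `everything` holds all three, which is the behavior the new flag is meant to switch on for extract outputs that include non-query samples.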

Second, it addresses #326. This is a significant rework of the implementation and output of matUtils summary --haplotype. Haplotypes are now computed dynamically, which significantly reduces runtime. Instead of being represented as unordered mutational paths, they are now represented as location-state strings in a set (e.g. '56A,60G' denotes a haplotype where position 56 is A, position 60 is G, and all other positions match the reference).
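To make the location-state representation concrete, here is a small Python sketch of how a root-to-sample mutation path could be collapsed into such a string. It is an illustration under stated assumptions, not the code in this PR: the function `haplotype_string` and the `(position, reference_base, new_base)` tuple layout are hypothetical, and sorting by position is a choice made here so that equal haplotypes produce equal strings.

```python
def haplotype_string(path_mutations):
    """Collapse an ordered mutation path into a location-state string.

    Hypothetical sketch: each mutation is (position, reference_base,
    new_base), listed from root to sample. Later mutations at a position
    override earlier ones, and a position that returns to the reference
    base is dropped, since unlisted positions are taken to be reference.
    """
    state = {}
    for position, reference_base, new_base in path_mutations:
        if new_base == reference_base:
            # Reversion to reference: no longer part of the haplotype.
            state.pop(position, None)
        else:
            state[position] = new_base
    # Sort by position so the string does not depend on path order.
    return ",".join(f"{pos}{base}" for pos, base in sorted(state.items()))


path_a = [(56, "C", "A"), (60, "T", "G")]
path_b = [(60, "T", "G"), (56, "C", "A")]  # same changes, different order
path_c = [(56, "C", "A"), (60, "T", "G"), (56, "C", "C")]  # 56 reverts
```

Under these assumptions `path_a` and `path_b` map to the same string, whereas comparing them as raw mutational paths would require order-insensitive matching, and `path_c` drops position 56 because it ends at the reference state.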

@yatisht yatisht merged commit d90bc9f into yatisht:master Feb 10, 2023
2 participants