Taxonium for bacteria #445

Open
theosanderson opened this issue Nov 3, 2022 · 4 comments

Comments

@theosanderson
Owner

I know that some people have used, or are using, Taxonium for bacterial genomes. I suspect this poses some issues, e.g. the number of mutations on each branch might become overwhelming. If any of you would like to chat about ways to make your experience better, do let me know!

@AngieHinrichs
Contributor

@lilymaryam, @aofarrel, @russcd and I are using Taxonium to view UShER trees of M. tuberculosis genomes, and it works fine unless we use the usher_to_taxonium --genbank option to see the protein-coding mutations. Then the first line of the .jsonl.gz becomes so large (650MB to 900MB+, depending on the size of the tree and the filtering options) that it apparently exceeds V8's string length limit of 512MB, and Node crashes with the error RangeError: Invalid string length (nodejs/node#35973). A more compact JSON representation of mutations might help, and/or splitting some of the first-line values across multiple lines. I can provide example files if that would help.
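For anyone who wants to check whether their own file is affected before loading it, here is a minimal Python sketch that measures the first line of a gzipped JSONL file without holding the whole line in memory. The function names and the chunk size are hypothetical, and the 512MB threshold is simply the figure quoted above; the exact V8 limit may differ between Node versions and is counted in characters rather than bytes.

```python
import gzip

# Threshold taken from the figure quoted in this thread (512MB); an assumption,
# not a value read from Node itself.
V8_STRING_LIMIT_BYTES = 512 * 1024 * 1024


def first_line_size(path, chunk_size=1 << 20):
    """Return the decompressed byte length of the first line of a .jsonl.gz file.

    The file is read in fixed-size chunks, so the line is never fully in memory.
    """
    total = 0
    with gzip.open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return total  # file ended before any newline was seen
            newline_at = chunk.find(b"\n")
            if newline_at != -1:
                return total + newline_at
            total += len(chunk)


def exceeds_limit(path, limit=V8_STRING_LIMIT_BYTES):
    """Return True if the first line is longer than the given byte limit."""
    return first_line_size(path) > limit
```

Calling `exceeds_limit("my_tree.jsonl.gz")` (a placeholder filename) gives a rough indication only, since bytes and JavaScript string length coincide only for ASCII content.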

@theosanderson
Owner Author

theosanderson commented Mar 14, 2024

Thanks a lot for the report, and it's exciting that you are doing this!

Have you tried adding the --only_variable_sites parameter? I think the issue could be the encoding of the reference genome. I definitely need a better general solution for that, and intend to add one, but this could be a workaround for now.

@AngieHinrichs
Contributor

> Have you tried adding the --only_variable_sites parameter?

Ah, I didn't know about that one! And it does fix it! Thanks and I'll use that for large genomes going forward (and make sure to look at the --help again next time I have a problem 🙂).
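For reference, a small Python sketch of how the workaround invocation might be assembled. Only --genbank and --only_variable_sites come from this thread; the --input and --output flags and all file names are assumptions, so check them against usher_to_taxonium --help before relying on them. The helper name is hypothetical.

```python
import shlex


def build_command(input_path, output_path, genbank_path=None,
                  only_variable_sites=True):
    """Build an argv list for usher_to_taxonium (nothing is executed here).

    --input/--output are assumed flag names; --genbank and
    --only_variable_sites are the options discussed in this thread.
    """
    argv = ["usher_to_taxonium", "--input", input_path, "--output", output_path]
    if genbank_path:
        argv += ["--genbank", genbank_path]
    if only_variable_sites:
        argv.append("--only_variable_sites")
    return argv


# Placeholder file names, for illustration only.
command = build_command("tb_tree.pb", "tb_tree.jsonl.gz", "tb_reference.gb")
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` once the flag names have been confirmed.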

@theosanderson
Owner Author

Fantastic, and no prob, and it still definitely needs a real solution!
