Add pyani tree subcommand #187

Open
widdowquinn opened this issue May 25, 2020 · 12 comments · May be fixed by #370
Assignees
Labels
enhancement something we'd like pyani to do that it doesn't already interface issues related to how the user tells pyani to do something
Milestone

Comments

@widdowquinn
Owner

Summary:

Add Newick tree output to pyani.

Description:

In #186 a question was raised about generating trees from pyani directly. At the moment this isn't implemented, but could be done fairly readily. One API implementation for writing might be:

pyani tree --formats [newick,nexus] <output_dir> <run ID>

or for graphical output:

pyani plot --formats [png,pdf] --method ete3 <output_dir> <run ID>

Current Output:

Not implemented

pyani Version:

v0.3+

@widdowquinn widdowquinn self-assigned this May 25, 2020
@widdowquinn widdowquinn added the enhancement something we'd like pyani to do that it doesn't already label May 25, 2020
@widdowquinn widdowquinn modified the milestones: 0.3.0, 0.3.1 May 28, 2020
@widdowquinn widdowquinn added the interface issues related to how the user tells pyani to do something label May 29, 2020
@peterjc
Collaborator

peterjc commented Oct 19, 2021

I was just looking to see if this was implemented, as I wanted to extract a Newick format tree from pyANI to compare to another clustering method, e.g. visually with https://phylo.io/ https://doi.org/10.1093/molbev/msw080

@widdowquinn
Owner Author

In the interim, it ought to be straightforward to produce a dendrogram in R or similar by taking in the ANI matrix and using hclust. You'll get more control over the clustering method then (rather than trusting our choice).
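For anyone who would rather stay in Python than R, here is a minimal sketch of the same idea using SciPy's hierarchical clustering in place of hclust. It is not what pyani does internally: the function name `ani_to_newick`, the `1 - identity` distance, and averaging the two triangles to force symmetry are all assumptions made for illustration, and the `method` argument is where you would pick your own linkage.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, to_tree
from scipy.spatial.distance import squareform


def ani_to_newick(identity, labels, method="average"):
    """Cluster a square ANI identity matrix and return a Newick string."""
    # Assumption: identity is on a 0..1 scale, so 1 - identity is a distance.
    dist = 1.0 - np.asarray(identity, dtype=float)
    # Assumption: average the two triangles so the matrix is exactly symmetric.
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)

    # linkage() expects a condensed distance vector, not a square matrix.
    root = to_tree(linkage(squareform(dist, checks=False), method=method))

    def build(node, parent_height):
        # Branch length is the drop in merge height from parent to this node.
        length = parent_height - node.dist
        if node.is_leaf():
            return f"{labels[node.id]}:{length:.4f}"
        left = build(node.get_left(), node.dist)
        right = build(node.get_right(), node.dist)
        return f"({left},{right}):{length:.4f}"

    left = build(root.get_left(), root.dist)
    right = build(root.get_right(), root.dist)
    return f"({left},{right});"
```

Calling `ani_to_newick(matrix, genome_labels, method="complete")` would switch the linkage, which is the kind of control over the clustering method mentioned above.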

@peterjc
Collaborator

peterjc commented Oct 19, 2021

Yep, I will try the R code snippet on the linked issue for this.

@widdowquinn
Owner Author

@baileythegreen is working on this, just now. There should be a pyani tree option in v0.3 coming soon.

@baileythegreen
Contributor

baileythegreen commented Oct 20, 2021

I have initially implemented this as an option within pyani plot, so that Newick formatted output and plotted dendrograms are created using an additional option in the pyani plot parser. This doesn't yet afford much control over which Newick files or dendrograms are created, but that will be possible, eventually.

The tree_186 branch will currently create a Newick file and a dendrogram for each axis of each matrix, named accordingly. Species names (along with the genome number) get added to the dendrograms, but the Newick files only show the numbers.

If you wanted to try this now, from the tree_186 branch you would run:

pyani plot -o <output_file> --run_id <run_id> --dbpath <database_file> -l <log_file> --tree

but be warned it will generate 16 plots and 10 Newick files.

It also adds some new dependencies.
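Since the Newick files carry only the genome numbers, here is a rough sketch of how the leaves could be relabelled afterwards. The function `relabel_newick` and the `names` lookup are hypothetical (you would have to build the number-to-name mapping yourself, e.g. from the run's genome table), and it assumes the replacement names contain no characters that need quoting in Newick, such as spaces, commas, colons or parentheses.

```python
import re


def relabel_newick(newick, names):
    """Replace numeric leaf labels in a Newick string using a lookup dict."""

    def swap(match):
        leaf = match.group(2)
        # Leave the label untouched if the lookup has no entry for it.
        return match.group(1) + names.get(leaf, leaf) + ":"

    # A leaf label directly follows "(" or "," and is followed by ":".
    # Internal branch lengths follow "):" so they are never matched.
    return re.sub(r"([(,])(\d+):", swap, newick)
```

For example, `relabel_newick("((7:0.05,2:0.05):0.20,4:0.25);", {"7": "genome_seven"})` would rename leaf 7 and leave the other leaves and all branch lengths as they were.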

@baileythegreen
Contributor

Newick file output has been modified so there is only one file, with the 'name' of the tree inside a comment.

Thus far, I know this format works both with libraries I have tested in Python, and with the https://phylo.io site Peter linked.

[col_newick_identity_run1]	((((((((7:0.05,2:0.05):0.20,4:0.25):0.02,(34:0.16,30:0.16):0.11):0.04,((39:0.01,11:0.01):0.00,43:0.01):0.29):0.03,(((44:0.02,23:0.02):0.16,(12:0.01,5:0.01):0.17):0.01,((41:0.01,26:0.01):0.12,49:0.14):0.06):0.15):0.02,((25:0.00,19:0.00):0.06,27:0.07):0.29):0.04,((((42:0.01,29:0.01):0.01,45:0.02):0.21,((48:0.00,22:0.00):0.00,8:0.00):0.23):0.02,((46:0.01,10:0.01):0.00,18:0.01):0.24):0.15):0.05,(((((47:0.03,20:0.03):0.02,35:0.05):0.18,36:0.23):0.03,((40:0.03,33:0.03):0.04,13:0.06):0.20):0.07,((((((28:0.00,21:0.00):0.02,24:0.03):0.11,((38:0.00,14:0.00):0.00,37:0.00):0.13):0.03,((31:0.00,6:0.00):0.05,17:0.05):0.12):0.01,((32:0.00,9:0.00):0.01,15:0.01):0.17):0.06,((16:0.01,1:0.01):0.00,3:0.01):0.24):0.08):0.11);
[row_newick_identity_run1]	((((((((7:0.05,2:0.05):0.20,4:0.25):0.02,(34:0.16,30:0.16):0.11):0.04,((39:0.01,11:0.01):0.00,43:0.01):0.29):0.03,(((44:0.02,23:0.02):0.16,(12:0.01,5:0.01):0.17):0.01,((41:0.01,26:0.01):0.12,49:0.14):0.06):0.15):0.02,((25:0.00,19:0.00):0.06,27:0.07):0.29):0.04,((((42:0.01,29:0.01):0.01,45:0.02):0.21,((48:0.00,22:0.00):0.00,8:0.00):0.23):0.02,((46:0.01,10:0.01):0.00,18:0.01):0.24):0.15):0.05,(((((47:0.03,20:0.03):0.02,35:0.05):0.18,36:0.23):0.03,((40:0.03,33:0.03):0.04,13:0.06):0.20):0.07,((((((28:0.00,21:0.00):0.02,24:0.03):0.11,((38:0.00,14:0.00):0.00,37:0.00):0.13):0.03,((31:0.00,6:0.00):0.05,17:0.05):0.12):0.01,((32:0.00,9:0.00):0.01,15:0.01):0.17):0.06,((16:0.01,1:0.01):0.00,3:0.01):0.24):0.08):0.11);
[col_newick_coverage_run1]	((((((7:0.15,2:0.15):0.89,4:1.04):0.23,((39:0.07,11:0.07):0.07,43:0.15):1.12):0.92,((((42:0.02,29:0.02):0.05,45:0.07):0.55,((48:0.01,22:0.01):0.01,8:0.01):0.61):0.04,((46:0.07,10:0.07):0.29,18:0.36):0.30):1.53):0.66,((((41:0.09,26:0.09):0.08,49:0.17):0.09,(44:0.11,23:0.11):0.14):0.09,(12:0.10,5:0.10):0.23):2.51):0.63,(((34:0.39,30:0.39):1.33,((25:0.01,19:0.01):0.24,27:0.25):1.47):0.76,(((((47:0.19,20:0.19):0.04,35:0.24):0.36,((40:0.13,33:0.13):0.05,13:0.18):0.42):0.09,36:0.68):0.18,((((((31:0.02,6:0.02):0.12,17:0.14):0.18,((38:0.01,37:0.01):0.02,14:0.03):0.29):0.01,((28:0.14,21:0.14):0.06,24:0.20):0.13):0.19,((32:0.01,9:0.01):0.10,15:0.11):0.40):0.02,((16:0.13,3:0.13):0.01,1:0.14):0.40):0.33):1.61):1.00);
[row_newick_coverage_run1]	(((((((39:0.07,11:0.07):0.07,43:0.15):0.82,4:0.97):0.32,(7:0.15,2:0.15):1.13):0.91,((((46:0.08,10:0.08):0.14,18:0.22):0.37,((42:0.02,29:0.02):0.06,45:0.08):0.51):0.06,((48:0.01,22:0.01):0.01,8:0.01):0.64):1.55):0.67,((((41:0.09,26:0.09):0.08,49:0.17):0.08,(44:0.11,23:0.11):0.14):0.08,(12:0.12,5:0.12):0.21):2.52):0.60,(((34:0.37,30:0.37):1.35,((25:0.01,19:0.01):0.19,27:0.20):1.52):0.68,(((((47:0.19,20:0.19):0.05,35:0.24):0.33,36:0.57):0.03,((40:0.16,33:0.16):0.03,13:0.19):0.41):0.19,((((((28:0.10,21:0.10):0.10,24:0.20):0.12,((38:0.02,37:0.02):0.01,14:0.03):0.29):0.01,((31:0.02,6:0.02):0.12,17:0.14):0.19):0.09,((16:0.14,1:0.14):0.06,3:0.20):0.22):0.08,((32:0.01,9:0.01):0.15,15:0.16):0.34):0.28):1.61):1.06);
[col_newick_aln_lengths_run1]	((((((34:1729666.71,30:1729666.71):6021310.16,((25:57790.43,19:57790.43):1102066.41,27:1159856.85):6591120.02):2813471.27,(((7:680495.33,2:680495.33):3917876.19,4:4598371.52):1855310.71,((39:385588.94,11:385588.94):396069.71,43:781658.64):5672023.59):4110765.91):127135.17,((((41:326365.60,26:326365.60):312321.94,49:638687.54):350057.03,(44:422480.70,23:422480.70):566263.87):301298.21,(12:403469.92,5:403469.92):886572.86):9401540.53):1776281.31,((((46:320686.03,10:320686.03):1422273.38,18:1742959.42):1314588.33,((42:76629.66,29:76629.66):268554.45,45:345184.11):2712363.63):218423.56,((48:29763.67,22:29763.67):37698.65,8:67462.33):3208508.97):9191893.31):3884533.89,(((((47:920299.65,20:920299.65):196364.64,35:1116664.29):1703901.93,((40:589103.58,33:589103.58):267177.40,13:856280.99):1964285.23):349841.61,36:3170407.83):988677.82,((((16:579547.62,3:579547.62):60666.90,1:640214.52):1885771.47,((32:70434.69,9:70434.69):477082.81,15:547517.50):1978468.48):51421.91,((((28:683319.16,21:683319.16):295324.41,24:978643.58):594529.05,((38:45706.33,37:45706.33):80795.18,14:126501.51):1446671.11):54092.57,((31:94617.41,6:94617.41):610986.06,17:705603.46):921661.73):950142.71):1581677.75):12193312.86);
[row_newick_aln_lengths_run1]	((((((34:1729666.71,30:1729666.71):6021310.16,((25:57790.43,19:57790.43):1102066.41,27:1159856.85):6591120.02):2813471.27,(((7:680495.33,2:680495.33):3917876.19,4:4598371.52):1855310.71,((39:385588.94,11:385588.94):396069.71,43:781658.64):5672023.59):4110765.91):127135.17,((((41:326365.60,26:326365.60):312321.94,49:638687.54):350057.03,(44:422480.70,23:422480.70):566263.87):301298.21,(12:403469.92,5:403469.92):886572.86):9401540.53):1776281.31,((((46:320686.03,10:320686.03):1422273.38,18:1742959.42):1314588.33,((42:76629.66,29:76629.66):268554.45,45:345184.11):2712363.63):218423.56,((48:29763.67,22:29763.67):37698.65,8:67462.33):3208508.97):9191893.31):3884533.89,(((((47:920299.65,20:920299.65):196364.64,35:1116664.29):1703901.93,((40:589103.58,33:589103.58):267177.40,13:856280.99):1964285.23):349841.61,36:3170407.83):988677.82,((((16:579547.62,3:579547.62):60666.90,1:640214.52):1885771.47,((32:70434.69,9:70434.69):477082.81,15:547517.50):1978468.48):51421.91,((((28:683319.16,21:683319.16):295324.41,24:978643.58):594529.05,((38:45706.33,37:45706.33):80795.18,14:126501.51):1446671.11):54092.57,((31:94617.41,6:94617.41):610986.06,17:705603.46):921661.73):950142.71):1581677.75):12193312.86);
[col_newick_sim_errors_run1]	(((((((31:7951.26,6:7951.26):226334.73,17:234286.00):432930.35,((32:5819.48,9:5819.48):35993.47,15:41812.94):625403.40):45283.52,(((28:61050.96,21:61050.96):69708.02,24:130758.98):448708.50,((38:5065.01,37:5065.01):3643.35,14:8708.36):570759.12):133032.38):178830.84,((16:47324.48,1:47324.48):1570.55,3:48895.03):842435.68):70755.85,((((47:131768.58,20:131768.58):73675.48,35:205444.06):454534.31,36:659978.37):126937.89,((40:107620.12,33:107620.12):147284.15,13:254904.27):532012.00):175170.29):402086.76,(((((44:74423.27,23:74423.27):486600.08,(12:26988.70,5:26988.70):534034.65):32481.97,((41:50073.89,26:50073.89):403490.32,49:453564.21):139941.11):248590.32,((((25:4576.29,19:4576.29):266115.40,27:270691.68):75627.70,30:346319.38):82470.23,34:428789.61):413306.03):67447.45,((((((7:184456.97,2:184456.97):227902.71,4:412359.68):98009.14,((39:55158.79,11:55158.79):4652.39,43:59811.18):450557.63):89079.90,((46:28134.13,10:28134.13):120563.12,18:148697.25):450751.46):67935.11,((42:8742.68,29:8742.68):87944.86,45:96687.55):570696.28):66128.94,((48:1673.39,22:1673.39):1350.79,8:3024.18):730488.58):176030.33):454630.22);
[row_newick_sim_errors_run1]	(((((((31:7951.26,6:7951.26):226334.73,17:234286.00):432930.35,((32:5819.48,9:5819.48):35993.47,15:41812.94):625403.40):45283.52,(((28:61050.96,21:61050.96):69708.02,24:130758.98):448708.50,((38:5065.01,37:5065.01):3643.35,14:8708.36):570759.12):133032.38):178830.84,((16:47324.48,1:47324.48):1570.55,3:48895.03):842435.68):70755.85,((((47:131768.58,20:131768.58):73675.48,35:205444.06):454534.31,36:659978.37):126937.89,((40:107620.12,33:107620.12):147284.15,13:254904.27):532012.00):175170.29):402086.76,(((((44:74423.27,23:74423.27):486600.08,(12:26988.70,5:26988.70):534034.65):32481.97,((41:50073.89,26:50073.89):403490.32,49:453564.21):139941.11):248590.32,((((25:4576.29,19:4576.29):266115.40,27:270691.68):75627.70,30:346319.38):82470.23,34:428789.61):413306.03):67447.45,((((((7:184456.97,2:184456.97):227902.71,4:412359.68):98009.14,((39:55158.79,11:55158.79):4652.39,43:59811.18):450557.63):89079.90,((46:28134.13,10:28134.13):120563.12,18:148697.25):450751.46):67935.11,((42:8742.68,29:8742.68):87944.86,45:96687.55):570696.28):66128.94,((48:1673.39,22:1673.39):1350.79,8:3024.18):730488.58):176030.33):454630.22);
[col_newick_hadamard_run1]	((((((34:0.49,30:0.49):1.23,((25:0.01,19:0.01):0.28,27:0.29):1.43):0.41,(((7:0.19,2:0.19):0.93,4:1.12):0.24,((39:0.08,11:0.08):0.07,43:0.16):1.21):0.77):0.27,((((46:0.07,10:0.07):0.27,18:0.35):0.43,((42:0.02,29:0.02):0.07,45:0.09):0.69):0.05,((48:0.01,22:0.01):0.01,8:0.01):0.82):1.57):0.25,((((41:0.10,26:0.10):0.18,49:0.28):0.12,(44:0.13,23:0.13):0.28):0.08,(12:0.11,5:0.11):0.37):2.17):0.61,(((((47:0.21,20:0.21):0.06,35:0.28):0.49,((40:0.15,33:0.15):0.08,13:0.23):0.54):0.04,36:0.80):0.25,((((((28:0.13,21:0.13):0.09,24:0.22):0.21,((38:0.01,37:0.01):0.02,14:0.02):0.41):0.03,((31:0.02,6:0.02):0.17,17:0.18):0.28):0.18,((32:0.01,9:0.01):0.11,15:0.12):0.53):0.05,((16:0.14,3:0.14):0.01,1:0.15):0.55):0.35):2.21);
[row_newick_hadamard_run1]	((((((34:0.48,30:0.48):1.24,((25:0.01,19:0.01):0.24,27:0.26):1.47):0.44,((((39:0.08,11:0.08):0.07,43:0.16):0.93,4:1.09):0.28,(7:0.19,2:0.19):1.17):0.80):0.25,((((46:0.08,10:0.08):0.14,18:0.22):0.53,((42:0.02,29:0.02):0.08,45:0.09):0.65):0.07,((48:0.01,22:0.01):0.01,8:0.01):0.80):1.59):0.25,((((41:0.10,26:0.10):0.18,49:0.28):0.12,(44:0.13,23:0.13):0.28):0.08,(12:0.12,5:0.12):0.35):2.18):0.57,(((((47:0.21,20:0.21):0.06,35:0.28):0.43,36:0.71):0.05,((40:0.17,33:0.17):0.06,13:0.24):0.52):0.22,((((((28:0.09,21:0.09):0.13,24:0.22):0.21,((38:0.02,37:0.02):0.01,14:0.02):0.41):0.04,((31:0.02,6:0.02):0.17,17:0.19):0.28):0.13,((16:0.15,1:0.15):0.05,3:0.20):0.40):0.05,((32:0.01,9:0.01):0.14,15:0.15):0.49):0.33):2.25);
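If you want to pull individual trees out of this combined file without a phylogenetics library, something like the sketch below might do. It assumes exactly the layout shown above (a bracketed name, whitespace, then one Newick tree ending in `;` per line); `read_named_trees` and `leaf_labels` are illustrative helper names, not part of pyani.

```python
import re

# One tree per line: "[name]<whitespace>(...);"
TREE_LINE = re.compile(r"^\[(?P<name>[^\]]+)\]\s+(?P<tree>.+;)\s*$")


def read_named_trees(text):
    """Return a dict mapping each bracketed tree name to its Newick string."""
    trees = {}
    for line in text.splitlines():
        match = TREE_LINE.match(line)
        if match:  # silently skip blank or unrecognised lines
            trees[match.group("name")] = match.group("tree")
    return trees


def leaf_labels(newick):
    """List leaf labels in the order they appear in the Newick string."""
    # Leaves directly follow "(" or "," and are terminated by ":".
    return re.findall(r"[(,]([^(),:;]+):", newick)
```

Comparing `sorted(leaf_labels(...))` between, say, the row and column trees of one matrix is a quick sanity check that both contain the same set of genomes.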

@widdowquinn
Owner Author

Can confirm the format works with FigTree - thanks @baileythegreen

@peterjc
Collaborator

peterjc commented Dec 14, 2021

I wrongly assumed this was already on master. Is https://github.com/widdowquinn/pyani/commits/tree_186 the latest version of pyani plot --tree if I wanted to try this out?

@baileythegreen
Contributor

It is. Right now it plots trees as part of the plotting subcommand, and does not offer much customisation. I am in the process of refining this, and also creating a separate subcommand that allows more customisation. All of that is to be done on the tree_186 branch.

@peterjc
Collaborator

peterjc commented Jan 5, 2022

There isn't a (draft) PR for tree_186 yet, is there? I would have comments. To start with, it needs to declare the added ete3 dependency.

Also I seem to be missing some graphical dependency stuff... getting these warnings multiple times (despite not wanting to actually use the display).

WARNING: QApplication was not created in the main() thread.
qt.qpa.xcb: could not connect to display 
qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

The above seems not to write a tree (i.e. something aborts after the warnings).
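One thing that may be worth trying on a machine without a display is Qt's "offscreen" platform plugin, selected through the `QT_QPA_PLATFORM` environment variable. This is a general Qt setting rather than anything specific to pyani or ete3, and whether it is enough for the tree_186 branch is untested here; the helper name `use_headless_qt` is made up for the sketch.

```python
import os


def use_headless_qt(env=None):
    """Ask Qt to render offscreen unless a platform is already selected."""
    env = os.environ if env is None else env
    # setdefault() keeps any value the user has already exported.
    env.setdefault("QT_QPA_PLATFORM", "offscreen")
    return env["QT_QPA_PLATFORM"]


# Must run before any Qt-based module is imported, because Qt reads the
# variable when the application object is first created.
use_headless_qt()
```

The shell equivalent would be to export `QT_QPA_PLATFORM=offscreen` before running the command.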

It looks like right now the branch only adds the tree to the heatmap code in the seaborn plotting pyani/pyani_graphics/sns/__init__.py (others still to be implemented), but adding any --method XXX argument including --method seaborn seems to skip the trees.

@baileythegreen
Contributor

Hi Peter,

There was not a draft PR for this; but I've made one here, to make it easier for you to give feedback/comments.

The ete3 dependency is listed in requirements.txt on the tree_186 branch. In master, this file is used when installing, but if you are using your normal installation and have just switched to this branch, ete3 probably won't have been installed, as that package has not been used elsewhere in pyani. You will also need PyQt (or PyQt5), which I think might address your graphical dependency warnings (I haven't seen those before). This is also listed in the requirements.txt file.

I've responded to your comment on the last commit here. That's an issue I need to solve: the test suite on my computer seems happy with the current version, but CircleCI here is failing at this point. I'm testing a potential solution right now, but may also discuss this with Leighton tomorrow.

> It looks like right now the branch only adds the tree to the heatmap code in the seaborn plotting pyani/pyani_graphics/sns/__init__.py (others still to be implemented), but adding any --method XXX argument including --method seaborn seems to skip the trees.

I will look into this; it seems odd.

@peterjc
Collaborator

peterjc commented Jan 10, 2022

Thank you!

#370

@baileythegreen baileythegreen changed the title from "Add tree output to pyani" to "Add pyani tree subcommand" May 11, 2022
@baileythegreen baileythegreen linked a pull request May 11, 2022 that will close this issue
21 tasks
@github-project-automation github-project-automation bot moved this to In progress in pyani Aug 26, 2024
Projects
Status: In progress

3 participants