Syngraph

Toolkit for evolutionary analyses of linkage groups

Dependencies

Dependencies are most easily installed with conda:

$ conda install -c conda-forge networkx pandas docopt tqdm ete3 pygraphviz

Usage

Usage: syngraph <module> [<args>...] [-D -V -h]

  [Modules]
    build               Build graph from orthology data (e.g. BUSCO *.full_table.tsv)
    infer               Model rearrangements over a tree
    query               Get info on inferred ancestral genomes [TBI]
    viz                 Visualise graph/data [TBI]
    
  [Options]
    -h, --help          Show this screen.
    -D, --debug         Print debug information [TBI]
    -v, --version       Show version

  [Dependencies] 
    ------------------------------------------------------------------------------
    | $ conda install -c conda-forge networkx pandas docopt tqdm ete3 pygraphviz |
    ------------------------------------------------------------------------------

Build a syngraph from BUSCO data, allowing for missingness

syngraph build -d directory_of_tsv_files -m -o test

Model fissions and fusions over a tree, recording rearrangements with taxon_1 as the reference

syngraph infer -g test.pickle -t newick.txt -r 2 -s taxon_1 -o test

Model translocations, fissions and fusions over a tree

syngraph infer -g test.pickle -t newick.txt -r 3 -s taxon_1 -o test

Input data

When inferring rearrangements, input data should contain markers from chromosome-scale sequences only, because markers on unscaffolded contigs cause excess fission events to be inferred.
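One way to spot sequences that may be unscaffolded contigs is to count how many markers each sequence carries and set aside those with very few. The sketch below is not part of syngraph: the function names and the `min_markers` threshold are hypothetical, and a low marker count is only a heuristic hint, so the result should be checked against the assembly itself.

```python
from collections import Counter


def markers_per_sequence(rows):
    """Count how many markers sit on each sequence.

    rows: iterable of (busco_id, status, sequence, start, end) tuples.
    """
    return Counter(row[2] for row in rows)


def keep_large_sequences(rows, min_markers):
    """Keep only rows whose sequence carries at least min_markers markers.

    min_markers is an illustrative threshold (an assumption, not a
    syngraph option); choose it with knowledge of the assembly.
    """
    rows = list(rows)
    counts = markers_per_sequence(rows)
    return [row for row in rows if counts[row[2]] >= min_markers]
```

For example, with two markers on a hypothetical sequence `chr1` and one on `ctg9`, `keep_large_sequences(rows, 2)` would return only the `chr1` rows.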

If using BUSCO data, TSV files should be named My_taxon.*.tsv, where My_taxon is also a leaf in the Newick tree. Each file should contain the first five columns of the *full_table.tsv file generated by BUSCO (Busco_id, Status, Sequence, Gene_Start, Gene_End). For example:

0at7088 Complete        HG995313.1      5723272 5863707
1at7088 Complete        HG995286.1      19966914        20084934
2at7088 Complete        HG995296.1      11128843        11215510
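A file in this shape could be prepared from a raw BUSCO table along the following lines. This is a minimal sketch rather than part of syngraph: both function names are hypothetical, and dropping rows with fewer than five fields (such as Missing entries, which have no coordinates) is an assumption about what is wanted.

```python
def trim_busco_table(lines):
    """Return tab-joined rows holding the first five fields of each data line."""
    trimmed = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and the '#' comment/header lines BUSCO writes.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Assumption: rows without coordinates are dropped.
        if len(fields) < 5:
            continue
        trimmed.append("\t".join(fields[:5]))
    return trimmed


def taxon_from_filename(filename):
    """Return the part of the file name before the first dot."""
    return filename.split(".", 1)[0]
```

`taxon_from_filename("My_taxon.busco.tsv")` gives `"My_taxon"`, which can then be compared with the leaf names of the Newick tree. Note that this simple split assumes taxon names contain no dots.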