LohseLab/syngraph

 
 

Syngraph

Toolkit for evolutionary analyses of linkage groups

Dependencies

Best installed via conda:

$ conda install -c conda-forge networkx pandas docopt tqdm ete3 pygraphviz

Usage

Usage: syngraph <module> [<args>...] [-D -V -h]

  [Modules]
    build               Build graph from orthology data (e.g. BUSCO *.full_table.tsv)
    infer               Model rearrangements over a tree
    query               Get info on inferred ancestral genomes [TBI]
    viz                 Visualise graph/data [TBI]
    
  [Options]
    -h, --help          Show this screen.
    -D, --debug         Print debug information [TBI]
    -v, --version       Show version

  [Dependencies] 
    ------------------------------------------------------------------------------
    | $ conda install -c conda-forge networkx pandas docopt tqdm ete3 pygraphviz |
    ------------------------------------------------------------------------------

Build a syngraph from BUSCO data, allowing for missingness

syngraph build -d directory_of_tsv_files -m -o test

Model fissions and fusions over a tree, recording rearrangements with taxon_1 as the reference

syngraph infer -g test.pickle -t newick.txt -r 2 -s taxon_1 -o test

Model translocations, fissions and fusions over a tree

syngraph infer -g test.pickle -t newick.txt -r 3 -s taxon_1 -o test
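
For scripted runs, the commands above can be assembled in Python. The following is a minimal sketch that only builds the argument lists. The helper names `build_command` and `infer_command` are hypothetical, and the flag meanings (`-m` for missingness, `-r 2` versus `-r 3`) are read off the examples above rather than taken from the tool's full help text.

```python
import shlex


def build_command(tsv_dir, prefix, allow_missing=True):
    """Argument list for `syngraph build`, using only flags shown above."""
    cmd = ["syngraph", "build", "-d", tsv_dir]
    if allow_missing:
        cmd.append("-m")  # assumed to be the missingness switch from the example
    cmd += ["-o", prefix]
    return cmd


def infer_command(graph, tree, reference, prefix, translocations=False):
    """Argument list for `syngraph infer`.

    In the examples above, -r 2 models fissions and fusions,
    and -r 3 additionally models translocations.
    """
    rearrangements = "3" if translocations else "2"
    return [
        "syngraph", "infer",
        "-g", graph,
        "-t", tree,
        "-r", rearrangements,
        "-s", reference,
        "-o", prefix,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    print(shlex.join(build_command("directory_of_tsv_files", "test")))
    print(shlex.join(infer_command("test.pickle", "newick.txt", "taxon_1", "test")))
```

To execute a command rather than print it, each list can be passed to `subprocess.run(cmd, check=True)`.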

Input data

When inferring rearrangements, input data should only contain markers from chromosome-scale sequences, because markers on unscaffolded contigs will cause excess fission events to be inferred.
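
One way to apply this restriction before building the graph is to filter marker rows against a list of sequences known to be chromosome-scale. The sketch below assumes each row is a list of fields with the sequence name in the third position, matching the BUSCO-style format used in this README. The function name `keep_chromosome_markers` and the set of accepted sequence names are hypothetical and would come from your own assembly metadata.

```python
def keep_chromosome_markers(rows, chromosome_ids):
    """Return only the rows whose sequence (third field) is chromosome-scale.

    rows           -- iterable of field lists: [id, status, sequence, start, end]
    chromosome_ids -- set of sequence names to keep (user supplied)
    """
    return [row for row in rows if len(row) >= 3 and row[2] in chromosome_ids]


if __name__ == "__main__":
    rows = [
        ["0at7088", "Complete", "HG995313.1", "5723272", "5863707"],
        # "contig_17" is a made-up unscaffolded contig name for illustration.
        ["9at7088", "Complete", "contig_17", "1200", "4800"],
    ]
    kept = keep_chromosome_markers(rows, {"HG995313.1"})
    for row in kept:
        print("\t".join(row))
```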

If using BUSCO data, tsv files should be named My_taxon.*.tsv, where My_taxon is also a leaf in the Newick tree. Each file should contain the first five columns of the *.full_table.tsv file generated by BUSCO (Busco_id, Status, Sequence, Gene_Start, Gene_End). For example:

0at7088 Complete        HG995313.1      5723272 5863707
1at7088 Complete        HG995286.1      19966914        20084934
2at7088 Complete        HG995296.1      11128843        11215510
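
A file in this shape can be derived from a BUSCO full table with a few lines of Python. This is a minimal sketch, not part of syngraph: the function names are hypothetical, comment lines starting with `#` are skipped, and rows with fewer than five fields (such as those BUSCO reports as Missing) are dropped, which is an assumption about what the tool expects.

```python
import csv


def trim_busco_table(lines):
    """Yield the first five tab-separated fields of each data row."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and BUSCO header comments
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # assumption: rows without coordinates are not needed
        yield fields[:5]


def write_trimmed(in_path, out_path):
    """Write the trimmed table to out_path, e.g. My_taxon.busco.tsv."""
    with open(in_path) as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerows(trim_busco_table(src))
```

The output file name should start with the taxon name used in the Newick tree, as described above.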
