Broccoli is designed to infer orthologous groups and orthologous pairs of proteins with high precision, using a mixed phylogeny-network approach. Briefly, Broccoli performs ultra-fast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm (label propagation). Broccoli also detects chimeric proteins resulting from gene-fusion events and assigns these proteins to the corresponding orthologous groups.
Reference: Broccoli: combining phylogenetic and network analyses for orthology assignment
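To give an intuition for the label propagation step mentioned above, here is a minimal, generic sketch of the technique on a toy network. It is not Broccoli's implementation: the function name, the protein names, the unweighted edges, the deterministic tie-breaking rule and the `max_iter` safeguard are all assumptions made for illustration.

```python
from collections import Counter


def label_propagation(edges, max_iter=100):
    """Group nodes of an undirected, unweighted network by label propagation.

    Illustrative sketch only (hypothetical helper, not Broccoli's code).
    """
    # Build the adjacency sets of the network.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Every node starts with its own unique label.
    labels = {node: node for node in neighbors}

    for _ in range(max_iter):
        changed = False
        # Nodes are visited in sorted order so the result is reproducible.
        for node in sorted(neighbors):
            counts = Counter(labels[n] for n in neighbors[node])
            best = max(counts.values())
            # Adopt the most frequent neighbor label; ties go to the
            # alphabetically smallest label (an arbitrary, assumed rule).
            new_label = min(lab for lab, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break

    # Nodes sharing a label form one group.
    groups = {}
    for node, label in labels.items():
        groups.setdefault(label, set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Toy network: two triangles of hypothetical proteins joined by one weak link.
toy_edges = [
    ("A_sp1", "A_sp2"), ("A_sp2", "A_sp3"), ("A_sp1", "A_sp3"),
    ("B_sp1", "B_sp2"), ("B_sp2", "B_sp3"), ("B_sp1", "B_sp3"),
    ("A_sp3", "B_sp3"),
]
toy_groups = label_propagation(toy_edges)
print(toy_groups)
```

On this toy network the two densely connected triangles end up in separate groups despite the single link between them. Results of label propagation generally depend on the visiting order and tie-breaking, which is why both are fixed here.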
To run Broccoli, you need (see the manual for installation advice):
- a Unix system (macOS or Linux)
- Python version 3.6 or above
- ete3 library
- Diamond version 0.9.30 or above
- FastTree version 2.1.11 or above (single-thread version)
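The requirements above can be roughly checked before a first run. The sketch below is not part of Broccoli: the function name is hypothetical, and the executable names `diamond` and `FastTree` are assumptions about how the tools are installed on your system. It only tests for presence and does not verify the minimum versions of Diamond or FastTree.

```python
import importlib.util
import shutil
import sys


def check_prerequisites(diamond_exe="diamond", fasttree_exe="FastTree"):
    """Return a dict mapping each requirement to True (found) or False.

    Hypothetical helper; executable names are assumptions and tool
    versions are NOT checked.
    """
    return {
        "python>=3.6": sys.version_info >= (3, 6),
        # find_spec locates the package without importing it.
        "ete3": importlib.util.find_spec("ete3") is not None,
        # shutil.which searches the directories listed in PATH.
        "diamond": shutil.which(diamond_exe) is not None,
        "fasttree": shutil.which(fasttree_exe) is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If one of the tools is installed under a different name or outside your PATH, pass the name or full path through the corresponding argument of the helper.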
All parameters and options are listed with the -help argument (see also the manual for more details):
python broccoli.py -help
To test Broccoli with the small example dataset provided in the example_dataset directory (30 sec to 1 min):
python broccoli.py -dir example_dataset
Broccoli will store the temporary and output files in 4 directories named dir_step1 to dir_step4 (one for each step), located in the current directory.
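After a run, a quick way to confirm that all four step directories were created is sketched below. The helper names are hypothetical; only the directory names dir_step1 to dir_step4 come from the description above, and the sketch makes no assumption about the files inside them.

```python
from pathlib import Path


def step_directories(base="."):
    """Return the four expected step directories under `base`."""
    return [Path(base) / f"dir_step{i}" for i in range(1, 5)]


def missing_step_directories(base="."):
    """Return the names of expected step directories that do not exist."""
    return [d.name for d in step_directories(base) if not d.is_dir()]


if __name__ == "__main__":
    missing = missing_step_directories()
    if missing:
        print("Missing step directories:", ", ".join(missing))
    else:
        print("All four step directories are present.")
```

Run it from the directory in which Broccoli was launched, or pass that directory as `base`.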
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
See "LICENSE" for full terms and conditions of usage.