IFDP

The Inferred Fiber Degradation computational framework couples metagenomic sequencing with careful annotation of polysaccharide-degrading enzymes and dietary fiber (DF) structures. It was established to allow in-depth characterization of the microbiome's ability to degrade dietary fibers.

The framework generates the Inferred Fiber Degradation Profile (IFDP) in a single command. The command requires the database (pre-built, or re-built with a different version of DIAMOND) and a FASTA (or FASTQ) file.

Dependencies:

DIAMOND (the database was built with v0.9.9; it can be re-built with another version from the attached FASTA file)
NumPy
pandas

Installation

1. Install dependencies:

conda install numpy pandas diamond ## install dependencies

2. Download the repo:

git clone https://github.com/borenstein-lab/IFDP.git

3. Extract the database and build it with DIAMOND (run from inside the cloned IFDP directory):

gunzip ec_full.fasta.gz
diamond makedb --in ec_full.fasta -d ec_full

4. Test the installation by running this example:

run_sample.sh -d ec_full.dmnd -i GCF_002075875.1_Bbif1898B_genomic.fna -o output

Run the pipeline with a single one-line command:

run_sample.sh -d [DATABASE] -i [INPUT] -o [OUTPUT]

To run the pipeline from any directory, add this line to your .bashrc file, or run it in your shell before running the pipeline. Replace the path with the location of your own IFDP clone:

export PATH=$PATH:/path/to/IFDP/
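The PATH setup above can be made safe to repeat, which helps when the line lives in .bashrc and gets sourced many times. The following is a minimal sketch, assuming the repository was cloned into $HOME/IFDP. The variable name IFDP_DIR and that location are illustrative assumptions, not something the pipeline itself defines.

```shell
# Assumption: the repo was cloned to $HOME/IFDP; set IFDP_DIR beforehand to override.
IFDP_DIR="${IFDP_DIR:-$HOME/IFDP}"

# Append the directory to PATH only if it is not already listed,
# so sourcing this snippet repeatedly does not grow PATH.
case ":$PATH:" in
  *":$IFDP_DIR:"*) ;;                      # already present, nothing to do
  *) export PATH="$PATH:$IFDP_DIR" ;;      # append once
esac
```

After this, `command -v run_sample.sh` should print the script location if the clone is where IFDP_DIR points.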

Tutorial output and testing

For easy testing of the framework, we have uploaded three genomes as simple use cases. They are relatively small and have modest memory and run-time requirements. To run any of them, use the command below and change the genome file name.

run_sample.sh -d ec_full.dmnd -i GCF_002075875.1_Bbif1898B_genomic.fna -o output

To run the pipeline and explore the results, specify the database to use, the input FASTA/FASTQ file, and a name for the DIAMOND output:

run_sample.sh -d [DATABASE] -i [INPUT] -o [OUTPUT]

You can also specify the number of threads using the -p argument.
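To process all three example genomes in one go, the command can be wrapped in a loop. This is a sketch only: run_all_genomes is a hypothetical helper, not part of the repository, and the `<genome>_output` naming is an arbitrary choice. It assumes run_sample.sh is on your PATH and that the genome files end in .fna.

```shell
# Hypothetical helper (not shipped with IFDP): run the pipeline on every
# .fna file in a directory, using the -d, -i, -o and -p options shown above.
run_all_genomes() {
  db="$1"              # DIAMOND database, e.g. ec_full.dmnd
  dir="$2"             # directory holding the .fna genome files
  threads="${3:-1}"    # thread count passed to -p (defaults to 1)

  for genome in "$dir"/*.fna; do
    [ -e "$genome" ] || continue            # skip if the glob matched nothing
    name=$(basename "$genome" .fna)         # strip directory and extension
    run_sample.sh -d "$db" -i "$genome" -o "${name}_output" -p "$threads"
  done
}
```

For example, `run_all_genomes ec_full.dmnd . 4` would run each genome in the current directory with four threads.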

Three outputs will be visible following the completion of the run:

[OUTPUT] - The diamond mapping output file

[OUTPUT]_counts - The enzyme counts

[OUTPUT]_IFDP - The IFDP profile
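A quick way to confirm that a run completed is to check that all three files exist and are non-empty. The sketch below assumes the files are named exactly as listed above, with no extra extension. check_ifdp_outputs is a hypothetical helper, not part of the repository.

```shell
# Hypothetical helper (not shipped with IFDP): verify the three outputs
# listed above for a given output name. Returns 0 only if all are present.
check_ifdp_outputs() {
  prefix="$1"
  missing=0
  for f in "$prefix" "${prefix}_counts" "${prefix}_IFDP"; do
    if [ -s "$f" ]; then                   # -s: file exists and is not empty
      echo "ok: $f"
    else
      echo "missing or empty: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, after the test run from the installation section, `check_ifdp_outputs output` should report all three files if the run finished normally.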
