Eukaryotic draft bins with EukRep and EukCC
Francisco Zorrilla edited this page Mar 22, 2021 · 2 revisions
EukRep is implemented as follows:
rule eukrep:
    input:
        f'/home/fz274/rds/hpc-work/routy/all_bins/dump'
    output:
        f'/home/fz274/rds/hpc-work/routy/all_bins/euks_all.csv'
    message:
        """
        Filters CONCOCT bins for eukaryotic MAGs/contigs with the EukRep script filter_euk_bins.py.
        Parameters are set very loosely to capture any bin of at least 1 Mbp that contains at least 1 kbp of eukaryotic DNA.
            1. minl = minimum contig length, matching min_contig length from assembly = 1 kbp
            2. eukratio = ratio of euk to prok DNA in bins = 0
            3. minbp = minimum MAG length = 1 Mbp
            4. minbpeuks = minimum euk length in MAG = 1 kbp
        Assumes that CONCOCT bins have been dumped into folder /home/fz274/rds/hpc-work/routy/all_bins/dump
        """
    shell:
        """
        set +u;source activate {config[envs][metabagpipes]};set -u;
        cd $(dirname {input})
        filter_euk_bins.py --output euks_all.csv \
            --threads 56 \
            --minl 1000 \
            --eukratio 0 \
            --minbp 1000000 \
            --minbpeuks 1000 \
            dump/*.fa
        """
EukCC is implemented as follows:
rule eukcc:
    input:
        f'/home/fz274/rds/hpc-work/routy/all_bins/mags/{{binIDs}}.fa'
    output:
        f'/home/fz274/rds/hpc-work/routy/all_bins/eukcc/{{binIDs}}'
    message:
        """
        Grabs MAGs from the input folder and runs them in series through EukCC to get
        completeness and lineage info. Bins with > 0.5 Mbp were included, ~300 in total.
        Tried running with pygmes as shown below, but it did not work:
        eukcc --db eukccdb -o . --ncorespplacer 1 --ncores 16 --pygmes --diamond uniref50_pygmes.dmnd genome.fna
        """
    shell:
        """
        set +u;source activate metagem2;set -u;
        eukcc --db /home/fz274/rds/hpc-work/eukcc/eukccdb --outdir {output} --ncorespplacer 1 --ncores 16 {input}
        """