If you want to quickly test yacrd and fpa, you can run:
./script/small_test.sh
This script downloads an E. coli Nanopore dataset, subsamples it, and runs yacrd, fpa, and a combination of both tools on the subsample.
The following tools need to be available in your PATH (a quick availability check is sketched after this list):
- seqtk 1.3-r106
- fpa 0.5
- yacrd 0.6
- dascrubber commit 0e90524 (you can follow the dascrubber-wrapper instructions to install all dascrubber requirements)
- snakemake 5.4.3
- wtdbg2 2.3
- miniasm 0.3-r179
- quast v5.0.2
- nucmer 4.0.0beta2
- bwa mem 0.7.17
- samtools 1.9
- ReferenceSeeker 1.2
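Before launching the pipelines, you can verify that everything is reachable. The sketch below is only a convenience; the executable names (e.g. quast.py, referenceseeker) are assumptions and may differ depending on how you installed the tools:

```
# Report which of the required executables are reachable in PATH.
# Executable names are assumptions and may differ with your installation method.
for tool in seqtk fpa yacrd snakemake wtdbg2 miniasm quast.py nucmer bwa samtools referenceseeker; do
    if command -v "$tool" > /dev/null 2>&1; then
        echo "OK      $tool -> $(command -v "$tool")"
    else
        echo "MISSING $tool"
    fi
done
```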
You also need to update the paths of these tools in the Snakemake pipeline files:
- miniscrub: pipeline/scrubbing.snakefile, line 136
- porechop: pipeline/analysis.snakefile, line 68
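If you prefer to script these edits, a sed one-liner per file works; the paths below are hypothetical placeholders, so check what actually sits on the referenced lines before running it:

```
# The old/new paths are hypothetical placeholders: replace them with the values
# found on line 136 of pipeline/scrubbing.snakefile and line 68 of pipeline/analysis.snakefile.
sed -i '136s|/old/path/to/miniscrub|/new/path/to/miniscrub|' pipeline/scrubbing.snakefile
sed -i '68s|/old/path/to/porechop|/new/path/to/porechop|' pipeline/analysis.snakefile
```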
If you run conda env create -f conda_env.yml, conda creates an environment named yacrd_fpa with all dependencies except dascrubber, miniscrub, ra, porechop and shasta.
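A typical session then looks like this (assuming, as stated above, that the environment defined in conda_env.yml is named yacrd_fpa):

```
conda env create -f conda_env.yml   # create the yacrd_fpa environment
conda activate yacrd_fpa            # activate it before running the pipelines
```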
Reference genomes:
- E. coli CFT073 5.231428 Mb
- D. melanogaster 143.726002 Mb
- C. elegans 100.2 Mb
- H. sapiens chr1 248.9 Mb
Reads:
- E. coli CFT073:
- Oxford Nanopore D. melanogaster
- Oxford Nanopore H. sapiens chr1
- PacBio RS P6-C4 C. elegans
- NCTC Sequel dataset
- PacBio RSII and Nanopore data from https://doi.org/10.1099/mgen.0.000294
To download all datasets, run script/dl_data.sh. Warning: this script can take a long time, as it downloads more than 65 datasets.
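Because the download is long, you may want to run the script detached from your terminal; the nohup pattern below is a generic shell idiom, not something required by the script:

```
# Run the download in the background and keep a log of its progress.
nohup ./script/dl_data.sh > dl_data.log 2>&1 &
tail -f dl_data.log
```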
- Run scrubbing + assembly + analysis:
snakemake --snakefile pipeline/uncorrected.snakefile all
- Run fpa + assembly + analysis:
snakemake --snakefile pipeline/fpa.snakefile all
- Run the comparison against the minimap + miniasm and yacrd + minimap + fpa + miniasm pipelines:
snakemake --snakefile pipeline/combo.snakefile all
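The commands above use Snakemake's defaults; the flags below (-n for a dry run, --cores for parallel jobs) are standard Snakemake options you may want to add, not requirements of these pipelines:

```
# Preview the jobs without running anything.
snakemake --snakefile pipeline/uncorrected.snakefile all -n
# Run with 8 parallel cores (adjust to your machine).
snakemake --snakefile pipeline/uncorrected.snakefile all --cores 8
```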
Get information about the reads (this can take a very long time):
./script/read_info.py
Get information about assembly:
./script/asm_info.py
Get information about the running time and memory usage of scrubbing and assembly:
./script/timming.py
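If you want to keep the reports, plain shell redirection is enough (the output file names below are arbitrary):

```
./script/read_info.py > read_info.txt
./script/asm_info.py  > asm_info.txt
./script/timming.py   > timming.txt
```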