MemPanG23 Assembly & Evaluation
- Understand my genome - Meryl v1.4 (installing with gcc 10+ is recommended)
- Create an assembly - Verkko v1.3.1 (installing through conda is recommended)
- Check the assembly graph on BandageNG (local)
- Label chromosomes, find telomeres - MashMap v2.0, seqtk v1.4
- Align long reads back to the assembly - Winnowmap2 v2.03, Samtools v1.17
- Collect anomalies
  - Merqury (clone latest from github; has dependencies on R v4.3.0 (argparse, ggplot2, and scales), Bedtools v2.30.0, Java Runtime Environment)
  - T2T-Polish (clone latest from github; no additional installation required for this course)
  - seqrequester (clone latest on github and install with gcc 10+)
- Browse through the data with IGV v2.16.1 (local). I prefer the "IGV and IGV Command Line" version, which makes it easy to adjust the memory limit.
- Browse through some real issues on HG002 diploid assembly with IGV (local, if time permits)
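Before starting, it can help to confirm that the command-line tools in the list above are reachable from your shell. The following is only a minimal sketch: the executable names (meryl, verkko, mashmap, seqtk, winnowmap, samtools, bedtools, Rscript, java) are assumed to be the usual command names for these tools and may differ in your installation, and `have` is a hypothetical helper, not part of any of the tools.

```shell
# Hypothetical helper: succeed if a command is found on the PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Assumed executable names for the tools listed above; adjust if yours differ.
for tool in meryl verkko mashmap seqtk winnowmap samtools bedtools Rscript java; do
  if have "$tool"; then
    echo "found:   $tool"
  else
    echo "missing: $tool"
  fi
done
```

Tools reported as missing may still be installed outside the PATH (for example under a conda environment that is not yet activated), so treat the output as a hint rather than a verdict.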
All inputs and results are available under a_thal.
Feel free to check those results and compare them with yours.
Note: Raw input sequence data (*.fq.gz files) is available under /opt/assembly_data/
export MERQURY=/opt/merqury
export T2T_Polish=/opt/T2T-Polish
export tools=/opt
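A quick way to confirm that the three variables point at real directories on your machine is sketched below. It assumes the exports above have already been run in the current shell; `check_dir` is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical helper: report whether a variable points at an existing directory.
check_dir() {
  # $1 = variable name (used as a label only), $2 = its value
  if [ -d "$2" ]; then
    echo "ok: $1=$2"
  else
    echo "missing: $1=$2"
    return 1
  fi
}

# The "|| true" keeps the session going even if a directory is absent.
check_dir MERQURY "${MERQURY:-}" || true
check_dir T2T_Polish "${T2T_Polish:-}" || true
check_dir tools "${tools:-}" || true
```

If any line prints "missing", the install location on your system probably differs from /opt, and the corresponding export should be adjusted before continuing.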
We will work in day3_assembly_evaluation.
mkdir -p ~/day3_assembly_evaluation
cd ~/day3_assembly_evaluation