These are release notes for Canu version 2.3, which was released on December 17th, 2024. Canu is specialized for assembly of single-molecule sequences. Full documentation can be found at http://canu.readthedocs.org/.
This release provides a stable, tested, and documented version of the software. The binary distributions should work on any relatively recent version of the respective OS and are the recommended way to install Canu. The source code distribution contains everything you need to create a binary distribution for your own specific OS.
Citation
- Koren S, Walenz BP, Berlin K, Miller JR, Phillippy AM. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Research. (2017).
- Koren S, Rhie A, Walenz BP, Dilthey AT, Bickhart DM, Kingan SB, Hiendleder S, Williams JL, Smith TPL, Phillippy AM. De novo assembly of haplotype-resolved genomes with trio binning. Nature Biotechnology. (2018).
- Nurk S, Walenz BP, Rhie A, Vollger MR, Logsdon GA, Grothe R, Miga KH, Eichler EE, Phillippy AM, Koren S. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Research. (2020).
Minimum Requirements
- 8GB minimum memory; 16GB strongly suggested
- Perl 5.12.0, or File::Path 2.08
- Java SE 8
- gnuplot 5.2 (optional, for generating diagnostic graphs)
- macOS before 10.10 Yosemite is not supported.
- Windows of any flavor is not supported.
See README.md for requirements to compile from source.
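The Perl requirement above can be checked before installing. The following is a minimal sketch, not part of Canu: `version_ge` is a hypothetical helper written for this example, and it compares only the numeric, dot-separated fields of the two version strings.

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A is >= dotted version B.
# Hypothetical helper for illustration; it is not shipped with Canu.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        # Compare field by field; missing fields count as 0.
        for (i = 1; i <= len; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Report the installed Perl version against the 5.12.0 minimum.
if command -v perl >/dev/null 2>&1; then
    perl_version=$(perl -e 'printf "%vd", $^V')
    if version_ge "$perl_version" 5.12.0; then
        echo "perl $perl_version: ok"
    else
        echo "perl $perl_version: older than 5.12.0"
    fi
else
    echo "perl: not found"
fi
```

The check covers Perl only; the Java and gnuplot requirements would need their own version parsing.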
Installation
Users can download Canu as source code or as pre-compiled binaries. The binary distribution is the recommended install method, assuming it is available for your platform. The source code package needs to be compiled and installed before it can be used.
To install from a binary distribution (recommended):
curl -LRO https://github.com/marbl/canu/releases/download/v2.3/canu-2.3.Linux-amd64.tar.xz
tar -xJf canu-2.3.*.tar.xz
On macOS (Apple Silicon only), download canu-2.3.Darwin-aarch64.tar.xz instead.
Confirm the MD5 matches the expected value:
50721b8440fd0e0926e833c03715d224 canu-2.3.Darwin-aarch64.tar.xz
5f5e537346f21e91393b0e5447f45bb3 canu-2.3.Linux-amd64.tar.xz
1cd4e97705b153caf7e8fdc55768c56f canu-2.3.tar.xz
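One way to do the comparison is sketched below. `check_md5` is a hypothetical helper written for this example; it assumes `md5sum` (GNU coreutils, typical on Linux) or, failing that, the BSD `md5` tool found on macOS. The archive name and digest are taken from the list above.

```shell
#!/bin/sh
# check_md5 FILE EXPECTED: succeed when FILE's MD5 digest equals EXPECTED.
# Hypothetical helper for illustration; it is not shipped with Canu.
check_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        actual=$(md5sum "$1" | awk '{print $1}')   # GNU coreutils (Linux)
    else
        actual=$(md5 -q "$1")                      # BSD md5 (macOS)
    fi
    [ "$actual" = "$2" ]
}

# Verify the Linux binary archive if it is in the current directory.
archive=canu-2.3.Linux-amd64.tar.xz
if [ -f "$archive" ]; then
    if check_md5 "$archive" 5f5e537346f21e91393b0e5447f45bb3; then
        echo "$archive: OK"
    else
        echo "$archive: MD5 mismatch, download again" >&2
    fi
fi
```

For the macOS or source archives, substitute the matching file name and digest.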
Canu will be installed at canu-2.3/bin/canu.
To install from source code (DO NOT download the Source code files provided by GitHub, as these will not compile; use canu-2.3.tar.xz instead):
curl -L https://github.com/marbl/canu/releases/download/v2.3/canu-2.3.tar.xz --output canu-2.3.tar.xz
tar -xJf canu-2.3.tar.xz
cd canu-2.3/src
make -j 8
cd ..
Canu will be installed at canu-2.3/build/bin/canu.
Changes
Canu v2.3 IS (expected to be) compatible with assemblies started with Canu v2.2, v2.1, and v2.1.1 but NOT with any earlier version. However, we DO NOT recommend mixing versions.
- Support for Apple Silicon (aarch64) and Linux ARM64 builds. Issues #2261, #2269 and #2016, at least.
- Generate .bam outputs for contigs. b871d94 and many others.
- Support reading input sequence from sam/bam/cram.
- Remove the overlapper parameter because it would inadvertently set correction to use the very slow ovl overlapper. Issue #1924.
- Bogart memory requirements reduced. 8a09ff5, issue #1788.
Bug Fixes
- Fixed crash detecting reads in a cycle in bogart. afdf6db.
- Fixed crashes in consensus. 13de2b8, bf659b1, f40a204, 1a710da, 7041856.
- Fixed crash in read correction. e8c6822.
- Fix ignoring overlap size specified by user. f29343c.
- Fix runs with <100 reads. 7fb66bb.
- Fix out of bounds error in sequence iteration. 5453c3f.
- Fix default coverage sampling for HiFi reads. e0ed3bb, issue #2241.
- Fail earlier if more than 4095 sequence input files are supplied. 6eb6d2c, issue #1910.
Known Issues
See the issues page for up-to-date open issues, or to report a problem.
- Large memory usage and runtime for long reads (e.g., Nanopore) when using the overlapper=ovl algorithm, and during Overlap Error Adjustment. The -fast option enables a significantly faster algorithm, especially for nanopore data, but may produce slightly less contiguous assemblies.
- No support for trio binning of HiFi data. As a workaround, specify the HiFi data as -pacbio-raw and run only the haplotyping step (-haplotype) followed by assembly of the partitioned reads.
See the FAQ for many suggestions, including suggestions for specific data types, e.g., Nanopore r9 reads.
Goodbye.
Do not expect another release. This is it, folks. The sequencing technology has moved on and Canu is all but obsolete now. Thanks for all the feedback, citations and bug reports.
Legal
Canu is derived from Celera Assembler and includes code from many other projects. Most, but not all, of the code is GPL licensed. See the README.licenses file and individual source code files for details.