
0.12.0

@paoloshasta paoloshasta released this 12 May 20:37

Additions and improvements since release 0.11.1

  • Assembly mode 3.

    • Phased assembly using new computational techniques. A detailed description of these techniques will be made available at a later time.

    • This is a preliminary version, released in this form to encourage experimentation. It has known issues that will be addressed in future releases. Please share your experiences by filing issues in the Shasta GitHub repository.

    • Despite the known issues, it produces useful phased assemblies. See this presentation for an analysis of assembly results.

    • Initially supported only for the new high-accuracy Oxford Nanopore reads from the 2023.12 data release. Future releases may also support ONT R10 reads.

    • Invoke using --config Nanopore-ncm23-May2024. This assembly configuration was only tested on human genomes at coverage 40x to 60x, but may be functional at higher or lower coverage, within reasonable limits. It includes limited adaptivity to coverage. Only use with reads of accuracy comparable to the ONT 2023.12 data release.

    • Released with minimal usage documentation.

  • New alignment method 5 takes into account k-mer uniqueness when computing an alignment between two reads. Invoke with --Align.alignMethod 5.
    Used by assembly configuration Nanopore-ncm23-May2024.

  • Automatic setting of min/max bucket size using a simple heuristic. Invoked by setting both --MinHash.minBucketSize and --MinHash.maxBucketSize to 0. Used by assembly configuration Nanopore-ncm23-May2024.

  • Longer markers (command line option --Kmers.k). Maximum allowed marker length is now 31 bases for assembly mode 0 and 30 bases for assembly modes 2 and 3.
    Longer markers become useful as read quality improves.

  • New option --Reads.handleDuplicates to deal with reads with duplicate names.
    The default is to keep only one copy of each duplicate read, but other choices are also available.
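
The options listed above can be combined on one command line. The following is a minimal Python sketch, not part of Shasta, that builds (but does not run) an argument list for a mode 3 assembly and checks a requested marker length against the limits stated above. The helper names, the executable path, and the file name reads.fasta are hypothetical; --input is assumed to be the usual Shasta option for the reads file.

```python
import shlex

# Maximum marker length (--Kmers.k) per assembly mode, as stated in these notes.
MAX_MARKER_LENGTH = {0: 31, 2: 30, 3: 30}


def check_marker_length(k, assembly_mode):
    """Return k if it is positive and within the limit for assembly_mode.

    Hypothetical helper: only the upper limits come from the release notes.
    """
    if assembly_mode not in MAX_MARKER_LENGTH:
        raise ValueError(f"no marker length limit listed for mode {assembly_mode}")
    limit = MAX_MARKER_LENGTH[assembly_mode]
    if not 1 <= k <= limit:
        raise ValueError(f"k={k} is outside 1..{limit} for mode {assembly_mode}")
    return k


def mode3_command(reads, executable="./shasta-Linux-0.12.0", extra_options=None):
    """Build an argument list for a mode 3 assembly without running it.

    The executable path and reads file are assumptions made for illustration.
    """
    args = [executable, "--input", reads, "--config", "Nanopore-ncm23-May2024"]
    # Append any additional "--Section.option value" pairs in the given order.
    for name, value in (extra_options or {}).items():
        args += [name, str(value)]
    return args


if __name__ == "__main__":
    # Plain mode 3 invocation using the released configuration.
    print(shlex.join(mode3_command("reads.fasta")))

    # Overriding the marker length after checking it against the mode 3 limit.
    k = check_marker_length(30, assembly_mode=3)
    print(shlex.join(mode3_command("reads.fasta", extra_options={"--Kmers.k": k})))
```

Because Nanopore-ncm23-May2024 already uses alignment method 5 and automatic bucket sizes, those options do not need to be repeated when that configuration is used. With a different configuration they could be passed through extra_options, for example {"--Align.alignMethod": 5, "--MinHash.minBucketSize": 0, "--MinHash.maxBucketSize": 0}.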

Platforms

Linux

  • The shasta-Linux-0.12.0 executable will run on most current 64-bit Linux systems that use kernel version 3.2.0 or later. This includes all Ubuntu versions starting at 12.04 plus CentOS 7 and 8. It is statically linked and has no dependencies, so it can be used directly without installation.

  • The release includes tar file shasta-Ubuntu-22.04-0.12.0.tar which is a complete Shasta build on Ubuntu 22.04. It will not be needed by most users.

macOS

As announced with the release of Shasta 0.10.0, macOS versions of Shasta are no longer provided.

Windows

As in previous releases, the Linux executable shasta-Linux-0.12.0 can be used on Windows under Windows Subsystem for Linux (WSL).

Linux ARM

The ARM executable, shasta-Linux-ARM-0.12.0, can be used on 64-bit ARM version 8 platforms. It is known to work at least in the following environments:

  • Graviton, Graviton2, Graviton3 processors on AWS EC2 instances.
  • Raspberry Pi Model 4 or 5 running a 64-bit Debian-derived Linux distribution. This includes Ubuntu.

It will not work on macOS systems with ARM processors.

It is statically linked and has no dependencies, so it can be used directly without installation.
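
The platform notes above can be summarized as a small selection rule. The following Python sketch is an illustration only, not something shipped with Shasta: the function name is hypothetical, and it assumes that the shasta-Linux executable targets x86-64 and that the Python platform module reports x86_64 or aarch64 on the respective systems. Under WSL the reported system is Linux, so the Linux executable is selected there as well.

```python
import platform

RELEASE = "0.12.0"


def shasta_executable(system, machine):
    """Return the name of the release executable for a platform.

    system and machine are the values of platform.system() and
    platform.machine(). Raises RuntimeError where no executable is provided.
    """
    if system != "Linux":
        # macOS builds are no longer provided; Windows users run under WSL,
        # which reports itself as Linux.
        raise RuntimeError(f"no Shasta {RELEASE} executable for {system}")
    if machine in ("x86_64", "amd64"):
        return f"shasta-Linux-{RELEASE}"
    if machine in ("aarch64", "arm64"):
        return f"shasta-Linux-ARM-{RELEASE}"
    raise RuntimeError(f"no Shasta {RELEASE} executable for Linux on {machine}")


if __name__ == "__main__":
    try:
        print(shasta_executable(platform.system(), platform.machine()))
    except RuntimeError as error:
        print(error)
```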

Compatibility

This release is not compatible with previous releases. There were incompatible changes in some command line option names, the binary formats used, and the Python API. You cannot use release 0.12.0 for postprocessing of an assembly done using a previous release.