Version 4.0.0

@MikeAxtell MikeAxtell released this 13 Mar 19:31
d9b09e4

ShortStack version 4 is a major update. The major changes are:

  • Completely re-written in Python 3.
  • Streamlined installation using a conda recipe hosted on bioconda.
  • All compute-intensive processes are now multi-threaded, so execution times are faster when the user specifies higher values of --threads.
  • Much more reliance on other tools (bedtools and cutadapt, for instance), so less re-inventing of wheels.
  • Output of hairpin structure visualizations using strucVis.
  • Output of genome-browser-ready quantitative coverage tracks of aligned small RNAs using ShortTracks.
  • MIRNA locus identification has been thoroughly changed to increase sensitivity while maintaining specificity.
  • MIRNA locus identification can now be guided by user-provided 'known RNAs'. In contrast, truly de novo annotation of MIRNA loci, where a locus does not match the sequence of any 'known RNA', is disabled by default. This change in philosophy acknowledges that, in most well-studied organisms, most high-confidence microRNA families are already known.
  • License changed from GPL3 to MIT.

Option changes:

  • Drop support for cram format (options --cram, --cramfile eliminated)
  • Drop support for colorspace (option --cquals eliminated)
  • Replace option --bowtie_cores with --threads
  • Eliminate option --bowtie_m. Now -k 50 is always used.
  • Eliminate option --ranmax. Multi-mapping reads (mmappers) will now always be placed (except in mode u).
  • Eliminate SAM tags XY:Z:O and XY:Z:M. Multi-mapping reads are no longer suppressed.
  • Add SAM tag XY:Z:H: highly repetitive read (50 or more hits; not all are known).
  • Add SAM tag YS:Z: small RNA size information.
  • Eliminate option --keep_quals. Quality values will always be stored in the bam file if input was fastq.
  • Modify option --locus so that it only accepts a single locus query.
  • Eliminate option --total_primaries. Instead, a fast hack is used to rapidly calculate this value.
  • Option --locifile now understands .bed and .gff3 formats, as well as the original simple tab-delimited format.
  • Add options --autotrim and --autotrim_key. These allow automatic detection of 3' adapters by tallying the most common sequence that occurs after a known, highly abundant small RNA (given by --autotrim_key).
  • Add option --knownRNAs. Provide a FASTA file of known mature small RNA sequences to search for and to nucleate searches for qualifying MIRNA loci.
  • Add option --dn_mirna. This option activates a de novo search for MIRNA loci independent of those that align to the 'known RNAs' provided by the user. By default, --dn_mirna is not active.
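
The idea behind --autotrim (tallying whatever most often follows a known, abundant small RNA) can be illustrated with a short sketch. This is not ShortStack's actual code. The function name guess_adapter, the adapter_len parameter, and the assumption that the key sits at the very start of the read are all hypothetical, chosen only to show the tallying step.

```python
from collections import Counter


def guess_adapter(reads, key, adapter_len=20):
    """Return the most common sequence found right after `key`, or None.

    Illustrative only: assumes the key small RNA starts at position 0 of
    the read, and that a fixed-length window (adapter_len, a hypothetical
    parameter) after it is enough to identify the 3' adapter.
    """
    # Accept the key as RNA or DNA; reads are assumed to be DNA letters.
    key = key.upper().replace("U", "T")
    tally = Counter()
    for read in reads:
        read = read.upper()
        if not read.startswith(key):
            continue
        tail = read[len(key):len(key) + adapter_len]
        # Skip reads too short to contain a full-length window.
        if len(tail) == adapter_len:
            tally[tail] += 1
    if not tally:
        return None
    return tally.most_common(1)[0][0]
```

Given untrimmed reads and a key sequence, guess_adapter(reads, key) returns the inferred adapter prefix, which could then be handed to a trimming tool such as cutadapt. Returning None when the key is never seen lets the caller fall back to a user-supplied adapter instead of trimming with a bad guess.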