Version 0.3

jsh58 released this 04 Oct 14:23
  • New default peak-calling method: area under the curve (AUC). For a peak to be called, the total significance of the region must exceed a minimum value (-a <float>, default 20.0).

    • The total significance is calculated as the sum of the -log(q) values above the -q threshold over the length of the region (i.e. the area under the -log(q) "curve"). If a -p threshold is specified, the area under the -log(p) curve is calculated.
    • The maximum gap parameter (-g <int>) still allows multiple regions to be linked.
    • No minimum length is required for a peak to be called.
    • The AUC method can be overridden by specifying -l <int>, in which case peak-calling reverts to the previous method, with the given value as the minimum peak length.
  • Option to provide a BED file of genomic regions to exclude from analysis (-E <file>).

    • The regions will affect peak calls, such that no peak may extend into or around an excluded region.
    • The regions' lengths will be subtracted from the genome length calculated by the program.
    • In the output log files, excluded regions will have treatment/control pileup values of 0.0 and p-/q-values of NA.
    • Multiple BED files can be specified, comma-separated (or space-separated, in quotes).
  • Accessory script findNs.py to produce a BED file of 'N' homopolymers from a fasta file (e.g. a reference genome). The output can (and should) be given to Genrich via -E (above).

  • The option to keep unpaired alignments, with lengths changed to a given value, is now -w <int> (formerly -a <int>), since -a now sets the minimum AUC.
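
The AUC peak-calling rule described in the first item above can be illustrated with a minimal Python sketch. This is not Genrich's actual code: the function name call_peaks_auc and its arguments are hypothetical, the input is assumed to be one -log(q) value per base, and two readings of the notes are assumed: the full -log(q) value of every base above the threshold is added to the area (rather than only the excess over the threshold), and regions linked across a gap of at most max_gap bases pool their areas.

```python
def call_peaks_auc(neg_log_q, threshold, min_auc=20.0, max_gap=0):
    """Illustrative AUC peak caller (hypothetical helper, not Genrich code).

    neg_log_q: per-base -log(q) values (assumed input layout).
    threshold: -log(q) value a base must exceed to count as significant.
    min_auc:   minimum total significance for a peak (cf. -a, default 20.0).
    max_gap:   longest run of non-significant bases that may be bridged (cf. -g).
    Returns a list of (start, end, auc) tuples with half-open coordinates.
    """
    peaks = []
    start = None  # start of the current candidate peak
    end = None    # one past the last significant base seen
    auc = 0.0     # running area for the current candidate

    for pos, value in enumerate(neg_log_q):
        if value <= threshold:
            continue  # non-significant base: contributes no area
        if start is not None and pos - end > max_gap:
            # Gap is too long to bridge: close the current candidate.
            if auc > min_auc:
                peaks.append((start, end, auc))
            start = None
        if start is None:
            start, auc = pos, 0.0
        auc += value  # assumption: add the full value, not value - threshold
        end = pos + 1

    # Close the final candidate, if any.
    if start is not None and auc > min_auc:
        peaks.append((start, end, auc))
    return peaks


values = [0, 5, 5, 5, 5, 5, 0, 0, 3, 0]
print(call_peaks_auc(values, threshold=2.0, max_gap=0))  # [(1, 6, 25.0)]
print(call_peaks_auc(values, threshold=2.0, max_gap=2))  # [(1, 9, 28.0)]
```

Note that no minimum length appears anywhere in the sketch: a short region can still be called if its values are high enough to push the area past min_auc, which matches the statement that no minimum length is required.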