RELEASE-NOTES

RepeatModeler 2.0.6
===================
  - Updated to RepeatScout 1.0.7 which has the capability of reporting the
    extended coordinates of the seeds.  This reduces the complexity searching
    for examplars using the consensus sequence.
  - Reduced the minimum family instances for RepeatScout families.
  - Fixed a bug that caused RepeatClassifier to run with only one thread in the
    last release.
  - Fixed a bug that caused some RepeatScout derived exemplars to be reported 
    on the wrong strand.
  - Isolate long sequence identifiers from LTR_retriever which will alter 
    names longer than 13 characters.
  - Fixes for LTR_retriever changes in TRF parameter.
  - MultAln - new API for calculating Kimura divergence
            - new API for slicing operations on MSA
            - switch to Smitten V2 for default stk output
            - improvements to the Linup reporting format
  - Refiner reorganization for future extension algorithm

RepeatModeler 2.0.5
===================
  - Fixed a bug that caused failures in "absolute" reproducibility.  Prior
    to this release use of the "-srand" would not gaurantee that the ouputs
    consensi.fa.classified and families-classified.stk were exactly the same
    in sequence and sequence order.  It did gaurantee that the same samples
    were drawn from the genome, and that equivalent scoring families were
    derived at at each step.  In this release secondary sorts were added 
    to gaurantee a fixed sort order among equally scoring results, generating
    exactly the same output files each time the random number generator seed
    is used.  NOTE: This change only applies to results generated with 
    this version and future releases.
  - Added "-long" option to faToTwoBit to support larger genomes.
    
RepeatModeler 2.0.4
====================
  - Improved the gathering of RepeatScout examplars for building
    seed alignments.
  - Parallelized and improved the masking between rounds for faster
    runs and fewer redicoveries.
  - The 'pa' (parallel batches) option has been replaced with a new
    'threads' option which maps directly to the maximum number of
    threads the program will attempt to use.
  - Takes advantage of RepeatMasker 4.1.4 and RMBlast 2.13.0 parallel
    query feature.
  - Larger default sample sizes are now possible with speed improvements.
    The original sampling strategy can be selected with the new 'quick'
    option.
  - ABBlast is no longer supported.
  - Fixed a few visual artifacts in the viewMSA.pl html 
    visualization.

RepeatModeler 2.0.3
===================
  - The program now generates a logfile in the working directory named
    <seqfile>-rmod.log.  This file contains the random seed number used
    and some high level stats on the run for use with reporting problems
    with the program. 
  - Fixed a problem with the orientation/coordinates provided in the
    Stockholm output format that affected a subset of the sequences.
  - Fixed a bug affecting the trim functions of the Linup tool.

RepeatModeler 2.0.2
===================
  - First release of a set of manual curation tools for use with
    de-novo generated TE libraries.
  - Added generateSeedAlignments.pl to generate Dfam compatible 
    seed alignments given a consensus based TE library and 
    RepeatMasker output.
  - Fixed bug in N50 calculation.
  - Fixed Ruzzo-Tompa maximal scoring subsequences implementation.
  - lots of small bugfixes.

RepeatModeler 2.0.1
===================
  - Work around a bug introduced in the new NCBI Blast 2.10.0 with 
    version 5 databases.
  - RepeatScout in some rare cases will generate models for very 
    long-period satellites. This can cause Refiner to go crazy creating 
    tons of off-diagonal alignments. This version filters out 
    these rare cases.
  - This version now prints out the version of each dependency in the log.

RepeatModeler 2.0
=================
  - Many RepeatModeler seed alignments and consensus sequences contained 
    stretch of N’s due masking performed by TRF.  Instead of using the masked 
    sequences in the final seed alignment the original genomic sequences 
    are used, preserving low-complexity regions in the final consensus called.
  - The batching system was refactored to improve handling of short input 
    sequences often found in early assemblies or low-coverage genome projects.
  - The interface to TRF was inefficient and a major bottleneck in the 
    RepeatModeler pipeline.  The simple/tandem repeat masking step was 
    refactored to improve performance.
  - Added code to remove excess temporary files from the RM_ directory
    after they are no longer needed.
  - Container releases ( Docker and Singularity )
  - Minor Bugfixes:
      - Stockholm sequence coordinates did not consistently show
        strandedness using coordinate reversal.
      - Temporary file generation in BuildDatabase did not use
        a unique filename.  This could cause race conditions in
        environments where two BuildDatabase processes ran in the
        same directory at the same time.
  - Classification improvements:
      o Fixed a bug that caused some TE families to be labeled with the 
        RepeatMasker internal-use label “buffer”.
      o Fixed problem with TRF masking the input sequences that caused some 
        low-complexity sequences incorrectly match TE library sequences.
      o Fixed problem with conflict resolution that often let a lower scoring 
        match define the class for the family.

RepeatModeler 1.0.11
====================
  - Bugfix release. Avoid bailing if Refiner could not produce a consensus 
    from the family sequence set.

RepeatModeler 1.0.10
====================
  - Bugfix release. Fixes a bug that will cause RepeatModeler to exit 
    if no new repeats are detected after round-2. This may happen if the
    sample size is small enough or contains a repeat-poor set of sequences
    and may not reflect the chance that repeats will be found in later rounds.
    Typically new repeats are found in all rounds of RepeatModeler.

RepeatModeler 1.0.9
===================
  - RepeatModeler employs a genome sampling approach that is based
    on a random number generator. In this release of RepeatModeler
    we print out the random number generator seed at the start of 
    a run.  This number can be used with the "-srand ####" flag in future
    runs to exactly reproduce the samples taken from a given database.

  - The final output files are now placed in the same directory as
    the input database.  

  - An additional output file is now generated containing the seed 
    alignment for each discovered family.  This alignment is the source
    of the final consnesus and is stored in a Dfam compatible Stockholm
    file.  The new output files are named <database_name>-families.fa and
    <database_name>-families.stk.
    
  - Support for Dfam_consensus has been built into this release. Two
    utilities dfamConsensusTool.pl and renameIds.pl can be found in the
    RepeatModeler util/ directory.  The dfamConsensusTool script enables
    one to upload curated seed alignments to the open Dfam_consensus 
    database from the command line.  The renameIds script simplifies
    the process of coming up with unique identifiers for a set of 
    RepeatModeler generated families given a naming template.

RepeatModeler 1.0.8
===================
  - New "-recoverDir <dir>" option.  This allows RepeatModeler to 
        restart where it left off after a failure occurs.  
  - BLAST searches can now be run in parallel on a shared memory 
        multiprocessor machine.  The new "-pa #" option is used to
    specify how many parallel searches to start up at any one time.

RepeatModeler 1.0.7
===================
  - Added support for new RMBlastn version 2.2.27
      - TRFMask updated to work with the new RepeatMasker 4.0 release.
          - New viewMSA.pl utility