Recovering 10-20% more intact LTR elements

@oushujun oushujun released this 04 Dec 17:44
· 85 commits to master since this release

Major update

I recently identified a bug that dropped intact LTR elements whose two LTRs differ in length by more than 15 bp because of InDels. After manual checks, I determined that these are still high-quality intact elements, so they are now salvaged in the output. This marginally improves sensitivity, especially for genomes with limited LTR sequences (e.g., Arabidopsis, ~7%). The gain is smaller for genomes with decent amounts of LTRs, such as rice (~25%) and maize (~75%), because the intact elements already recovered there were abundant enough to construct a comprehensive library. Even so, the number of intact LTR elements could increase by 10-20% compared to the last version (v2.7), which has some positive effect on the calculation of LAI. Some benchmarking results:

| Arabidopsis (TAIR10) | v1.x | v2.0 | v2.8 |
| --- | --- | --- | --- |
| Sensitivity | 90.70% | 90.90% | 95.04% |
| Specificity | 99.00% | 99.00% | 98.88% |
| Accuracy | 98.50% | 98.50% | 98.64% |
| Precision | 86.60% | 86.50% | 84.99% |

| Rice (MSUv7) | v1.x | v2.0 | v2.5 | v2.8 |
| --- | --- | --- | --- | --- |
| Sensitivity | 95.00% | 95.30% | 96.30% | 96.71% |
| Specificity | 95.00% | 94.60% | 94.00% | 93.87% |
| Accuracy | 95.00% | 94.80% | 94.50% | 94.54% |
| Precision | 85.40% | 84.50% | 83.10% | 83.09% |
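
For readers unfamiliar with the four benchmark metrics, here is a minimal Python sketch of how they are conventionally derived from confusion-matrix counts. It assumes the standard textbook definitions; the release notes do not spell out the exact formulas or counting unit used for the benchmark. The function name `classification_metrics` and the example counts are hypothetical and are not taken from LTR_retriever or its results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the four metrics from confusion-matrix counts.

    Assumption: standard definitions. tp/fp/tn/fn could be counted in any
    unit (for example, base pairs of annotated sequence).
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or tn + fp == 0 or tp + fp == 0:
        raise ValueError("a metric denominator is zero")
    return {
        # Fraction of true LTR sequence that was annotated as LTR
        "sensitivity": tp / (tp + fn),
        # Fraction of non-LTR sequence that was correctly left unannotated
        "specificity": tn / (tn + fp),
        # Fraction of all sequence that was classified correctly
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        # Fraction of annotated sequence that is truly LTR
        "precision": tp / (tp + fp),
    }


# Hypothetical counts, for illustration only (not benchmark data)
metrics = classification_metrics(tp=90, fp=20, tn=880, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these definitions, recovering more true elements raises sensitivity, while any extra false positives lower specificity and precision. That is consistent with the direction of the changes between versions in the tables above.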

Minor updates

  1. Allow for mirrored candidates produced by LTRharvest
  2. Improve convert_ltrdetector.pl to support the published version (v1.0) of LtrDetector (contributed by @baozg)
  3. Add a converter, convert_ltr_finder2.pl, to convert the LTR_FINDER -w 2 table format into the LTRharvest screen output format
  4. For LAI, allow the -all file to contain other TEs (i.e., whole-genome TE annotation)