Major update
I recently identified a bug that dropped intact LTR elements whose two LTRs differ in length by more than 15 bp because of InDels. After manual checks, I determined that these are still high-quality intact elements, so they are now salvaged in the output. This marginally improves sensitivity, especially for genomes with limited LTR sequences (e.g., Arabidopsis, ~7%). The gain is smaller for genomes with larger amounts of LTRs, such as rice (~25%) and maize (~75%), because their intact elements were already abundant enough to construct a comprehensive library. However, the number of intact LTR elements can increase by 10-20% compared with the last version (v2.7), which has some positive effect on the calculation of LAI. Some benchmarking results:

| Arabidopsis (TAIR10) | v1.x | v2.0 | v2.8 |
|---|---|---|---|
| Sensitivity | 90.70% | 90.90% | 95.04% |
| Specificity | 99.00% | 99.00% | 98.88% |
| Accuracy | 98.50% | 98.50% | 98.64% |
| Precision | 86.60% | 86.50% | 84.99% |


| Rice (MSUv7) | v1.x | v2.0 | v2.5 | v2.8 |
|---|---|---|---|---|
| Sensitivity | 95.00% | 95.30% | 96.30% | 96.71% |
| Specificity | 95.00% | 94.60% | 94.00% | 93.87% |
| Accuracy | 95.00% | 94.80% | 94.50% | 94.54% |
| Precision | 85.40% | 84.50% | 83.10% | 83.09% |

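
For readers comparing the rows above, here is a minimal sketch of how these four metrics are conventionally derived from a confusion matrix. It assumes the standard textbook definitions; these notes do not state how the counts were obtained, so the function name `classification_metrics` and the example counts are hypothetical and are not taken from the benchmark.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Return the four metrics as fractions, using the standard definitions.

    tp/fp/tn/fn are true/false positive/negative counts (for example,
    base pairs annotated as LTR versus a reference annotation; this unit
    is an assumption, not something stated in the release notes).
    """
    return {
        "sensitivity": tp / (tp + fn),               # share of real LTR sequence recovered
        "specificity": tn / (tn + fp),               # share of non-LTR sequence left unannotated
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # share of all calls that are correct
        "precision": tp / (tp + fp),                 # share of LTR calls that are correct
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these definitions, recovering more true elements raises sensitivity, while any extra false positives that come along lower precision and specificity, which is consistent with the direction of the changes between versions in the tables.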
Minor updates
- Allow for mirrored candidates produced by LTRharvest
- Improve convert_ltrdetector.pl for the published version (v1.0) of LtrDetector (contributed by @baozg)
- Add a converter, convert_ltr_finder2.pl, to convert the LTR_FINDER -w 2 table format into the LTRharvest screen output format
- For LAI, allow the -all file to contain other TEs (i.e., whole-genome TE annotation)