Code for identifying signatures of two-speed genomes
- GFF-upstream-downstream-distances.pl (outputs a tab-delimited file of intergenic distances)
- Intergenic_distances_to_median_quadrants.pl (outputs a tab-delimited file of gene locations and quadrants)
- Markov_chain_from_quadrant_file.pl (identifies significantly long runs of consecutive genes in a given quadrant)
Cumulative_probability_of_consecutive_genes is the probability of finding a run of that many consecutive genes in the given quadrant.
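
To give a feel for how a run probability of this kind can be computed, here is a minimal Python sketch. It is not taken from Markov_chain_from_quadrant_file.pl and may differ from what that script does. It assumes a first-order Markov chain over the ordered quadrant labels of one contig, estimates the probability of staying in the target quadrant from the data, and scores a run of k genes as that probability raised to the power k - 1. The function names and the input layout are hypothetical.

```python
from itertools import groupby


def self_transition_probability(labels, target):
    """Estimate P(next gene is in `target` | current gene is in `target`)."""
    from_target = 0
    stay = 0
    for current, nxt in zip(labels, labels[1:]):
        if current == target:
            from_target += 1
            if nxt == target:
                stay += 1
    # Assumption: with no observed transitions out of `target`, return 0.0.
    return stay / from_target if from_target else 0.0


def run_probabilities(labels, target):
    """Return (start_index, run_length, probability) for each run of `target`.

    Assumption: the probability of a run of length k, given that it has
    started, is p_stay ** (k - 1) under a first-order Markov chain.
    """
    p_stay = self_transition_probability(labels, target)
    results = []
    index = 0
    for label, group in groupby(labels):
        length = len(list(group))
        if label == target:
            results.append((index, length, p_stay ** (length - 1)))
        index += length
    return results
```

Calling `run_probabilities(labels, 1)` on a list of quadrant labels in gene order returns one tuple per run of Q1 genes; longer runs get smaller probabilities whenever the estimated self-transition probability is below 1.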
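Quadrants are defined on a plot of log10(downstream length) (x-axis) against log10(upstream length) (y-axis), split at the median of each axis: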
Q1 (upper left) | Q2 (upper right)
Q4 (lower left) | Q3 (lower right)
Q1 = log10(downstream length) < median && log10(upstream length) > median
Q2 = log10(downstream length) > median && log10(upstream length) > median
Q3 = log10(downstream length) > median && log10(upstream length) < median
Q4 = log10(downstream length) < median && log10(upstream length) < median
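
The quadrant rules above can be sketched in a few lines of Python. This is an illustration only, not the code in Intergenic_distances_to_median_quadrants.pl. The function name and the (gene_id, upstream_length, downstream_length) input layout are hypothetical, lengths are assumed to be positive, and values exactly equal to a median are treated as "below" it, which the rules above leave unspecified.

```python
import math
from statistics import median


def assign_quadrants(genes):
    """Map each gene to a quadrant (1-4) from its intergenic distances.

    `genes` is a list of (gene_id, upstream_length, downstream_length)
    tuples with positive lengths (hypothetical input layout).
    """
    log_up = [math.log10(up) for _, up, _ in genes]
    log_down = [math.log10(down) for _, _, down in genes]
    med_up = median(log_up)
    med_down = median(log_down)

    quadrants = {}
    for (gene_id, _, _), up, down in zip(genes, log_up, log_down):
        # Assumption: ties with the median count as "not above".
        above_up = up > med_up
        above_down = down > med_down
        if above_up and not above_down:
            quadrant = 1  # upper left
        elif above_up and above_down:
            quadrant = 2  # upper right
        elif above_down:
            quadrant = 3  # lower right
        else:
            quadrant = 4  # lower left
        quadrants[gene_id] = quadrant
    return quadrants
```

Because the medians are taken over all genes passed in, the quadrant of a gene depends on the whole gene set and not only on its own distances.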