Identifying Significant Correlations between Histone Modifications and Ultraconserved Genomic Regions
By investigating the intersection of five differentiated cell types, namely muscle cells (muschlehsmm), umbilical cells (umbilicalhuvec), brain stem cells (stemh1), lung cells (lungnhlf), and arbitrary epidermal cells (skinnhek), on chromosome 17 for mice, canines, and humans, we derive the statistical correlation across that intersection using four evaluative measures: Uniform Random Permutation, Synthetic Data Creation, K-Clustering, and Simple Counting. As a biological confirmation of that statistical correlation, this paper then explores how the identified correlations relate to the correlation yielded by the intersection of the H3K27ac histone modification and the H3K36me3 promoter binding site.
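To make two of the four measures concrete, the following is a minimal sketch of Simple Counting combined with a Uniform Random Permutation test on genomic intervals. It is not the code used for this paper: the function names (count_overlaps, permutation_p_value), the half-open (start, end) interval representation, and the choice to shuffle regions uniformly along a single chromosome are all assumptions made for illustration.

```python
import random

def count_overlaps(regions_a, regions_b):
    """Simple counting: number of regions in A that overlap any region in B.

    Regions are half-open (start, end) tuples on the same chromosome.
    """
    return sum(
        any(a_start < b_end and b_start < a_end for b_start, b_end in regions_b)
        for a_start, a_end in regions_a
    )

def permutation_p_value(regions_a, regions_b, chrom_length, n_perm=1000, seed=0):
    """Uniform random permutation test (illustrative assumption, not the paper's code).

    Each region in A is moved to a uniformly random position on the chromosome,
    keeping its length, and the overlap count is recomputed. The p-value is the
    fraction of shuffles with at least as many overlaps as observed.
    """
    rng = random.Random(seed)
    observed = count_overlaps(regions_a, regions_b)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for start, end in regions_a:
            length = end - start
            new_start = rng.randint(0, chrom_length - length)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(shuffled, regions_b) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (hypothetical coordinates, not taken from the real datasets).
histone_peaks = [(100, 200), (1000, 1100), (5000, 5200)]
ultraconserved = [(150, 160), (1050, 1060), (5100, 5110)]
observed, p_value = permutation_p_value(ultraconserved, histone_peaks, 1_000_000)
```

With real data, regions_a would hold the ultraconserved elements and regions_b the histone modification peaks for one cell type; a small p-value indicates more overlap than uniform placement would explain.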
Much of the data was graciously provided by the UCSC Genome Browser. To replicate this paper, please download the H3K27ac dataset and its intersections with coding exons, along with the consindelsHgMmCanFam.bed files. For the latter dataset, please refer to: https://academic.oup.com/mbe/article/29/7/1757/1069045.
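As a starting point for working with these files, here is a small sketch of reading BED-formatted intervals restricted to chromosome 17. The helper name read_bed is hypothetical, and the sketch assumes standard BED layout (chromosome, start, end in the first three whitespace-separated columns); the actual column layout of the downloaded files should be checked before relying on it.

```python
def read_bed(lines, chrom="chr17"):
    """Parse BED-style lines into sorted (start, end) tuples for one chromosome.

    Assumes the first three columns are chromosome, start, end.
    Header lines (track, browser, or # comments) and blank lines are skipped.
    """
    regions = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if fields[0] != chrom:
            continue
        regions.append((int(fields[1]), int(fields[2])))
    return sorted(regions)

# Toy input (hypothetical coordinates). With a real file one would write:
#   with open("consindelsHgMmCanFam.bed") as handle:
#       regions = read_bed(handle)
example_lines = [
    "track name=example",
    "chr17\t2000\t2100\telementB",
    "chr1\t50\t150\telementC",
    "chr17\t500\t650\telementA",
]
regions = read_bed(example_lines)
```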
Below are illustrations of the results of the statistical investigation. Moreover, the clustering, uniform random permutations, and simple counting all point to a strong correlation between histone modifications and ultraconserved genomic regions. We speculate that these regions are conserved because of their evolutionary importance.