My final project for Advanced Bioinformatics (BIOI 500) compared three local ancestry estimation tools: LAMP-LD, RFMix, and ELAI. This repository builds on that class project and runs six populations with different ancestral backgrounds and proportions through these tools to compare their accuracy and resource usage. We convert genotype data into the input format each tool requires and streamline their usage. We will apply these local ancestry tools to real genotypic and transcriptomic data from the Multi-Ethnic Study of Atherosclerosis (MESA) and the Modeling the Epidemiologic Transition Study to improve gene expression prediction in diverse populations. Analyses performed include:
- Generating ancestral proportions from 1000G populations
- Simulating genotypes from 1000G CEU and YRI
- Also see adsim scripts and notes
Population | NAT | CEU | YRI |
---|---|---|---|
MXL | 60% | 40% | 0% |
PUR | 16% | 84% | 0% |
ACB | 3% | 10% | 87% |
ASW | 1% | 26% | 73% |
EVEN | 33% | 33% | 33% |
BRYC | 18% | 65% | 6% |
- Converting genotypes to each tool's input format and running the tool
- Measuring the accuracy and resource use of each tool
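
As a rough illustration of the accuracy step, the sketch below scores estimated local ancestry calls against the simulated truth. It assumes that both the truth and a tool's output have already been converted to one ancestry label per haplotype per SNP. That representation and the function names `per_site_accuracy` and `global_proportions` are hypothetical and are not the native output of LAMP-LD, RFMix, or ELAI, nor necessarily the metric used in this repository.

```python
from collections import Counter

# Ancestry labels used in the table above.
ANCESTRIES = ("NAT", "CEU", "YRI")


def per_site_accuracy(true_calls, est_calls):
    """Fraction of (haplotype, SNP) cells where the estimated label matches the truth.

    Both arguments are lists of haplotypes; each haplotype is a list of
    ancestry labels, one per SNP (an assumed, simplified representation).
    """
    if len(true_calls) != len(est_calls):
        raise ValueError("different number of haplotypes")
    matches = 0
    total = 0
    for true_hap, est_hap in zip(true_calls, est_calls):
        if len(true_hap) != len(est_hap):
            raise ValueError("different number of SNPs in a haplotype")
        matches += sum(t == e for t, e in zip(true_hap, est_hap))
        total += len(true_hap)
    if total == 0:
        raise ValueError("no sites to compare")
    return matches / total


def global_proportions(calls):
    """Genome-wide share of each ancestry label across all haplotypes."""
    counts = Counter(label for hap in calls for label in hap)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sites to summarize")
    return {ancestry: counts[ancestry] / total for ancestry in ANCESTRIES}


# Toy data: two haplotypes, four SNPs each (made up for illustration only).
true_calls = [["CEU", "CEU", "YRI", "YRI"], ["NAT", "NAT", "CEU", "CEU"]]
est_calls = [["CEU", "CEU", "CEU", "YRI"], ["NAT", "NAT", "CEU", "CEU"]]

print(per_site_accuracy(true_calls, est_calls))  # 7 of 8 sites agree: 0.875
print(global_proportions(true_calls))
```

Comparing `global_proportions` of the estimated calls with the target proportions in the table gives a quick sanity check before looking at per-site accuracy.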