GitHub - farnooshoa/PCA-ON-GENOMTYPE: Run PCA/TSNE on some population genotype data.

Genotype PCA Analysis

This project reads genotype data from a VCF file, converts it into a numeric matrix, runs Principal Component Analysis (PCA) on that matrix, and plots the results. The goal is to visualize genetic variation across samples.
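
The reading and processing step can be sketched as follows. This is a minimal illustration using only the standard library, not the code in `vcf_to_matrix.py`. The function names `genotype_to_count` and `read_vcf`, the sample names `S1` to `S3`, and the variant IDs are hypothetical. It assumes an uncompressed, tab-separated VCF whose sample columns start with the `GT` subfield, and it encodes each genotype as the number of non-reference alleles, with `None` for missing calls.

```python
import io

def genotype_to_count(gt):
    """Convert a sample field such as '0|1' or '1/1:35' to a non-reference allele count."""
    # GT is assumed to be the first colon-separated subfield of the sample column.
    alleles = gt.split(":")[0].replace("|", "/").split("/")
    if any(a == "." for a in alleles):
        return None  # missing call
    return sum(1 for a in alleles if a != "0")

def read_vcf(handle):
    """Return (sample_ids, variant_ids, matrix) with one matrix row per sample."""
    sample_ids, variant_ids, columns = [], [], []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample_ids = fields[9:]  # sample names follow the 9 fixed columns
            continue
        variant_ids.append(fields[2])  # the ID column
        columns.append([genotype_to_count(gt) for gt in fields[9:]])
    # Transpose from variants x samples to samples x variants.
    matrix = [list(row) for row in zip(*columns)]
    return sample_ids, variant_ids, matrix

# Toy input made up for this example (not taken from the repository's data).
demo_vcf = "\n".join([
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "22\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0|0\t0|1\t1|1",
    "22\t200\trs2\tC\tT\t.\tPASS\t.\tGT\t0|1\t0|0\t.|.",
])
samples, variants, matrix = read_vcf(io.StringIO(demo_vcf))
print(samples, variants, matrix)
```

Missing calls are left as `None` here on purpose: how to handle them (dropping the variant, imputing the column mean, and so on) is a choice that has to be made before PCA, and this README does not say which one the project uses.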

Features

VCF File Reading: Extracts genotype data, sample IDs, and variant IDs from a VCF file.
Panel File Reading: Reads sample IDs and population codes from a panel file.
Matrix Creation: Converts genotype data into a matrix format suitable for PCA.
PCA: Runs PCA on the genotype matrix to reduce its dimensionality so that genetic variation between samples can be visualized.
DataFrame Creation: Constructs a DataFrame with genotype data, variant IDs, and population codes.
CSV Export: Saves the resulting matrix to a CSV file.
Visualization: Plots the PCA results to visualize the distribution of genetic variations.
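
The panel-reading and PCA features listed above might look roughly like the sketch below. It is an illustration under stated assumptions rather than the repository's implementation: PCA is computed here with NumPy's SVD (the script itself may well use a different library), the helper names `read_panel` and `pca` are hypothetical, the panel is assumed to be whitespace-separated with the sample ID in the first column and the population code in the second, and the genotype matrix is assumed to have no missing values. The sample IDs and genotype counts are toy data.

```python
import numpy as np

def read_panel(lines):
    """Map sample ID -> population code from whitespace-separated panel lines."""
    populations = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[0] == "sample":
            continue  # skip blank lines and an optional header row
        populations[fields[0]] = fields[1]
    return populations

def pca(matrix, n_components=2):
    """Project samples (rows) onto the leading principal components via SVD."""
    X = np.asarray(matrix, dtype=float)
    X = X - X.mean(axis=0)  # centre each variant column
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    coords = U[:, :n_components] * S[:n_components]  # sample coordinates
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return coords, explained[:n_components]

# Toy data made up for this example: 4 samples x 3 variants.
sample_ids = ["S1", "S2", "S3", "S4"]
genotypes = [
    [0, 0, 2],
    [0, 1, 2],
    [2, 2, 0],
    [2, 1, 0],
]
panel = read_panel([
    "sample\tpop\tsuper_pop\tgender",
    "S1\tGBR\tEUR\tmale",
    "S2\tGBR\tEUR\tfemale",
    "S3\tYRI\tAFR\tmale",
    "S4\tYRI\tAFR\tfemale",
])

coords, explained = pca(genotypes)
for sample_id, (pc1, pc2) in zip(sample_ids, coords):
    print(sample_id, panel[sample_id], round(float(pc1), 3), round(float(pc2), 3))
```

The resulting `coords` array has one row per sample, so it can be passed to a scatter plot with points coloured by `panel[sample_id]`. The sign of each component is arbitrary, so only the relative positions of the samples are meaningful.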

Folders and files

Protein-Protein Inter.code-workspace
README.md
integrated_call_samples_v3.20130502.ALL.panel
matrix.csv
phase1_integrated_calls.20101123.ALL.panel
vcf_to_matrix.py
