Skip to content

Longitudinal Traits (Random Regression Model)

Jiayi Qu edited this page May 28, 2021 · 4 revisions

Data Description

A naive longitudinal dataset with each individual having records on 5 time points is simulated to demonstrate the use of JWAS for random regression model. 50 individuals are randomly selected from the example dataset and values in "y1" and "y2" are assumed to be the random regression coefficients of additive genetic effects for the individuals. A linear regression of normal time covariates (Tmat) is used to simulate the phenotypes of individuals at each of the 5 time points. A consistent noise is added to the phenotypes.

using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,StatsBase,LinearAlgebra,Kronecker 
pedfile    = dataset("pedigree.csv")
phenofile  = dataset("phenotypes.csv")
pedigree   = get_pedigree(pedfile,separator=",",header=true);
phenotypes = CSV.read(phenofile,DataFrame,delim = ',',header=true,missingstrings=["NA"])

# assume y1,y2 are linear readom regression coefficients for BV
phenotype = phenotypes[sample(1:size(phenotypes,1),50,replace=false),[:ID,:y1,:y2]]

# Generate longitudinal phenotypes given random regression coefficients
n = size(phenotype,1)
Idt = Matrix{Float64}(I, n, n)
nTime = 5
Tmat = [ones(nTime) collect(1:nTime)]
Z = Idt  Tmat
us = [[] for i in 1:n]
for i in 1:n
    us[i] = Matrix(phenotype)[i,2:end]
end
uhat = vcat(us...)
gBLUP = Z*uhat
gBLUP= convert(Array{Float64,1}, gBLUP)
phenotype = DataFrame(ID = repeat(phenotype[!,:ID], inner = nTime),
    time = repeat(collect(1:nTime),outer=n),Y = gBLUP)

# Add consistent noise
phenotype[!,:Y] = phenotype[!,:Y] + randn(size(phenotype,1))

# fixed linear regression covariates and random effects
phenotype[!,:x2] = phenotype[!,:time]
phenotype[!,:pe] = phenotype[!,:ID]

show(phenotype, allcols=true)

JWAS Analysis

# Step 1: Load packages
using JWAS,DataFrames,CSV,Statistics,JWAS.Datasets,StatsBase

# Step 2: Read genotype data 
genofile   = dataset("genotypes.csv")
genotypes  = get_genotypes(genofile,separator=',',method="BayesC");
# Step 3: Build Model Equations
model_equation  = "Y = intercept + x2 + pe + genotypes"
model = build_model(model_equation);

# Step 4: Set Factors or Covariates
set_covariate(model,"x2");

# Step 5: Set Random or Fixed Effects
set_random(model,"pe"); #i.i.d. permanent environmental effects to account for the covariance between observations from the same individual

# Step 6: Run Analysis
out=runMCMC(model,phenotypes,RRM=Tmat); # Tmat is the covariate function you want to use for the random regression model (e.g., Legendre polynomials)

# Step 7: Get estimated random regression coefficients
u1 = out["EBV_1"] # estimated first random regression coefficients for individuals
u2 = out["EBV_2"] # estimated second random regression coefficients for individuals
Clone this wiki locally