GlobalEpistasis

install and load data

Install julia (see julialang.org, on macos use homebrew).
Learn some julia (e.g. wikibooks)
Update packages Pkg.update()
add package Pkg.clone("https://github.com/jotwin/GlobalEpistasis.jl"), note this will add NLopt, CSV, and DataFrames packages
I recommend IJulia for an interactive notebook interface.

Load data into julia, CSV doc. Test data is a subset of Wu et al. eLife 2016

using CSV
using DataFrames
dataf = CSV.read("test/wu4sites_test.txt")

prepare data

using GlobalEpistasis
d = prepdata(dataf, :mut, :sequence, "VDGV", :f, cname = :concentration, vname = :v)

arguments are

dataframe of your data
identifier of the column with sequence data
type of sequence data is either :sequence the sequence, or :listofmuts a list of substitutions where each substitution is a string with wild-type amino acid, position, and mutated amino acid, e.g. "V3A". Multiple substitutions per sequence can be concatenated together e.g. "V3A-G2B".
wild-type sequence, or if your data is in list of mutants format its the wild-type identifier, e.g. "WT"
name of the column with measured phenotype

optional keyword arguments are

vname: name of the column of variance estimates
cname: name of the column of conditions, which can be strings or numbers.
delim: delimiter for multiple substitutions in listofmuts format. Default '-'
condition_type: whether the condition variable is treated as a continuous or categorical predictor (:categorical or :continuous)

non-epistatic model

@time mlin = nonepistatic_model(d)

output is a dictionary with keys:

:beta is a data frame of additive effects
:r2 r-squared
:sigma2 HOC epistasis variance
:prediction data frame of measured phenotype (y), inferred measured phenotype (yhat), and inferred additive trait (phi, rescaled), and (optional) conditions
:rmse root mean squared error
:ll log-likelihood

global epistasis model

@time m = fit(mlin, nk = 4, tol = 1e-14)

nk is number of knots in the I-spline (default 4)
tol is the tolerance of the optimization (it can take a long time or fail if its too small)

output is a dictionary of keys similar to above

writing output

write csv files

CSV.write("betas.txt", m[:beta])
CSV.write("prediction.txt", m[:prediction])

modifying initial conditions

the fit function first argument is an initial model m with entries m[:a] and m[:b]. These can be modified to change the intial parameters. Also you can change the non-epistatic model e.g. mlin[:a] = [0, 0, 0, 0, 0]

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
GB1		GB1
GFP		GFP
betalactamase		betalactamase
src		src
test		test
.gitignore		.gitignore
LICENSE.md		LICENSE.md
README.md		README.md
REQUIRE		REQUIRE

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GlobalEpistasis

install and load data

prepare data

non-epistatic model

global epistasis model

writing output

modifying initial conditions

About

Releases

Packages

Languages

License

jotwin/GlobalEpistasis.jl

Folders and files

Latest commit

History

Repository files navigation

GlobalEpistasis

install and load data

prepare data

non-epistatic model

global epistasis model

writing output

modifying initial conditions

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages