alex-hh/deep-protein-generation

Generating novel protein variants with variational autoencoders

This code provides implementations of variational autoencoder models designed to work with aligned and unaligned protein sequence data as described in the manuscript Generating novel protein variants with variational autoencoders.

Dependencies

The code requires Python 3. The variational autoencoder models were implemented in Keras (2.1.2) using the TensorFlow backend (TensorFlow 1.0.0). The full list of Python dependencies is in requirements.txt.

Individual models were trained on a single Tesla K80 GPU with CUDA 8.0.0, cuDNN v5 and Python 3.6.0.
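
Because the models depend on specific Keras and TensorFlow versions, it can help to compare an environment against the pinned versions before training. The following is a minimal sketch, not part of the repository: parse_pins and find_mismatches are hypothetical helpers, and the sketch assumes requirements.txt uses name==version pins. The sample text only repeats the two versions named above, not the actual file contents.

```python
def parse_pins(requirements_text):
    """Collect name==version pins from requirements-style text (assumed format)."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # skip blank lines and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins, installed):
    """Return {name: (wanted, found)} for every pin that `installed` does not match.

    `installed` maps package name to version string; a missing package
    is reported with found=None.
    """
    mismatches = {}
    for name, wanted in pins.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


# Illustrative input only: the two versions quoted in this README.
example_pins = parse_pins("keras==2.1.2\ntensorflow==1.0.0\n")
print(find_mismatches(example_pins, {"keras": "2.1.2", "tensorflow": "1.4.0"}))
```

The installed mapping is passed in explicitly so that the comparison stays independent of how versions are looked up in a given Python release.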

Installation

To run the code locally, first clone the repository, then install all dependencies (pip install -r requirements.txt).

Training models

To train a model, run the corresponding script. Training logs are written to output/logs, and weights are saved to output/weights at the end of training.

python scripts/train_msa.py

or

python scripts/train_raw.py

For the latter we recommend using a GPU; the former can run in a few hours on a standard CPU.
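
Since weights are only saved at the end of training, a quick way to confirm that a run finished is to look for a new file in output/weights. The sketch below is an illustration rather than repository code: latest_weights is a hypothetical helper, and the .h5 suffix is an assumption based on the weights file name used in the generation example.

```python
import os


def latest_weights(weights_dir="output/weights", suffix=".h5"):
    """Return the most recently modified weights file in weights_dir, or None."""
    if not os.path.isdir(weights_dir):
        return None  # training has not created the directory yet
    candidates = [
        os.path.join(weights_dir, name)
        for name in os.listdir(weights_dir)
        if name.endswith(suffix)
    ]
    if not candidates:
        return None  # no weights saved, e.g. training was interrupted
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    path = latest_weights()
    print(path if path else "no weights found in output/weights")
```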

Generating sequences (demo)

To generate sequences by sampling from the prior, run scripts/generate_from_prior.py, passing the name of the weights file and adding the --unaligned flag if using an ARVAE model. Generated sequences are written to a new FASTA file in output/generated_sequences/.

python scripts/generate_from_prior.py data/weights/msavae.h5
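
Conceptually, sampling from the prior means drawing a latent vector, decoding it into per-position scores over the amino-acid alphabet, and writing the resulting sequences as FASTA records. The sketch below illustrates that flow with the standard library only. It rests on several assumptions: a standard normal prior (the usual VAE choice), argmax decoding, and gap removal from aligned outputs. toy_decoder is a hypothetical stand-in for a trained decoder and is not the repository's model; the actual script may differ on all of these points.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = AMINO_ACIDS + "-"  # "-" is the gap symbol of aligned sequences


def sample_prior(latent_dim, rng):
    """Draw one latent vector z ~ N(0, I) (assumed prior)."""
    return [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]


def toy_decoder(z, seq_len):
    """Hypothetical decoder: one score per alphabet symbol at each position."""
    return [
        [
            sum(zi * math.cos((i + 1) * (pos + 1) * (k + 1)) for i, zi in enumerate(z))
            for k in range(len(ALPHABET))
        ]
        for pos in range(seq_len)
    ]


def decode_sequence(scores):
    """Take the highest-scoring symbol at each position, then drop gaps."""
    chars = [ALPHABET[max(range(len(row)), key=row.__getitem__)] for row in scores]
    return "".join(chars).replace("-", "")


def to_fasta(seqs, prefix="sample", width=60):
    """Format sequences as FASTA text with fixed-width sequence lines."""
    lines = []
    for i, seq in enumerate(seqs):
        lines.append(">{}_{}".format(prefix, i))
        lines.extend(seq[j:j + width] for j in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def generate(n, latent_dim=8, seq_len=30, seed=0):
    """Generate n sequences from independent prior samples."""
    rng = random.Random(seed)
    return [
        decode_sequence(toy_decoder(sample_prior(latent_dim, rng), seq_len))
        for _ in range(n)
    ]


if __name__ == "__main__":
    print(to_fasta(generate(3)), end="")
```

With a trained model, toy_decoder would be replaced by the decoder loaded from the weights file; for an unaligned model, the gap-removal step would not apply in the same way.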
