GitHub - alex-hh/deep-protein-generation

Generating novel protein variants with variational autoencoders

This code provides implementations of variational autoencoder models designed to work with aligned and unaligned protein sequence data, as described in the manuscript "Generating novel protein variants with variational autoencoders".

Dependencies

The code requires Python 3. The variational autoencoder models were implemented in Keras (2.1.2) using the TensorFlow backend (TensorFlow 1.0.0). The full Python dependencies are listed in requirements.txt.

Individual models were trained on a single Tesla K80 GPU with CUDA 8.0.0, cuDNN v5 and Python 3.6.0.

Installation

To run the code locally, first clone the repository, then install all dependencies (pip install -r requirements.txt).

Training models

To train a model, run the corresponding script. Training logs will be written to output/logs, and weights will be saved to output/weights at the end of training.

python scripts/train_msa.py

or

python scripts/train_raw.py

For the latter we recommend using a GPU; the former can run in a few hours on a standard CPU.

Generating sequences (demo)

To generate sequences by sampling from the prior, run scripts/generate_from_prior.py, passing the name of the weights file and specifying the --unaligned flag if using an ARVAE model. Generated sequences will be written to a new FASTA file in output/generated_sequences/.

python scripts/generate_from_prior.py data/weights/msavae.h5
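
Once sequences have been generated, they can be inspected with a few lines of Python. The following is a minimal sketch rather than part of the repository: the helper name read_fasta is hypothetical, and it assumes the output is standard FASTA (header lines starting with ">", sequences possibly wrapped over several lines). The exact file name written by the script is not stated here, so the usage loop simply reads every file in the output directory.

```python
from pathlib import Path


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples.

    Hypothetical helper for illustration; not part of the repository.
    """
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None:
                # Sequence lines may wrap, so collect and join them.
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    # Assumption: generated files live in output/generated_sequences/.
    out_dir = Path("output/generated_sequences")
    for fasta_path in sorted(out_dir.glob("*")):
        if not fasta_path.is_file():
            continue
        for name, seq in read_fasta(fasta_path):
            # Print the header, the sequence length and a short prefix.
            print(name, len(seq), seq[:40])
```

Returning plain tuples keeps the sketch free of extra dependencies; a dedicated parser such as Biopython's SeqIO would be a reasonable alternative if it is already installed.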
