Here we provide PUFFIN models trained on peptide-MHC binding affinity datasets from NetMHCpan-3.0 (for class I MHC) and NetMHCIIpan-3.2 (for class II MHC). The five folds are combined and split into training and validation sets.
Note that because we no longer need to hold out data as a test set here, this training setup differs from the one used in the paper (Tables 1 and 2), where the training/test split provided by Bhattacharya et al. was used for class I MHC and the 5-fold cross-validation split from NetMHCIIpan-3.2 was used for class II MHC (the performance on each fold was evaluated by a model trained on the other four folds).
Download the trained model from here.
We provide a Conda environment that includes all the Python packages required by PUFFIN. Build and activate it by:
conda env create -f environment.yml
source activate puffin
To deactivate this environment:
source deactivate
Save all MHC-peptide pairs to be evaluated in a tab-delimited file with three columns: the peptide sequence, the observed binding affinity (use any placeholder number when it is not available), and the MHC allele. The supported MHC allele names are listed in the first column of this file. (class I example, class II example)
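For illustration, a class I input file might contain lines like the ones below (columns are tab-separated; the peptides, affinity values, and allele names here are made up, and the allele strings must match the names listed in the supported-allele file):

SIINFEKL	50.0	HLA-A02:01
GILGFVFTL	25.0	HLA-A02:01
LLFGYPVYV	5000.0	HLA-B07:02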
Then preprocess the data by:
python preprocess.py -i DATAFILE -o OUTDIR -c CLASS
DATAFILE: the file that contains the MHC-peptide pairs
OUTDIR: the directory to save all the output
CLASS: "1" for class I and "2" for class II
Then run the prediction by:
python score.py -o OUTDIR -c CLASS -g GPU
OUTDIR: same as above
CLASS: same as above
GPU: a comma-delimited string that denotes the index(es) of the GPU(s) to run the models on (e.g. "0,1,2,3"). We recommend using multiple GPUs if possible to speed up the prediction.
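Continuing the same hypothetical example, scoring on a single GPU (index 0) would be:

python score.py -o results_class1 -c 1 -g 0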
The predictions are saved in $OUTDIR/PUFFIN.combined. It is a tab-delimited file with four columns: the predicted mean affinity, the epistemic uncertainty, the aleatoric uncertainty, and the binding likelihood at a 500 nM binding threshold.
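As a minimal sketch of how this output might be consumed in Python, the snippet below assumes PUFFIN.combined has no header row, that the four columns appear in the order listed above, and that the hypothetical results_class1 directory from the examples above was used as OUTDIR:

import csv

# Hypothetical path; replace with your own OUTDIR
path = "results_class1/PUFFIN.combined"

with open(path) as f:
    for row in csv.reader(f, delimiter="\t"):
        # Columns as described above: mean affinity, epistemic uncertainty,
        # aleatoric uncertainty, binding likelihood at the 500 nM threshold
        mean_affinity, epistemic, aleatoric, binder_likelihood = map(float, row[:4])
        # Arbitrary illustration: report rows whose predicted binding
        # likelihood exceeds 0.5
        if binder_likelihood > 0.5:
            print(mean_affinity, epistemic, aleatoric, binder_likelihood)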