ESM-Ezy

Dataset and checkpoint

To get the dataset and model checkpoint, please refer to .

Download the data.zip file and extract it to the data directory.

Download the ckpt.zip file and extract it to the ckpt directory.
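The two extraction steps above can also be scripted. The following is a minimal sketch, assuming data.zip and ckpt.zip have been downloaded into the repository root; the helper name extract_archive is hypothetical and not part of ESM-Ezy. Whether the archives already contain a top-level data/ or ckpt/ folder is not stated here, so inspect the returned entry names and adjust the destination if the files end up nested one level too deep.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted archive entry names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create data/ or ckpt/ if missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Assumption: both archives sit next to this script in the repository root.
    for name in ("data", "ckpt"):
        if Path(f"{name}.zip").is_file():
            entries = extract_archive(f"{name}.zip", name)
            print(f"{name}.zip: extracted {len(entries)} entries into {name}/")
        else:
            print(f"{name}.zip: not found, skipping")
```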

Training

To train ESM-Ezy, follow the steps below:

Clone the repository:

git clone https://github.com/westlake-repl/ESM-Ezy.git

Install the required packages:

conda env create -f environment.yml

Download the pre-trained ESM-1b model:

wget https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S.pt -O ckpt/esm1b_t33_650M_UR50S.pt
wget https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S-contact-regression.pt -O ckpt/esm1b_t33_650M_UR50S-contact-regression.pt
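Because wget -O writes to a fixed output path, an interrupted or failed download can leave an empty or partial file behind. A small check such as the sketch below can confirm that both files are in place before training. The helper name find_missing is hypothetical; the file names are taken from the wget commands above. The check only tests for existence and non-zero size, so it will not detect a truncated download.

```python
from pathlib import Path

# File names taken from the wget commands above.
EXPECTED_CHECKPOINTS = (
    "ckpt/esm1b_t33_650M_UR50S.pt",
    "ckpt/esm1b_t33_650M_UR50S-contact-regression.pt",
)


def find_missing(paths, root="."):
    """Return the entries of paths that are absent or empty under root."""
    missing = []
    for rel in paths:
        path = Path(root) / rel
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = find_missing(EXPECTED_CHECKPOINTS)
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("Both ESM-1b checkpoint files are present.")
```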

Train ESM-Ezy:

python scripts/train.py --train_positive_data data/train/train_positive.fa --train_negative_data data/train/train_negative.fa --test_positive_data data/train/test_positive.fa --test_negative_data data/train/test_negative.fa --model_path ckpt/esm1b_t33_650M_UR50S.pt
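Before launching a training run, it can help to confirm that the four FASTA files passed to scripts/train.py exist and contain sequences. The sketch below counts header lines in each file. The function name count_fasta_records is hypothetical, and the paths are copied from the command above; the sketch does not reflect how train.py itself parses its inputs.

```python
from pathlib import Path

# Paths copied from the training command above.
TRAINING_FILES = (
    "data/train/train_positive.fa",
    "data/train/train_negative.fa",
    "data/train/test_positive.fa",
    "data/train/test_negative.fa",
)


def count_fasta_records(path):
    """Count FASTA records by counting lines that start with '>'."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


if __name__ == "__main__":
    for rel in TRAINING_FILES:
        if Path(rel).is_file():
            print(f"{rel}: {count_fasta_records(rel)} sequences")
        else:
            print(f"{rel}: not found")
```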

Inference

Run inference on the UniRef50 database:

python scripts/inference.py --model_path ckpt/esm1b_t33_650M_UR50S.pt --checkpoint_path ckpt/model_laccase.pkl --inference_data data/inference/uniref50.fasta --output_path data/retrieval
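ESM-1b uses learned positional embeddings with a context of 1024 tokens, which leaves room for 1022 residues once the start and end tokens are added. Whether scripts/inference.py truncates or skips longer sequences is not stated here, so the sketch below only reports which records in a FASTA input exceed that length. The helper names iter_fasta and long_sequences are hypothetical and are not part of ESM-Ezy.

```python
from pathlib import Path


def iter_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def long_sequences(lines, max_len=1022):
    """Return the headers of sequences longer than max_len residues."""
    return [header for header, seq in iter_fasta(lines) if len(seq) > max_len]


if __name__ == "__main__":
    # Path copied from the inference command above.
    fasta = Path("data/inference/uniref50.fasta")
    if fasta.is_file():
        with open(fasta, encoding="utf-8") as handle:
            too_long = long_sequences(handle)
        print(f"{len(too_long)} sequences exceed 1022 residues")
    else:
        print(f"{fasta}: not found")
```

For a file the size of UniRef50, the list of headers can itself be large; counting inside the loop instead of collecting headers would keep memory use flat.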

Search

Load the trained ESM-Ezy model and run inference on the candidate sequences:

python scripts/retrieval.py --model_path ckpt/esm1b_t33_650M_UR50S.pt --checkpoint_path ckpt/model_laccase.pkl --candidate_data data/retrieval/candidate.fa --seed_data data/retrieval/fitness.fa --output_path data/retrieval
