ESM-Ezy

Dataset and checkpoint

To get the dataset and model checkpoint, please refer to .

Download the data.zip file and extract it to the data directory.

Download the ckpt.zip file and extract it to the ckpt directory.
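
If you prefer to unpack the archives from Python rather than by hand, the two extraction steps above can be sketched as follows. This is a minimal sketch: `extract_archive` is a hypothetical helper, and it assumes both zip files sit in the repository root and do not already contain a top-level `data/` or `ckpt/` folder (if they do, extract into `.` instead).

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the archive's member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create data/ or ckpt/ if missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example usage (assumed file locations, adjust to where you saved the archives):
# extract_archive("data.zip", "data")
# extract_archive("ckpt.zip", "ckpt")
```

Checking the returned member names is a quick way to confirm that paths such as `data/train/train_positive.fa` end up where the training command expects them.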

Training

To train ESM-Ezy, follow the steps below:

Clone the repository:

git clone https://github.com/westlake-repl/ESM-Ezy.git

Install the required packages:

conda env create -f environment.yml

Download the pre-trained ESM-1b model:

wget https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S.pt -O ckpt/esm1b_t33_650M_UR50S.pt
wget https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S-contact-regression.pt -O ckpt/esm1b_t33_650M_UR50S-contact-regression.pt
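
On systems without `wget`, the same two files can be fetched with the Python standard library. The sketch below uses the URLs and destination paths from the commands above; the helper names `download_targets` and `download_all` are illustrative and not part of the repository.

```python
import urllib.request
from pathlib import Path

BASE_URL = "https://dl.fbaipublicfiles.com/fair-esm/models/"
FILES = [
    "esm1b_t33_650M_UR50S.pt",
    "esm1b_t33_650M_UR50S-contact-regression.pt",
]


def download_targets(ckpt_dir="ckpt"):
    """Return (url, destination path) pairs for the two ESM-1b files."""
    return [(BASE_URL + name, Path(ckpt_dir) / name) for name in FILES]


def download_all(ckpt_dir="ckpt"):
    """Download each file into ckpt_dir, skipping files that already exist."""
    Path(ckpt_dir).mkdir(parents=True, exist_ok=True)
    for url, dest in download_targets(ckpt_dir):
        if dest.exists():
            continue  # already downloaded
        urllib.request.urlretrieve(url, str(dest))


# Example usage (downloads the model weights, which is a large file):
# download_all("ckpt")
```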

Train ESM-Ezy:

python scripts/train.py --train_positive_data data/train/train_positive.fa --train_negative_data data/train/train_negative.fa --test_positive_data data/train/test_positive.fa --test_negative_data data/train/test_negative.fa --model_path ckpt/esm1b_t33_650M_UR50S.pt
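
Before starting a long training run, it can help to confirm that the four FASTA files named in the command above exist and are not empty. The following is a sketch of such a check and is not part of the repository; `count_fasta_records` and `check_training_inputs` are hypothetical helpers, and records are counted simply by their `>` header lines.

```python
from pathlib import Path

TRAINING_FILES = [
    "train_positive.fa",
    "train_negative.fa",
    "test_positive.fa",
    "test_negative.fa",
]


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def check_training_inputs(data_dir="data/train"):
    """Return {file name: record count}; raise if a file is missing or empty."""
    counts = {}
    for name in TRAINING_FILES:
        path = Path(data_dir) / name
        if not path.is_file():
            raise FileNotFoundError(path)
        counts[name] = count_fasta_records(path)
        if counts[name] == 0:
            raise ValueError(f"{path} contains no FASTA records")
    return counts


# Example usage:
# print(check_training_inputs("data/train"))
```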

Inference

Run inference on the UniRef50 database:

python scripts/inference.py --model_path ckpt/esm1b_t33_650M_UR50S.pt --checkpoint_path ckpt/model_laccase.pkl --inference_data data/inference/uniref50.fasta --output_path data/retrieval

Search

Load the trained ESM-Ezy model and run inference on the candidate sequences:

python scripts/retrieval.py --model_path ckpt/esm1b_t33_650M_UR50S.pt --checkpoint_path ckpt/model_laccase.pkl --candidate_data data/retrieval/candidate.fa --seed_data data/retrieval/fitness.fa --output_path data/retrieval
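
To launch the inference and search commands from a single Python entry point, the sketch below builds both argument lists using only the flags and paths shown in this README. The function names are illustrative, and running the two steps back to back is an assumption: the README does not state whether `retrieval.py` consumes files written by `inference.py`.

```python
import subprocess
import sys

MODEL = "ckpt/esm1b_t33_650M_UR50S.pt"
CHECKPOINT = "ckpt/model_laccase.pkl"


def inference_cmd(output_path="data/retrieval"):
    """Argument list for scripts/inference.py, mirroring the README command."""
    return [sys.executable, "scripts/inference.py",
            "--model_path", MODEL,
            "--checkpoint_path", CHECKPOINT,
            "--inference_data", "data/inference/uniref50.fasta",
            "--output_path", output_path]


def retrieval_cmd(output_path="data/retrieval"):
    """Argument list for scripts/retrieval.py, mirroring the README command."""
    return [sys.executable, "scripts/retrieval.py",
            "--model_path", MODEL,
            "--checkpoint_path", CHECKPOINT,
            "--candidate_data", "data/retrieval/candidate.fa",
            "--seed_data", "data/retrieval/fitness.fa",
            "--output_path", output_path]


def run_all():
    """Run both steps in order (an assumed order); stop at the first failure."""
    for cmd in (inference_cmd(), retrieval_cmd()):
        subprocess.run(cmd, check=True)


# Example usage, from the repository root:
# run_all()
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` raises `subprocess.CalledProcessError` if either script exits with a non-zero status.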
