Skip to content

westlake-repl/ESM-Ezy

Repository files navigation

ESM-Ezy

Dataset and checkpoint

To get dataset and model checkpoint, please refer to DOI.

Download the data.zip file and extract it to the data directory.

Download the ckpt.zip file and extract it to the ckpt directory.

Training

To train ESM-Ezy, follow the steps below:

  1. Clone the repository:
git clone https://github.com/westlake-repl/ESM-Ezy.git
  1. Install the required packages:
conda env create -f environment.yml
  1. Download the pre-trained ESM-1b model:
wget https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S.pt -O ckpt/esm1b_t33_650M_UR50S.pt
wget https://dl.fbaipublicfiles.com/fair-esm/models/esm1b_t33_650M_UR50S-contact-regression.pt -O ckpt/esm1b_t33_650M_UR50S-contact-regression.pt
  1. Train ESM-Ezy:
python scripts/train.py --train_positive_data data/train/train_positive.fa --train_negative_data data/train/train_negative.fa --test_positive_data data/train/test_positive.fa --test_negative_data data/train/test_negative.fa --model_path ckpt/esm1b_t33_650M_UR50S.pt

inference

  1. inference from uniref50 database:
python scripts/inference.py --model_path ckpt/esm1b_t33_650M_UR50S.pt --checkpoint_path ckpt/model_laccase.pkl --inference_data data/inference/uniref50.fasta  --output_path data/retrieval

Search

  1. load the trained ESM-Ezy model and inference on the candidate sequences:
python scripts/retrieval.py --model_path ckpt/esm1b_t33_650M_UR50S.pt --checkpoint_path ckpt/model_laccase.pkl --candidate_data data/retrieval/candidate.fa --seed_data data/retrieval/fitness.fa  --output_path data/retrieval

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published