Congratulations on a great paper! We are trying to reproduce the finetuning of ProGen (ProGen1, the Nature Biotechnology version) on lysozyme, and then plan to use the same pipeline for another protein of interest. However, while trying to reproduce the finetuning pipeline, we found that the GitHub repo linked in the NBT paper is for ProGen2. Although the Zenodo link contains the ProGen1 code we need, the training datasets (both the pre-training dataset and the finetuning ones) are missing. Furthermore, the Zenodo codebase appears to be a development snapshot that has not been cleaned up for public use, and it would be difficult for us (experimental biologists) to run. If possible, would you mind releasing a cleaned-up open-source version (like ProGen2) with instructions for sampling and pretraining? This would be very welcome in the AI+Protein community and among experimental biologists who are not experts in transformer models. Many thanks in advance!