
Releasing cleaned version of ProGen1? #30

Open
canallee opened this issue Apr 22, 2023 · 1 comment

Comments

@canallee

Congratulations on a great paper! We are trying to reproduce the finetuning of ProGen (ProGen1, the Nature Biotechnology version) on lysozyme, and then to apply the same pipeline to another protein of interest to us. While doing so, we found that the GitHub repo linked in the NBT paper is for ProGen2. The Zenodo link does contain the ProGen1 code, but the training datasets (both the pre-training dataset and the finetuning ones) are missing. Furthermore, the Zenodo codebase appears to be a development snapshot that has not been cleaned up for public use, and it would be difficult for us (experimental biologists) to run. If possible, would you mind releasing a cleaned, open-source version (like ProGen2) with instructions for sampling and pretraining? This would be very welcome in the AI+Protein community and among experimental biologists who are not experts in transformer models. Many thanks in advance!

@Admire7494

This is exactly the problem I ran into! Please provide instructions! Many thanks in advance!

2 participants