This is a fork of ProteinGym that adds the ability to include Foldseek sequences in the DMS data CSV files.
To do so, run the following command from the repository root (after adapting the paths in the script to your local machine):
bash update_data_with_foldseek.sh
This will create a new directory containing the updated CSV files.
Note: Foldseek must be installed before you run this script. You can download Foldseek from here.
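As a quick sanity check before running the update, you could confirm that Foldseek is reachable from your shell. The sketch below is not part of this repository: `require_cmd` and `run_update` are hypothetical helper names, and it assumes the Foldseek binary is called `foldseek` and is found via `PATH` (if your copy of the script points at an absolute path instead, adjust accordingly).

```shell
#!/usr/bin/env bash

# Hypothetical helper (not in this repo): succeeds only if the
# named command can be found by the current shell.
require_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical wrapper: run the update script only when Foldseek
# is available; otherwise print a hint and return non-zero.
# Assumes the binary is named "foldseek" and is on PATH.
run_update() {
    if ! require_cmd foldseek; then
        echo "foldseek not found on PATH; install it first" >&2
        return 1
    fi
    bash update_data_with_foldseek.sh
}
```

After sourcing these functions, calling `run_update` from the repository root either runs the script or stops early with a message, rather than failing partway through.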
The original README for ProteinGym can be found here.
This project is available under the MIT license found in the LICENSE file in this GitHub repository.