Add demo data file #11

Open
jusjosgra opened this issue Jun 24, 2023 · 2 comments

I tried running the CDR redesign example you provide by downloading the FASTA file for 1JPT from the PDB; however, the file format does not appear to be compatible with what IgLM expects. Could you please provide an explicit example of what the input files should look like? This would help with accessibility.

Thanks for providing the code; in general, the interface is very usable.

jusjosgra commented Jun 24, 2023

I am able to initiate infilling with the following FASTA file and command:

fasta

>1JPT_1
DIQMTQSPSSLSASVGDRVTITCRASRDIKSYLNWYQQKPGKAPKVLIYYATSLAEGVPSRFSGSGSGTDYTLTISSLQPEDFATYYCLQHGESPWTFGQGTKVEIKRTVAAPSVFIFPPSDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVTHQGLSSPVTKSFNRGEC
>1JPT_2
EVQLVESGGGLVQPGGSLRLSCAASGFNIKEYYMHWVRQAPGKGLEWVGLIDPEQGNTIYDPKFQDRATISADNSKNTAYLQMNSLRAEDTAVYYCARDTAAYFDYWGQGTLVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTHT

command
iglm_infill data/antibodies/rcsb_pdb_1JPT.fasta 1JPT_2 98 106 --chain_token [HEAVY] --species_token [HUMAN] --num_seqs 100

However, this results in the following error, which is printed repeatedly (seemingly in an infinite loop):

Input length of input_ids is 221, but `max_length` is set to 150. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.
Input length of input_ids is 221, but `max_length` is set to 150. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.
Input length of input_ids is 221, but `max_length` is set to 150. This can lead to unexpected behavior. You should consider increasing `max_new_tokens`.

I have tried updating max_length to 300, but this had no effect.
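Since the message compares the input length against a limit of 150, it may help to check the record lengths in the FASTA file before running the command. The following is a minimal sketch, not part of IgLM: `parse_fasta` and `records_over_limit` are hypothetical helper names, and treating the limit of 150 as a residue count is an assumption (the model input may also include extra tokens such as the chain and species tokens).

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}.

    The record id is taken as the first word after '>' (e.g. '1JPT_2').
    Sequences that wrap over several lines are joined.
    """
    records: Dict[str, str] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            parts = line[1:].split()
            current = parts[0] if parts else ""
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def records_over_limit(records: Dict[str, str], limit: int = 150) -> Dict[str, int]:
    """Return {record_id: length} for records longer than `limit` residues.

    ASSUMPTION: `limit` is compared against the raw residue count only.
    """
    return {rid: len(seq) for rid, seq in records.items() if len(seq) > limit}


if __name__ == "__main__":
    # Toy input; replace with open("path/to/file.fasta") for a real file.
    example = [">toy_1 some description", "ACDEF", "GHIK", ">toy_2", "LMN"]
    parsed = parse_fasta(example)
    for rid, seq in parsed.items():
        print(rid, len(seq))
    print(records_over_limit(parsed, limit=5))
```

Any record reported by `records_over_limit` would be a candidate for the length warning shown above.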

@jusjosgra

I was able to run CDR infilling successfully by truncating the input sequence. However, this is not really a solution, particularly since I am working with the example PDB entry you provide. Could you advise how I should use this with antibody sequences that are longer than 150 amino acids?
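For reference, the truncation workaround can be sketched roughly as follows. This is only an illustration of what I did by hand, not a recommended fix: `truncate_record` and `to_fasta` are hypothetical helpers, the cutoff passed as `max_residues` is an arbitrary value chosen by the user, and I am assuming the infill positions are 0-based with an exclusive end (Python slice style), which I have not confirmed against the IgLM code.

```python
from typing import Dict


def truncate_record(sequence: str, infill_start: int, infill_end: int,
                    max_residues: int) -> str:
    """Keep the first `max_residues` residues of `sequence`.

    Refuses to truncate if that would cut into the infill span.
    ASSUMPTION: infill_start/infill_end are 0-based, end exclusive.
    """
    if not 0 <= infill_start < infill_end <= len(sequence):
        raise ValueError("infill span is outside the sequence")
    if infill_end > max_residues:
        raise ValueError("truncation would remove part of the infill span")
    return sequence[:max_residues]


def to_fasta(records: Dict[str, str]) -> str:
    """Serialize {record_id: sequence} as single-line FASTA records."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq in records.items())


if __name__ == "__main__":
    # Toy sequence; a real run would use the chain from the FASTA file.
    toy = {"toy_heavy": "ABCDEFGHIJKLMNOP"}
    shortened = {
        rid: truncate_record(seq, infill_start=4, infill_end=8, max_residues=12)
        for rid, seq in toy.items()
    }
    print(to_fasta(shortened), end="")
```

Because the cut is made at the C-terminal end, the infill positions passed on the command line stay valid for the shortened record.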
