About arguments in hallucination.py for two-chain hallucination #9

jongseo-park · 2022-07-27T05:05:52Z

I have several questions about free hallucination to generate protein binders.

Here is my command line input.

python3 /opt/tools/RFDesign/hallucination/hallucinate.py \
    --pdb=./binder.pdb \
    --out=test/test \
    --steps=m300 \
    --num=10 \
    --start_num=1 \
    --mask=13,A9-15,60 \
    --spike=0.05 \
    --spike_fas=./binder_80aa.fasta \
    --exclude_aa=C \
    --receptor=./receptor_trunc.pdb \
    --rec_placement=second \
    --w_rog=1 \
    --rog_thresh=16 \
    --save_pdb=True \
    --track_step 1 \
    --use_template=A9-15

Q1
I want to know whether --mask=10,A6-10,65 (80 aa in total) and --mask=80 behave the same when --spike is set to 0.05.

In addition, since --spike=0.05 means a random sequence, I think it does not matter which sequence is passed to --spike_fas; only the sequence length matters. Is that right?

Q2
In Fig S17-C of the 2022 Science paper, it seems that only the binding target (= receptor?) was used as a template.

If that is right, how can I supply the binding target as a template? Which argument is required?

Or is the binding target automatically provided to hallucination as a template if its pdb file is passed to the --receptor argument?

Q3
To perform free hallucination, I ran the script without --use_template, but an error occurred.

Saving /proc_1/RFDesign/binder/test/test_1: Traceback (most recent call last):
  File "/opt/tools/RFDesign/hallucination/hallucinate.py", line 739, in <module>
    main()
  File "/opt/tools/RFDesign/hallucination/hallucinate.py", line 731, in main
    optimization.save_result(out_prefix, Net, ml, trb['msa'], args, trb, 
  File "/opt/tools/RFDesign/hallucination/util/../optimization.py", line 179, in save_result
    idx_tmpl = net_kwargs['idx'].cpu().numpy()[0]
KeyError: 'idx'

When I pass a dummy value (like Z999-1000) to --use_template, the script works.

Is this the proper way to perform free hallucination?

Sincerely,

jongseo

jueseph · 2022-07-27T21:10:45Z

Sorry, the free binder hallucination was added somewhat recently, toward the end of our paper, so these features are not as well documented. See my responses below.

*Q1* I want to know whether --mask=10,A6-10,65 (80 aa in total) and --mask=80 behave the same when --spike is set to 0.05.

Your first --mask argument says to hallucinate 10 residues, followed by a motif consisting of residues A6-10 from the input pdb, followed by 65 hallucinated residues. The second --mask argument would in theory specify a completely hallucinated protein of 80 residues, but in practice it will probably trigger an error. If you want completely unconstrained hallucination, include at least one motif residue, e.g. 80,A6, and set the cce loss weight to 0 (--w_cce 0). That way the motif won't affect the optimization. The --spike argument specifies an initialization for the sequence, so the behavior will still differ depending on which --mask you pass.
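To make the reading of the two mask strings concrete, here is a minimal Python sketch that splits a mask into free (hallucinated) and motif segments and counts residues. It is not RFDesign's own parser: the function names `parse_mask` and `total_length` are hypothetical, and the sketch only handles the two token forms that appear in this thread (a plain integer, and a chain letter followed by a residue number or range).

```python
import re

def parse_mask(mask: str):
    """Split a mask string into segments.

    Hypothetical helper, not part of RFDesign. Returns a list of
    ("free", length) and ("motif", chain, start, end) tuples.
    """
    segments = []
    for token in mask.split(","):
        token = token.strip()
        motif = re.fullmatch(r"([A-Za-z])(\d+)(?:-(\d+))?", token)
        if motif:
            chain = motif.group(1)
            start = int(motif.group(2))
            # A single residue such as "A6" is treated as the range A6-6.
            end = int(motif.group(3)) if motif.group(3) else start
            if end < start:
                raise ValueError(f"reversed motif range: {token!r}")
            segments.append(("motif", chain, start, end))
        elif token.isdigit():
            segments.append(("free", int(token)))
        else:
            raise ValueError(f"unrecognized mask token: {token!r}")
    return segments

def total_length(mask: str) -> int:
    """Count residues implied by a mask: free lengths plus motif range sizes."""
    total = 0
    for seg in parse_mask(mask):
        if seg[0] == "free":
            total += seg[1]
        else:
            total += seg[3] - seg[2] + 1
    return total

for m in ("10,A6-10,65", "13,A9-15,60", "80,A6"):
    print(m, parse_mask(m), total_length(m))
```

Under this reading, both 10,A6-10,65 and 13,A9-15,60 describe 80 residues, while 80,A6 describes 81 (80 free residues plus the single anchoring motif residue).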

In addition, since --spike=0.05 means a random sequence, I think it does not matter which sequence is passed to --spike_fas; only the sequence length matters. Is that right?

Yes, I think this is right. Nowadays we only really use --spike=0.99 (or something else close to 1) with MCMC, because that tells the script to initialize completely with a particular sequence. It's unclear how useful --spike is for gradient descent.

*Q2* In Fig S17-C of the 2022 Science paper, it seems that only the binding target (= receptor?) was used as a template. If that is right, how can I supply the binding target as a template? Which argument is required? Or is the binding target automatically provided to hallucination as a template if its pdb file is passed to the --receptor argument?

Yes, providing --receptor will automatically cause the receptor to be used as a template.

*Q3* To perform free hallucination, I ran the script without --use_template, but an error occurred.

Saving /proc_1/RFDesign/binder/test/test_1: Traceback (most recent call last):
  File "/opt/tools/RFDesign/hallucination/hallucinate.py", line 739, in <module>
    main()
  File "/opt/tools/RFDesign/hallucination/hallucinate.py", line 731, in main
    optimization.save_result(out_prefix, Net, ml, trb['msa'], args, trb, 
  File "/opt/tools/RFDesign/hallucination/util/../optimization.py", line 179, in save_result
    idx_tmpl = net_kwargs['idx'].cpu().numpy()[0]
KeyError: 'idx'

When I pass a dummy value (like Z999-1000) to --use_template, the script works.

If you don't want to template any part of the binder, you can input --use_template no_contig. You can also put --use_template True (to template all of the contigs, usually a binding interface motif) or --use_template A6-10 (representing, e.g., some subset of the interface motif residues).
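Putting the suggestions in this reply together, a free binder hallucination run could be assembled roughly as follows. This is only a sketch: the helper `build_hallucinate_cmd` is hypothetical, the file paths are the ones from the original post, and only flags mentioned in this thread are used. Whether this exact combination runs cleanly on a given RFDesign checkout has not been verified here.

```python
import shlex

def build_hallucinate_cmd(script, pdb, receptor, out_prefix,
                          length=80, anchor="A6", steps="m300", num=10):
    """Assemble an argument list for a free binder hallucination run.

    Hypothetical helper; it only strings together flags named in this thread.
    """
    return [
        "python3", script,
        f"--pdb={pdb}",
        f"--out={out_prefix}",
        f"--steps={steps}",
        f"--num={num}",
        # One anchoring motif residue keeps the mask valid ...
        f"--mask={length},{anchor}",
        # ... and a zero cce weight keeps it from affecting the optimization.
        "--w_cce", "0",
        # Passing a receptor makes it the template automatically.
        f"--receptor={receptor}",
        "--rec_placement=second",
        # Do not template any part of the binder itself.
        "--use_template", "no_contig",
    ]

cmd = build_hallucinate_cmd(
    script="/opt/tools/RFDesign/hallucination/hallucinate.py",
    pdb="./binder.pdb",
    receptor="./receptor_trunc.pdb",
    out_prefix="test/test",
)
print(shlex.join(cmd))  # inspect the command before running it
```

Building the command as a list and printing it with `shlex.join` makes it easy to check the flags before passing the list to `subprocess.run(cmd, check=True)`.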

jongseo-park · 2022-07-28T01:26:07Z

Thank you for the kind and helpful reply!

I have two last questions.

Q1

Is there any way to specify the binding site on the receptor without --use_template?

I want to design binders using unconstrained hallucination with a specific binding site on the receptor.

Or is the right way to use --mask and --use_template together with --w_cce=0?

Q2
For protein binder design, I have already read 2-3 previous papers that describe a docking > sequence design > scoring pipeline.

(Cao et al., 2022 Nature // Dauparas et al., 2022 BioRxiv // Bennett et al., 2022 BioRxiv)
https://www.nature.com/articles/s41586-022-04654-9
https://www.biorxiv.org/content/10.1101/2022.06.03.494563v1
https://www.biorxiv.org/content/10.1101/2022.06.15.495993v1

Comparing those two similar methods with RFDesign, which approach has a higher success rate for designing binder proteins?

Sincerely,

Jongseo

jueseph · 2022-07-28T02:56:01Z

*Q1* Is there any way to specify the binding site on the receptor without --use_template? I want to design binders using unconstrained hallucination with a specific binding site on the receptor. Or is the right way to use --mask and --use_template together with --w_cce=0?

There isn't a way to specify the binding site in this version of the code.

One of the authors is working on a followup manuscript with loss functions for controlling the binding site, but this isn't ready to be released publicly yet.

*Q2* For protein binder design, I have already read 2-3 previous papers that describe a docking > sequence design > scoring pipeline.
(Cao et al., 2022 Nature // Dauparas et al., 2022 BioRxiv // Bennett et al., 2022 BioRxiv)
https://www.nature.com/articles/s41586-022-04654-9
https://www.biorxiv.org/content/10.1101/2022.06.03.494563v1
https://www.biorxiv.org/content/10.1101/2022.06.15.495993v1
Comparing those two similar methods with RFDesign, which approach has a higher success rate for designing binder proteins?

The highest success rate will be the method in Bennett et al.

jongseo-park closed this as completed Jul 28, 2022

geraseva mentioned this issue Nov 21, 2023

Hallucination w/ receptor crashing while saving output #32

Open
