Question about intuition behind AF2 monomer for validation #70

eswan01 · 2024-10-22T17:11:22Z

Thanks for sharing this great binder generation pipeline! I have a few questions about the MPNN redesign/re-prediction step:

In your paper, you state: "These [MPNN-]optimized sequences are then re-predicted using the AF2 monomer model, with 3 recycles and 2-template based models in single sequence mode, to ensure robust and unbiased complex assessment." I'm finding that for some target/binder pairs, my trajectories produce good-looking binders, but they fail on monomer validation, specifically on iptm, as they're poorly docked by the monomer model. Could you please elaborate on the intuition behind using the AF2 monomer weights to fold the designed complex?

Supposing that I only use multimer weights in design and validation, do you think one could achieve similar "robustness" by using models [0, 1, 2] for design and [3, 4] for validation?

The complex and binder prediction models are defined and prepared in

BindCraft/bindcraft.py

Lines 194 to 203 in d2d3cd0

    complex_prediction_model = mk_afdesign_model(protocol="binder", num_recycles=advanced_settings["num_recycles_validation"], data_dir=advanced_settings["af_params_dir"],
                                                 use_multimer=multimer_validation)
    complex_prediction_model.prep_inputs(pdb_filename=target_settings["starting_pdb"], chain=target_settings["chains"], binder_len=length, rm_target_seq=advanced_settings["rm_template_seq_predict"],
                                         rm_target_sc=advanced_settings["rm_template_sc_predict"])
    # compile binder monomer prediction model
    binder_prediction_model = mk_afdesign_model(protocol="hallucination", use_templates=False, initial_guess=False,
                                                use_initial_atom_pos=False, num_recycles=advanced_settings["num_recycles_validation"],
                                                data_dir=advanced_settings["af_params_dir"], use_multimer=multimer_validation)
    binder_prediction_model.prep_inputs(length=length)

The complex prediction model is then called in

BindCraft/bindcraft.py

Lines 223 to 228 in d2d3cd0

    ### Predict mpnn redesigned binder complex using masked templates
    mpnn_complex_statistics, pass_af2_filters = masked_binder_predict(complex_prediction_model,
                                                                      mpnn_sequence['seq'], mpnn_design_name,
                                                                      target_settings["starting_pdb"], target_settings["chains"],
                                                                      length, trajectory_pdb, prediction_models, advanced_settings,
                                                                      filters, design_paths, failure_csv)

When I dig into the masked_binder_predict code, it doesn't look to me like the templates are being masked, since by default in the advanced settings, rm_template_seq_predict and rm_template_sc_predict are both false. Could you please elaborate on the intention behind masked_binder_predict (beyond just refolding with different weights) and how it's being used here?

Best,
Erik


martinpacesa · 2024-10-23T09:49:43Z

Hi there! BindCraft was built to be robust: our goal was a pipeline where every binder has a good chance of working in the lab, rather than having to screen hundreds experimentally. That's why some of the steps might seem over the top. So there is definitely a trade-off between design accuracy and success: certain targets might fail, and some good binders might even be filtered out. That's why in the end we filter with the monomer model, which has never seen complexes. If that model thinks a complex is likely to form, it's a more confident prediction than one from multimer, which has a propensity to form complexes to begin with. Also, with multimer we design with 5 models, rather than the 2 we would have with monomer, which reduces the chances of making adversarial sequences. Therefore I would avoid using multimer for both design and validation, as you might end up with sequences that look good to AF2 but are overfitted to please the model.
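To make the held-out idea concrete, here is a minimal sketch of a filter that only trusts scores from models that took no part in design. It is not BindCraft code: the function name, the score dictionary, the model labels, and the 0.5 ipTM cutoff are all hypothetical and chosen purely for illustration.

```python
from statistics import mean


def passes_held_out_validation(iptm_by_model, design_models, validation_models, min_iptm=0.5):
    """Accept a sequence only if models NOT used during design also score it well.

    All names and the default cutoff are illustrative assumptions,
    not BindCraft's actual filters.
    """
    overlap = set(design_models) & set(validation_models)
    if overlap:
        # A model that helped design the sequence cannot give an unbiased verdict on it.
        raise ValueError(f"models used for both design and validation: {sorted(overlap)}")

    # Average the interface score over the held-out models only.
    scores = [iptm_by_model[name] for name in validation_models]
    return mean(scores) >= min_iptm


# Hypothetical scores: five multimer models used for design, two monomer models held out.
scores = {"multimer_1": 0.85, "multimer_2": 0.88, "multimer_3": 0.90,
          "multimer_4": 0.86, "multimer_5": 0.87,
          "monomer_1": 0.62, "monomer_2": 0.58}
design = ["multimer_1", "multimer_2", "multimer_3", "multimer_4", "multimer_5"]
validation = ["monomer_1", "monomer_2"]
accepted = passes_held_out_validation(scores, design, validation)
```

The overlap check is the important part: a high score from a model that was optimized against says little about whether the sequence is adversarial.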

Yeah, the masked_binder_predict function needs to be renamed; we used to do masking but moved away from it.
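For anyone who wants to check what their own run is doing, a small sketch along these lines reports whether template masking is switched on. It relies only on the two keys quoted above, `rm_template_seq_predict` and `rm_template_sc_predict`; the helper name and the plain-dict input are assumptions, and treating a missing key as false is a choice made for this sketch rather than documented BindCraft behaviour.

```python
def template_masking_status(advanced_settings):
    """Report which parts of the target template would be removed at prediction time.

    Hypothetical helper: missing keys are treated as False here for illustration.
    """
    return {
        # Passed as rm_target_seq in the prep_inputs call quoted in the question.
        "sequence_masked": bool(advanced_settings.get("rm_template_seq_predict", False)),
        # Passed as rm_target_sc in the same call.
        "sidechains_masked": bool(advanced_settings.get("rm_template_sc_predict", False)),
    }


# With both flags false, as described in the question, nothing is masked.
status = template_masking_status({"rm_template_seq_predict": False,
                                  "rm_template_sc_predict": False})
```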
