Reproducing workflow from BespokeFit paper #356

Open
aqemia-jasmin-guven opened this issue Jul 30, 2024 · 3 comments

Comments

@aqemia-jasmin-guven

Description

Hi!

First of all, I wanted to say thanks to @j-wags for having a chat with us about OpenFF tools! We had a very productive conversation and he encouraged me to raise an issue here about some of the questions I had about BespokeFit.

I'm trying to recreate the results from the BespokeFit paper to help me understand the tool before using it in new projects. My main point of confusion is how to run the workflow for a congeneric series of ligands, such as the TYK-2 set.

From the paper, I understood the workflow as follows:

  1. Run openff-fragmenter on the whole series and save the result to JSON. Is this how the bespokefit_fragment_inputs.json file from the SI of the BespokeFit paper was generated? My main question here is how to generate the target torsion SMARTS strings with only the atoms of the central bond of the torsion labelled, rather than all of them.
  2. Run openff-bespokefit on just one ligand, e.g. EJM31.
  3. Remove duplicates from fragment list
  4. Turn off fragmentation in bespokefit
  5. Run bespokefit on custom fragments

Is the cache updated here at some point along the workflow as well?

Essentially, I am struggling to understand the workflow given in the Python scripts from the paper's Zenodo archive.

Additionally, Jeff mentioned that bespokefit should internally deduplicate the fragments; however, I don't think I'm seeing this behaviour. For this, I launched the executor once and then submitted a single SDF containing all the ligands.

Thanks a lot in advance for your help!

Context

Software versions

  • Which operating system and version are you using?
    • Ubuntu 22.04.4 LTS (GNU/Linux x86_64)
  • How did you install BespokeFit?

Installed with environment.yml:

name: bsfit
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - ambertools=23
  - openff-bespokefit
  - openff-fragmenter
  - openff-forcefields
  - openff-qcsubmit
  - openff-toolkit
  - openmmforcefields
  - psi4=1.9
  - qcportal=0.15
  - awswrangler
  • Are you using Apple Silicon? If so, are you running BespokeFit in Rosetta or directly?
    • No.
  • What is the output of running conda list?
Output of conda list


packages in environment at /home/ubuntu/miniconda3/envs/bsfit:

Name Version Build Channel

_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_kmp_llvm conda-forge
ambertools 23.6 cuda_None_nompi_py311h4a53416_105 conda-forge
amberutils 21.0 pypi_0 pypi
amqp 5.2.0 pyhd8ed1ab_1 conda-forge
anyio 4.4.0 pyhd8ed1ab_0 conda-forge
argcomplete 3.4.0 pyhd8ed1ab_0 conda-forge
argon2-cffi 23.1.0 pyhd8ed1ab_0 conda-forge
argon2-cffi-bindings 21.2.0 py311h459d7ec_4 conda-forge
arpack 3.9.1 nompi_h77f6705_101 conda-forge
arrow 1.3.0 pyhd8ed1ab_0 conda-forge
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
async-lru 2.0.4 pyhd8ed1ab_0 conda-forge
async-timeout 4.0.3 pyhd8ed1ab_0 conda-forge
attrs 23.2.0 pyh71513ae_0 conda-forge
aws-c-auth 0.7.22 hbd3ac97_10 conda-forge
aws-c-cal 0.7.1 h87b94db_1 conda-forge
aws-c-common 0.9.23 h4ab18f5_0 conda-forge
aws-c-compression 0.2.18 he027950_7 conda-forge
aws-c-event-stream 0.4.2 h7671281_15 conda-forge
aws-c-http 0.8.2 he17ee6b_6 conda-forge
aws-c-io 0.14.10 h826b7d6_1 conda-forge
aws-c-mqtt 0.10.4 hcd6a914_8 conda-forge
aws-c-s3 0.6.0 h365ddd8_2 conda-forge
aws-c-sdkutils 0.1.16 he027950_3 conda-forge
aws-checksums 0.1.18 he027950_7 conda-forge
aws-crt-cpp 0.27.3 hda66527_2 conda-forge
aws-sdk-cpp 1.11.329 h46c3b66_9 conda-forge
awswrangler 3.9.0 pyhd8ed1ab_0 conda-forge
azure-core-cpp 1.13.0 h935415a_0 conda-forge
azure-identity-cpp 1.8.0 hd126650_2 conda-forge
azure-storage-blobs-cpp 12.11.0 hd2e3451_2 conda-forge
azure-storage-common-cpp 12.7.0 h10ac4d7_1 conda-forge
azure-storage-files-datalake-cpp 12.10.0 haa04155_2 conda-forge
babel 2.14.0 pyhd8ed1ab_0 conda-forge
backports.zoneinfo 0.2.1 py311h38be061_8 conda-forge
basis_set_exchange 0.10 pyhd8ed1ab_1 conda-forge
beautifulsoup4 4.12.3 pyha770c72_0 conda-forge
billiard 4.2.0 py311h459d7ec_0 conda-forge
bleach 6.1.0 pyhd8ed1ab_0 conda-forge
blosc 1.21.6 hef167b5_0 conda-forge
boto3 1.34.148 pyhd8ed1ab_0 conda-forge
botocore 1.34.148 pyge310_1234567_0 conda-forge
brotli 1.1.0 hd590300_1 conda-forge
brotli-bin 1.1.0 hd590300_1 conda-forge
brotli-python 1.1.0 py311hb755f60_1 conda-forge
bson 0.5.9 py_0 conda-forge
bzip2 1.0.8 h4bc722e_7 conda-forge
c-ares 1.32.3 h4bc722e_0 conda-forge
c-blosc2 2.15.0 h6d6b9e4_1 conda-forge
ca-certificates 2024.7.4 hbcca054_0 conda-forge
cached-property 1.5.2 hd8ed1ab_1 conda-forge
cached_property 1.5.2 pyha770c72_1 conda-forge
cachetools 5.4.0 pyhd8ed1ab_0 conda-forge
cairo 1.18.0 hebfffa5_3 conda-forge
celery 5.3.6 pyhd8ed1ab_0 conda-forge
certifi 2024.7.4 pyhd8ed1ab_0 conda-forge
cffi 1.16.0 py311hb3a22ac_0 conda-forge
chardet 5.2.0 py311h38be061_1 conda-forge
charset-normalizer 3.3.2 pyhd8ed1ab_0 conda-forge
chemper 1.0.1 pyhd8ed1ab_0 conda-forge
click 8.1.7 unix_pyh707e725_0 conda-forge
click-didyoumean 0.3.1 pyhd8ed1ab_0 conda-forge
click-option-group 0.5.6 pyhd8ed1ab_0 conda-forge
click-plugins 1.1.1 py_0 conda-forge
click-repl 0.3.0 pyhd8ed1ab_0 conda-forge
colorama 0.4.6 pyhd8ed1ab_0 conda-forge
comm 0.2.2 pyhd8ed1ab_0 conda-forge
contourpy 1.2.1 py311h9547e67_0 conda-forge
cudatoolkit 11.8.0 h4ba93d1_13 conda-forge
cycler 0.12.1 pyhd8ed1ab_0 conda-forge
debugpy 1.8.2 py311h4332511_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
dkh 1.2 hd59d2e7_0 conda-forge
edgembar 0.2 pypi_0 pypi
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
exceptiongroup 1.2.2 pyhd8ed1ab_0 conda-forge
executing 2.0.1 pyhd8ed1ab_0 conda-forge
expat 2.6.2 h59595ed_0 conda-forge
fastapi 0.86.0 pyhd8ed1ab_0 conda-forge
fftw 3.3.10 nompi_hf1063bd_110 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 h77eed37_2 conda-forge
fontconfig 2.14.2 h14ed4e7_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.53.1 py311h61187de_0 conda-forge
forcebalance 1.9.6 py311h2b7392c_2 conda-forge
fqdn 1.5.1 pyhd8ed1ab_0 conda-forge
freetype 2.12.1 h267a509_2 conda-forge
freetype-py 2.3.0 pyhd8ed1ab_0 conda-forge
future 1.0.0 pyhd8ed1ab_0 conda-forge
gau2grid 2.0.7 h4ab18f5_3 conda-forge
geometric 1.0.2 pyhd8ed1ab_0 conda-forge
gflags 2.2.2 he1b5a44_1004 conda-forge
glog 0.7.1 hbabe93e_0 conda-forge
greenlet 3.0.3 py311hb755f60_0 conda-forge
gtest 1.14.0 h434a139_2 conda-forge
h11 0.14.0 pyhd8ed1ab_0 conda-forge
h2 4.1.0 pyhd8ed1ab_0 conda-forge
h5py 3.11.0 nompi_py311h439e445_102 conda-forge
hdf4 4.2.15 h2a13503_7 conda-forge
hdf5 1.14.3 nompi_hdf9ad27_105 conda-forge
hpack 4.0.0 pyh9f0ad1d_0 conda-forge
httpcore 1.0.5 pyhd8ed1ab_0 conda-forge
httpx 0.27.0 pyhd8ed1ab_0 conda-forge
hyperframe 6.0.1 pyhd8ed1ab_0 conda-forge
icu 75.1 he02047a_0 conda-forge
idna 3.7 pyhd8ed1ab_0 conda-forge
importlib-metadata 8.2.0 pyha770c72_0 conda-forge
importlib_metadata 8.2.0 hd8ed1ab_0 conda-forge
importlib_resources 6.4.0 pyhd8ed1ab_0 conda-forge
ipykernel 6.29.5 pyh3099207_0 conda-forge
ipython 8.26.0 pyh707e725_0 conda-forge
ipywidgets 8.1.3 pyhd8ed1ab_0 conda-forge
isoduration 20.11.0 pyhd8ed1ab_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jinja2 3.1.4 pyhd8ed1ab_0 conda-forge
jmespath 1.0.1 pyhd8ed1ab_0 conda-forge
joblib 1.4.2 pyhd8ed1ab_0 conda-forge
json5 0.9.25 pyhd8ed1ab_0 conda-forge
jsonpointer 3.0.0 py311h38be061_0 conda-forge
jsonschema 4.23.0 pyhd8ed1ab_0 conda-forge
jsonschema-specifications 2023.12.1 pyhd8ed1ab_0 conda-forge
jsonschema-with-format-nongpl 4.23.0 hd8ed1ab_0 conda-forge
jupyter-lsp 2.2.5 pyhd8ed1ab_0 conda-forge
jupyter_client 8.6.2 pyhd8ed1ab_0 conda-forge
jupyter_core 5.7.2 py311h38be061_0 conda-forge
jupyter_events 0.10.0 pyhd8ed1ab_0 conda-forge
jupyter_server 2.14.2 pyhd8ed1ab_0 conda-forge
jupyter_server_terminals 0.5.3 pyhd8ed1ab_0 conda-forge
jupyterlab 4.2.4 pyhd8ed1ab_0 conda-forge
jupyterlab_pygments 0.3.0 pyhd8ed1ab_1 conda-forge
jupyterlab_server 2.27.3 pyhd8ed1ab_0 conda-forge
jupyterlab_widgets 3.0.11 pyhd8ed1ab_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.5 py311h9547e67_1 conda-forge
kombu 5.3.7 py311h38be061_0 conda-forge
krb5 1.21.3 h659f571_0 conda-forge
lcms2 2.16 hb7c19ff_0 conda-forge
ld_impl_linux-64 2.40 hf3520f5_7 conda-forge
lerc 4.0.0 h27087fc_0 conda-forge
libabseil 20240116.2 cxx17_he02047a_1 conda-forge
libaec 1.1.3 h59595ed_0 conda-forge
libarrow 17.0.0 h0a637a3_1_cpu conda-forge
libarrow-acero 17.0.0 he02047a_1_cpu conda-forge
libarrow-dataset 17.0.0 he02047a_1_cpu conda-forge
libarrow-substrait 17.0.0 hc9a23c6_1_cpu conda-forge
libblas 3.9.0 20_linux64_mkl conda-forge
libboost 1.84.0 h0ccab89_4 conda-forge
libboost-python 1.84.0 py311h06317a3_4 conda-forge
libbrotlicommon 1.1.0 hd590300_1 conda-forge
libbrotlidec 1.1.0 hd590300_1 conda-forge
libbrotlienc 1.1.0 hd590300_1 conda-forge
libcblas 3.9.0 20_linux64_mkl conda-forge
libcrc32c 1.1.2 h9c3ff4c_0 conda-forge
libcurl 8.9.0 hdb1bdb2_0 conda-forge
libdeflate 1.20 hd590300_0 conda-forge
libecpint 1.0.7 h3ecfda7_10 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libev 4.33 hd590300_2 conda-forge
libevent 2.1.12 hf998b51_1 conda-forge
libexpat 2.6.2 h59595ed_0 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libgcc-ng 14.1.0 h77fa898_0 conda-forge
libgfortran-ng 14.1.0 h69a702a_0 conda-forge
libgfortran5 14.1.0 hc5f4f2c_0 conda-forge
libglib 2.80.3 h8a4344b_1 conda-forge
libgomp 14.1.0 h77fa898_0 conda-forge
libgoogle-cloud 2.26.0 h26d7fe4_0 conda-forge
libgoogle-cloud-storage 2.26.0 ha262f82_0 conda-forge
libgrpc 1.62.2 h15f2491_0 conda-forge
libhwloc 2.11.1 default_hecaa2ac_1000 conda-forge
libiconv 1.17 hd590300_2 conda-forge
libint 2.9.0 h9bbc0ff_0 conda-forge
libjpeg-turbo 3.0.0 hd590300_1 conda-forge
liblapack 3.9.0 20_linux64_mkl conda-forge
libnetcdf 4.9.2 nompi_h135f659_114 conda-forge
libnghttp2 1.58.0 h47da74e_1 conda-forge
libnsl 2.0.1 hd590300_0 conda-forge
libparquet 17.0.0 h9e5060d_1_cpu conda-forge
libpcm 1.2.3 h4175798_8 conda-forge
libpng 1.6.43 h2797004_0 conda-forge
libpq 16.3 ha72fbe1_0 conda-forge
libprotobuf 4.25.3 h08a7969_0 conda-forge
librdkit 2024.03.5 h79cfef2_1 conda-forge
libre2-11 2023.09.01 h5a48ba9_2 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libsqlite 3.46.0 hde9e2c9_0 conda-forge
libssh2 1.11.0 h0841786_0 conda-forge
libstdcxx-ng 14.1.0 hc0a3c3a_0 conda-forge
libthrift 0.19.0 hb90f79a_1 conda-forge
libtiff 4.6.0 h1dd3fc0_3 conda-forge
libutf8proc 2.8.0 h166bdaf_0 conda-forge
libuuid 2.38.1 h0b41bf4_0 conda-forge
libwebp-base 1.4.0 hd590300_0 conda-forge
libxc-c 6.2.2 cpu_h1b64f48_4 conda-forge
libxcb 1.16 hd590300_0 conda-forge
libxcrypt 4.4.36 hd590300_1 conda-forge
libxml2 2.12.7 he7c6b58_4 conda-forge
libxslt 1.1.39 h76b75d6_0 conda-forge
libzip 1.10.1 h2629f0a_3 conda-forge
libzlib 1.3.1 h4ab18f5_1 conda-forge
llvm-openmp 18.1.8 hf5423f3_0 conda-forge
lxml 5.2.2 py311hc0a218f_0 conda-forge
lz4-c 1.9.4 hcb278e6_0 conda-forge
lzo 2.10 hd590300_1001 conda-forge
markdown-it-py 3.0.0 pyhd8ed1ab_0 conda-forge
markupsafe 2.1.5 py311h459d7ec_0 conda-forge
matplotlib-base 3.9.1 py311hffb96ce_0 conda-forge
matplotlib-inline 0.1.7 pyhd8ed1ab_0 conda-forge
mda-xdrlib 0.2.0 pyhd8ed1ab_0 conda-forge
mdtraj 1.10.0 py311h3f233a9_0 conda-forge
mdurl 0.1.2 pyhd8ed1ab_0 conda-forge
mistune 3.0.2 pyhd8ed1ab_0 conda-forge
mkl 2023.2.0 h84fe81f_50496 conda-forge
mmpbsa-py 16.0 pypi_0 pypi
msgpack-python 1.0.8 py311h52f7536_0 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
nbclient 0.10.0 pyhd8ed1ab_0 conda-forge
nbconvert-core 7.16.4 pyhd8ed1ab_1 conda-forge
nbformat 5.10.4 pyhd8ed1ab_0 conda-forge
ncurses 6.5 h59595ed_0 conda-forge
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
netcdf-fortran 4.6.1 nompi_h228c76a_104 conda-forge
networkx 3.3 pyhd8ed1ab_1 conda-forge
nglview 3.1.2 pyhceb8b5e_1 conda-forge
notebook 7.2.1 pyhd8ed1ab_0 conda-forge
notebook-shim 0.2.4 pyhd8ed1ab_0 conda-forge
numexpr 2.10.0 mkl_py311haeb1ab9_0 conda-forge
numpy 1.26.4 py311h64a7726_0 conda-forge
ocl-icd 2.3.2 hd590300_1 conda-forge
ocl-icd-system 1.0.0 1 conda-forge
openff-amber-ff-ports 0.0.4 pyhca7485f_0 conda-forge
openff-bespokefit 0.2.3 pyhd8ed1ab_1 conda-forge
openff-forcefields 2024.07.0 pyhff2d567_0 conda-forge
openff-fragmenter 0.2.2 pyhd8ed1ab_0 conda-forge
openff-fragmenter-base 0.2.2 pyhd8ed1ab_0 conda-forge
openff-interchange 0.3.18 pyhd8ed1ab_0 conda-forge
openff-interchange-base 0.3.18 pyhd8ed1ab_0 conda-forge
openff-models 0.1.2 pyhca7485f_0 conda-forge
openff-qcsubmit 0.5.0 pyhd8ed1ab_0 conda-forge
openff-toolkit 0.14.5 pyhd8ed1ab_1 conda-forge
openff-toolkit-base 0.14.5 pyhd8ed1ab_1 conda-forge
openff-units 0.2.2 pyhca7485f_0 conda-forge
openff-utilities 0.1.12 pyhd8ed1ab_0 conda-forge
openjpeg 2.5.2 h488ebb8_0 conda-forge
openmm 8.1.2 py311he040c58_2 conda-forge
openmmforcefields 0.14.1 pyhd8ed1ab_0 conda-forge
openssl 3.3.1 h4bc722e_2 conda-forge
optking 0.2.1 pyhd8ed1ab_0 conda-forge
orc 2.0.1 h17fec99_1 conda-forge
overrides 7.7.0 pyhd8ed1ab_0 conda-forge
packaging 23.2 pyhd8ed1ab_0 conda-forge
packmol-memgen 2024.2.9 pypi_0 pypi
pandas 2.2.2 py311h14de704_1 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
panedr 0.8.0 pyhd8ed1ab_0 conda-forge
parmed 4.2.2 py311hb755f60_1 conda-forge
parso 0.8.4 pyhd8ed1ab_0 conda-forge
pcmsolver 1.2.3 py_9 conda-forge
pcre2 10.44 h0f59acf_0 conda-forge
pdb4amber 22.0 pypi_0 pypi
perl 5.32.1 7_hd590300_perl5 conda-forge
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 10.4.0 py311h82a398c_0 conda-forge
pint 0.23 pyhd8ed1ab_1 conda-forge
pip 24.0 pyhd8ed1ab_0 conda-forge
pixman 0.43.2 h59595ed_0 conda-forge
pkgutil-resolve-name 1.3.10 pyhd8ed1ab_1 conda-forge
platformdirs 4.2.2 pyhd8ed1ab_0 conda-forge
plotly 5.23.0 pyhd8ed1ab_0 conda-forge
prometheus_client 0.20.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.47 pyha770c72_0 conda-forge
prompt_toolkit 3.0.47 hd8ed1ab_0 conda-forge
psi4 1.9.1 py311he3e7f2e_3 conda-forge
psutil 6.0.0 py311h331c9d8_0 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pugixml 1.14 h59595ed_0 conda-forge
pure_eval 0.2.3 pyhd8ed1ab_0 conda-forge
py-cpuinfo 9.0.0 pyhd8ed1ab_0 conda-forge
pyarrow 17.0.0 py311hbd00459_0 conda-forge
pyarrow-core 17.0.0 py311h9460f28_0_cpu conda-forge
pybind11-abi 4 hd8ed1ab_3 conda-forge
pycairo 1.26.1 py311h64ab44a_0 conda-forge
pycparser 2.22 pyhd8ed1ab_0 conda-forge
pydantic 1.10.16 py311h331c9d8_0 conda-forge
pyedr 0.8.0 pyhd8ed1ab_0 conda-forge
pygments 2.18.0 pyhd8ed1ab_0 conda-forge
pymbar 3.1.1 py311h7c22f60_3 conda-forge
pymsmt 22.0 pypi_0 pypi
pyparsing 3.1.2 pyhd8ed1ab_0 conda-forge
pysocks 1.7.1 pyha2e5f31_6 conda-forge
pytables 3.9.2 py311ha8f287f_3 conda-forge
python 3.11.9 hb806964_0_cpython conda-forge
python-constraint 1.4.0 py_0 conda-forge
python-dateutil 2.9.0 pyhd8ed1ab_0 conda-forge
python-fastjsonschema 2.20.0 pyhd8ed1ab_0 conda-forge
python-json-logger 2.0.7 pyhd8ed1ab_0 conda-forge
python-tzdata 2024.1 pyhd8ed1ab_0 conda-forge
python_abi 3.11 4_cp311 conda-forge
pytraj 2.0.6 pypi_0 pypi
pytz 2024.1 pyhd8ed1ab_0 conda-forge
pyyaml 6.0.1 py311h459d7ec_1 conda-forge
pyzmq 26.0.3 py311h08a0b41_0 conda-forge
qcelemental 0.28.0 pyhd8ed1ab_0 conda-forge
qcengine 0.30.0 pyhd8ed1ab_0 conda-forge
qcportal 0.15.8 pyhd8ed1ab_0 conda-forge
qhull 2020.2 h434a139_5 conda-forge
rdkit 2024.03.5 py311h845bd92_1 conda-forge
re2 2023.09.01 h7f4b329_2 conda-forge
readline 8.2 h8228510_1 conda-forge
redis-py 5.0.7 pyhd8ed1ab_0 conda-forge
redis-server 7.2.5 he19d79f_0 conda-forge
referencing 0.35.1 pyhd8ed1ab_0 conda-forge
regex 2024.7.24 py311h61187de_0 conda-forge
reportlab 4.2.2 py311h331c9d8_0 conda-forge
requests 2.32.3 pyhd8ed1ab_0 conda-forge
rfc3339-validator 0.1.4 pyhd8ed1ab_0 conda-forge
rfc3986-validator 0.1.1 pyh9f0ad1d_0 conda-forge
rich 13.7.1 pyhd8ed1ab_0 conda-forge
rlpycairo 0.2.0 pyhd8ed1ab_0 conda-forge
rpds-py 0.19.1 py311hb3a8bbb_0 conda-forge
s2n 1.4.17 he19d79f_0 conda-forge
s3transfer 0.10.2 pyhd8ed1ab_0 conda-forge
sander 22.0 pypi_0 pypi
scipy 1.14.0 py311h517d4fd_1 conda-forge
send2trash 1.8.3 pyh0d859eb_0 conda-forge
setuptools 71.0.4 pyhd8ed1ab_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
smirnoff99frosst 1.1.0 pyh44b312d_0 conda-forge
snappy 1.2.1 ha2e4443_0 conda-forge
sniffio 1.3.1 pyhd8ed1ab_0 conda-forge
soupsieve 2.5 pyhd8ed1ab_1 conda-forge
sqlalchemy 2.0.31 py311h331c9d8_0 conda-forge
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
starlette 0.20.4 pyhd8ed1ab_1 conda-forge
tbb 2021.12.0 h434a139_3 conda-forge
tenacity 8.5.0 pyhd8ed1ab_0 conda-forge
terminado 0.18.1 pyh0d859eb_0 conda-forge
tinycss2 1.3.0 pyhd8ed1ab_0 conda-forge
tinydb 4.8.0 pyhd8ed1ab_0 conda-forge
tk 8.6.13 noxft_h4845f30_101 conda-forge
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
tornado 6.4.1 py311h331c9d8_0 conda-forge
torsiondrive 1.1.0 pyhd8ed1ab_0 conda-forge
tqdm 4.66.4 pyhd8ed1ab_0 conda-forge
traitlets 5.14.3 pyhd8ed1ab_0 conda-forge
types-python-dateutil 2.9.0.20240316 pyhd8ed1ab_0 conda-forge
typing-extensions 4.12.2 hd8ed1ab_0 conda-forge
typing_extensions 4.12.2 pyha770c72_0 conda-forge
typing_utils 0.1.0 pyhd8ed1ab_0 conda-forge
tzdata 2024a h0c530f3_0 conda-forge
unidecode 1.3.8 pyhd8ed1ab_0 conda-forge
uri-template 1.3.0 pyhd8ed1ab_0 conda-forge
urllib3 2.2.2 pyhd8ed1ab_1 conda-forge
uvicorn 0.30.3 py311h38be061_0 conda-forge
validators 0.33.0 pyhd8ed1ab_0 conda-forge
vine 5.1.0 pyhd8ed1ab_0 conda-forge
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
webcolors 24.6.0 pyhd8ed1ab_0 conda-forge
webencodings 0.5.1 pyhd8ed1ab_2 conda-forge
websocket-client 1.8.0 pyhd8ed1ab_0 conda-forge
wheel 0.43.0 pyhd8ed1ab_1 conda-forge
widgetsnbextension 4.0.11 pyhd8ed1ab_0 conda-forge
xmltodict 0.13.0 pyhd8ed1ab_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.1.1 hd590300_0 conda-forge
xorg-libsm 1.2.4 h7391055_0 conda-forge
xorg-libx11 1.8.9 hb711507_1 conda-forge
xorg-libxau 1.0.11 hd590300_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h0b41bf4_2 conda-forge
xorg-libxrender 0.9.11 hd590300_0 conda-forge
xorg-libxt 1.3.0 hd590300_1 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h0b41bf4_1003 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.6 h166bdaf_0 conda-forge
yaml 0.2.5 h7f98852_2 conda-forge
zeromq 4.3.5 h75354e8_4 conda-forge
zipp 3.19.2 pyhd8ed1ab_0 conda-forge
zlib 1.3.1 h4ab18f5_1 conda-forge
zlib-ng 2.2.1 he02047a_0 conda-forge
zstandard 0.23.0 py311h5cd10c7_0 conda-forge
zstd 1.5.6 ha6fb4c9_0 conda-forge

@j-wags
Member

j-wags commented Jul 31, 2024

Thanks for following up, @aqemia-jasmin-guven. @jthorton is currently on vacation, but he should be able to provide more useful answers than I can when he returns.

@jthorton
Contributor

Hi @aqemia-jasmin-guven, thanks for trying out bespokefit!

From the paper, I understood the workflow as follows:

That's not quite the production workflow; you might be confusing it with some of the examples we did in the paper, which were slightly more complicated. In practice, it's as simple as submitting a ligand to a running server, which will handle everything for you following the automated workflow defined here. You won't need to worry about deduplicating the fragments or making the SMIRKS patterns; this will all be done for you. I recommend starting with the quick start guide to ensure things are running as expected and then moving onto the TYK2 set.
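As a rough illustration of the server-based route, the two steps (launch an executor, then submit a file) can be written down as command lines. The Python sketch below only assembles and prints the commands; it does not start anything. The subcommands and flags are as recalled from the BespokeFit quick start guide, so treat them as assumptions and check `openff-bespoke executor launch --help` for your installed version. The helper names `launch_command` and `submit_command` are hypothetical.

```python
import shlex
from typing import List


def launch_command(n_fragmenter: int = 1, n_qc: int = 1, n_optimizer: int = 1) -> List[str]:
    """Build the argument list for starting a bespoke executor.

    Flag names are assumed from the quick start guide; verify with --help.
    """
    return [
        "openff-bespoke", "executor", "launch",
        "--n-fragmenter-workers", str(n_fragmenter),
        "--n-qc-compute-workers", str(n_qc),
        "--n-optimizer-workers", str(n_optimizer),
    ]


def submit_command(sdf_path: str, workflow: str = "default") -> List[str]:
    """Build the argument list for submitting one SDF file to a running executor."""
    return [
        "openff-bespoke", "executor", "submit",
        "--file", sdf_path,
        "--workflow", workflow,
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before being run by hand.
    print(shlex.join(launch_command(n_qc=2)))
    print(shlex.join(submit_command("input.sdf")))
```

The lists could be passed to `subprocess.run` once the flags have been checked; the launch command would normally run in its own terminal or as a background process, because the executor has to stay alive while submissions are processed.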

Is the cache updated here at some point along the workflow as well?

The automated workflow will update the cache after every stage, allowing the reuse of parameters and QC data. The cache is stored in the redis.db file, in the directory provided to the CLI.

Additionally, Jeff mentioned that bespokefit should internally deduplicate the fragments; however, I don't think I'm seeing this behaviour. For this, I launched the executor once and then submitted a single SDF containing all the ligands.

That is correct, and this is the recommended way of running. In this mode each molecule will be fragmented, and for any overlapping fragments (in TYK2 there are a lot) only a single set of QC calculations should be performed on each unique fragment. Is there something indicating this is not the case?
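To make the expected behaviour concrete, here is a toy sketch of per-fragment reuse. It is not BespokeFit's implementation: the function `run_unique`, the ligand names and the fragment keys are all made-up placeholders, and keying the cache on a plain string is an assumption (a real cache would need a canonical identifier for the fragment and the torsion being scanned).

```python
from typing import Callable, Dict, List


def run_unique(
    fragments_per_ligand: Dict[str, List[str]],
    compute: Callable[[str], float],
) -> Dict[str, float]:
    """Call `compute` once per unique fragment key, reusing cached results.

    `compute` stands in for an expensive QC calculation.
    """
    cache: Dict[str, float] = {}
    for fragments in fragments_per_ligand.values():
        for fragment in fragments:
            if fragment not in cache:
                # First time this fragment is seen: do the expensive work.
                cache[fragment] = compute(fragment)
            # Otherwise the earlier result is reused and nothing is recomputed.
    return cache


if __name__ == "__main__":
    computed: List[str] = []

    def fake_qc(fragment: str) -> float:
        computed.append(fragment)
        return 0.0

    series = {
        "ligand_a": ["frag_1", "frag_2"],
        "ligand_b": ["frag_2", "frag_3"],
    }
    run_unique(series, fake_qc)
    print(computed)  # ['frag_1', 'frag_2', 'frag_3']
```

Under this model, every ligand still lists all of its fragments in its own results, but the shared fragment is only computed the first time it appears.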

I hope this helps, let me know if you have any other issues!

@aqemia-jasmin-guven
Author

Hi @jthorton, thanks so much for getting back so quickly!

I have some follow-up questions to your reply:

That's not quite the production workflow; you might be confusing it with some of the examples we did in the paper, which were slightly more complicated. In practice, it's as simple as submitting a ligand to a running server, which will handle everything for you following the automated workflow defined here. You won't need to worry about deduplicating the fragments or making the SMIRKS patterns; this will all be done for you.

Just to clarify, for production with multiple ligands, is it correct to input a single sdf containing all the ligands (which is what I have done for the TYK-2 ligands), or is it better to submit the individual ligands using separate submit commands, presumably in the same directory with the same executor?

If we're using separate commands, would it be possible to submit ligands on separate machines, e.g. with the distributed workers option from bespokefit?

I recommend starting with the quick start guide to ensure things are running as expected and then moving onto the TYK2 set.

So I actually already ran the acetaminophen example with the semi-empirical method, and didn't have problems there.

The automated workflow will update the cache after every stage, allowing the reuse of parameters and QC data. The cache is stored in the redis.db file, in the directory provided to the CLI.

Is it possible to include local files in this? For example, if we run the workflow for a series of ligands and afterwards want to run new molecules that share a common scaffold with the previous series, is it possible to update the local cache with the results of our own earlier runs? Is this what the --file option in the update cache command is for?

That is correct, and this is the recommended way of running. In this mode each molecule will be fragmented, and for any overlapping fragments (in TYK2 there are a lot) only a single set of QC calculations should be performed on each unique fragment. Is there something indicating this is not the case?

I ran the workflow with the TYK-2 ligands from a single sdf (attached input.sdf.zip) and ended up with 98 fragments in total. Is that expected? Is there a way to actually tell if QM data for a fragment was computed from scratch or if it was taken from the database? I think the reason I got confused was that I have the outputs and QM scans for all fragments and there are some duplicates across ligands, so I just assumed that these were all computed from scratch.
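One way I could check how many of the 98 fragments are actually distinct is to tally them per ligand. The sketch below assumes I have already extracted a list of (ligand, fragment identifier) pairs from my outputs; how to extract them is not shown, and `audit_fragments`, the ligand names and the fragment keys are hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def audit_fragments(pairs: Iterable[Tuple[str, str]]) -> dict:
    """Summarise total, unique and shared fragments.

    `pairs` holds (ligand name, fragment identifier) tuples. The identifier is
    assumed to be the same string for identical fragments, e.g. a canonical SMILES.
    """
    pairs = list(pairs)
    ligands_by_fragment: Dict[str, Set[str]] = defaultdict(set)
    for ligand, fragment in pairs:
        ligands_by_fragment[fragment].add(ligand)
    # Fragments that occur in more than one ligand are candidates for reuse.
    shared: Dict[str, List[str]] = {
        fragment: sorted(ligands)
        for fragment, ligands in ligands_by_fragment.items()
        if len(ligands) > 1
    }
    return {
        "total": len(pairs),
        "unique": len(ligands_by_fragment),
        "shared": shared,
    }


if __name__ == "__main__":
    example = [
        ("ligand_a", "frag_1"),
        ("ligand_a", "frag_2"),
        ("ligand_b", "frag_2"),
    ]
    print(audit_fragments(example))
```

If the unique count is well below the total, that would at least be consistent with fragments being shared across ligands, although it would not by itself show whether the QM data was recomputed or read from the cache.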

Thanks again for your help so far! Please let me know if I need to clarify any of the above.


3 participants