Error in sampling #7
Comments
As far as I can see, the problem is that the first and last stereocentres are not real. You should remove those and try again.
Okay
I am sorry, but I realize that I pasted the wrong SMILES string. I see now that you posted We will have a look into this case and see how to resolve it.
So, the issue is with the current prior models for Mol2Mol. Those have been trained on pairs from ChEMBL, but pairs whose molecules did not come from the same publication were pruned. This was done under the assumption that molecules from the same publication belong to the same series and hence follow chemical intuition. Unfortunately, this leads to more limited chemistry coverage in the models, which affects the sulfoxide you have in your model. In the end, those priors are essentially just proof-of-concept. At some point in the future we will release models trained on the larger PubChem dataset without making assumptions about how pairs were or should be constructed. For the time being, you can only try to remove the stereochemistry annotation on the sulfur.
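For anyone who wants to try that workaround without editing strings by hand, here is a minimal sketch, assuming plain text substitution on the SMILES string is good enough for your input. The function name `strip_sulfur_stereo` and the example SMILES are made up for illustration and are not part of REINVENT. The sketch only touches `@`/`@@` tags on bracketed sulfur atoms and leaves the extended chirality classes (`@TH1`, `@SP1` and so on) alone; a proper cheminformatics toolkit would be more robust.

```python
import re

# Matches "@" or "@@" directly after the element symbol of a bracketed
# sulfur atom, e.g. "[S@+]" or "[S@@]". An optional isotope prefix is kept.
# Extended chirality classes such as "@TH1" or "@SP1" are deliberately skipped.
SULFUR_CHIRALITY = re.compile(r"\[(\d*[Ss])@@?(?!@|TH|AL|SP|TB|OH)")


def strip_sulfur_stereo(smiles):
    """Return the SMILES with @/@@ tags removed from bracketed sulfur atoms.

    Hypothetical helper for illustration only. Other stereocentres
    (carbon, selenium, ...) are left untouched.
    """
    return SULFUR_CHIRALITY.sub(r"[\1", smiles)


# Made-up example input, not the molecule from this issue.
print(strip_sulfur_stereo("C[S@+]([O-])c1ccccc1"))
```

The brackets are kept on purpose, so the charge and hydrogen count of the sulfur stay exactly as written and only the stereo tag disappears.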
Okay, thanks!!
(reinvent4) rinku@admin:~/REINVENT4/configs/toml$ reinvent -l sampling.log sampling.toml
Traceback (most recent call last):
  File "/home/rinku/miniconda3/envs/reinvent4/bin/reinvent", line 8, in <module>
    sys.exit(main())
  File "/home/rinku/miniconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/Reinvent.py", line 284, in main
    runner(input_config, actual_device, tb_logdir, responder_config)
  File "/home/rinku/miniconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/runmodes/samplers/run_sampling.py", line 101, in run_sampling
    sampled = sampler.sample(input_smilies)
  File "/home/rinku/miniconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/runmodes/samplers/mol2mol.py", line 50, in sample
    dataset = Dataset(smilies, self.model.get_vocabulary(), tokenizer)
  File "/home/rinku/miniconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/models/mol2mol/dataset/dataset.py", line 25, in __init__
    enc = self._vocabulary.encode(tokenized)
  File "/home/rinku/miniconda3/envs/reinvent4/lib/python3.10/site-packages/reinvent/models/mol2mol/models/vocabulary.py", line 60, in encode
    ohe_vect[i] = self._tokens[token]
KeyError: '[S@+]'
Error occured during sampling
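The `KeyError` above is raised when a token from the input SMILES is looked up in the prior's vocabulary and is not there. A rough sketch of how one could pre-check an input before sampling is shown below. Everything in it is an assumption for illustration: the tokenizer regex is a simplified stand-in and not REINVENT's own tokenizer, `find_unknown_tokens` is a hypothetical helper, and `toy_vocabulary` is a made-up token set rather than the real Mol2Mol vocabulary.

```python
import re

# Simplified SMILES tokenizer (an assumption, not REINVENT's tokenizer):
# bracket atoms, two-letter halogens, two-digit ring closures, then any
# remaining single character.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def find_unknown_tokens(smiles, known_tokens):
    """Return tokens of `smiles` missing from `known_tokens`, in order of
    first appearance and without duplicates. Hypothetical helper."""
    unknown = []
    for token in TOKEN_PATTERN.findall(smiles):
        if token not in known_tokens and token not in unknown:
            unknown.append(token)
    return unknown


# Made-up toy vocabulary; a real prior has a much larger token set.
toy_vocabulary = {"C", "c", "1", "(", ")", "[S+]", "[O-]"}

# Made-up example input containing a stereo-annotated sulfur.
print(find_unknown_tokens("C[S@+]([O-])c1ccccc1", toy_vocabulary))
# prints ['[S@+]']
```

A check like this would report the offending token up front instead of failing in the middle of the sampling run.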