Hangs on some invalid SMILES inputs with hanging open parenthesis #60

Closed
supersciencegrl opened this issue Aug 13, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@supersciencegrl

I haven't looked into it, but when testing whether invalid SMILES would work properly (expected to return None), I tried this:

import selfies as sf
smiles = 'cc('
enc = sf.encoder(smiles)

which caused the program to hang and CPU usage to go really high.
When I did a KeyboardInterrupt, this is where it got stuck:

Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    sf.encoder(smiles)
  File "C:\Users\S1024501\AppData\Local\Programs\Python\Python39\lib\site-packages\selfies\encoder.py", line 66, in encoder
    all_selfies.append(_translate_smiles(s))
  File "C:\Users\S1024501\AppData\Local\Programs\Python\Python39\lib\site-packages\selfies\encoder.py", line 178, in _translate_smiles
    selfies, _ = _translate_smiles_derive(smiles_gen, rings, derive_counter)
  File "C:\Users\S1024501\AppData\Local\Programs\Python\Python39\lib\site-packages\selfies\encoder.py", line 230, in _translate_smiles_derive
    N_as_symbols = get_symbols_from_n(branch_len - 1)
  File "C:\Users\S1024501\AppData\Local\Programs\Python\Python39\lib\site-packages\selfies\grammar_rules.py", line 315, in get_symbols_from_n
    n //= base
KeyboardInterrupt
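
One possible reading of the last frame, offered as a guess and not checked against the selfies source: if `get_symbols_from_n` extracts digits with a loop of the form `while n: ...; n //= base`, a negative argument (for example `branch_len - 1` with `branch_len == 0`) would never reach zero, because Python's floor division rounds toward negative infinity. The function name `count_digit_steps` and the step cap below are hypothetical, added only so the sketch terminates. This does not explain why only some of the inputs hang.

```python
def count_digit_steps(n, base=16, max_steps=1000):
    """Count iterations of a `while n: n //= base` loop, giving up after max_steps.

    Hypothetical stand-in for a digit-extraction loop; not the selfies source.
    """
    steps = 0
    while n:
        n //= base  # floor division: -1 // base is -1, so a negative n never reaches 0
        steps += 1
        if steps >= max_steps:
            return None  # the uncapped loop would spin forever
    return steps
```

With non-negative `n` the loop ends after one step per digit; with any negative `n` the value settles at -1 and the cap is hit.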

This was reproducible, and also happened for 'co(', 'cccc(', 'cccc(1', 'cccccc(', '('. Any other value of smiles (valid or invalid!) I've tried so far has worked as expected, including the strings 'c(', 'ccc(', 'ccccc(', 'cccc)', 'cccc(c'.
'ccc(c' did actually return a valid SELFIES string for butadiene, which seems reasonable. Interestingly, ')' returned only '' (an empty string rather than None).
Sorry for semi-deliberately breaking your awesome program :D
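
For anyone who wants to probe inputs like these without freezing their shell, here is a minimal sketch that runs each call in a separate interpreter with a timeout. The helper name `probe`, the `ENCODE_SNIPPET` string, and the 5-second default are assumptions for illustration; only `sf.encoder` comes from the report above.

```python
import subprocess
import sys

# Assumed snippet: encodes one SMILES string with whatever selfies version is installed.
ENCODE_SNIPPET = "import sys, selfies as sf; sf.encoder(sys.argv[1])"


def probe(smiles, snippet=ENCODE_SNIPPET, timeout_s=5.0):
    """Run `snippet` in a fresh interpreter; return 'ok', 'error', or 'timeout'."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", snippet, smiles],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing keeps spinning.
        return "timeout"
    # A non-zero exit covers any uncaught exception, including a missing selfies install.
    return "ok" if result.returncode == 0 else "error"
```

Calling `probe(s)` for each of 'cc(', 'co(', 'cccc(', and '(' should then report 'timeout' for the inputs that hang instead of blocking the session. A child process is used because a busy loop in a thread cannot be interrupted from outside.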

@MarioKrenn6240 added the bug label on Aug 14, 2021
@MarioKrenn6240
Collaborator

Thank you for this bug report, we will fix it in a new version that's coming out very soon!

No reason to say sorry -- we say thanks. A program should never loop infinitely, so please continue trying to break our code :-)

@alstonlo
Collaborator

Hi @supersciencegrl,

In selfies v2.0.0, we have implemented more stringent error checking, which, in particular, also checks for hanging open and closing branch brackets. For example,

import selfies as sf

sf.encoder("cc(")

now raises a selfies.EncoderError that describes the cause of the error:

selfies.exceptions.SMILESParserError: 
	SMILES: cc(
	          ^
	Index:  2
	Reason: hanging '(' bracket
...
...
selfies.exceptions.EncoderError: failed to parse input
	SMILES: cc(
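
Callers who expected `None` for invalid input, as in the original report, could wrap the call. This is a minimal sketch assuming selfies v2.0.0 or later is installed; `encode_or_none` and its `encoder` / `error_types` parameters are hypothetical names, included so the wrapper can also be exercised with a stand-in encoder.

```python
def encode_or_none(smiles, encoder=None, error_types=None):
    """Return the encoded string, or None if the encoder rejects the input."""
    if encoder is None:
        # Assumption: selfies >= 2.0.0, where sf.encoder raises sf.EncoderError.
        import selfies as sf

        encoder, error_types = sf.encoder, (sf.EncoderError,)
    try:
        return encoder(smiles)
    except error_types:
        return None
```

With the defaults, `encode_or_none("cc(")` should return `None` instead of propagating the exception shown above.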

Thanks for the bug report!
