random selfies to valid smiles #123

Open
DawnJYe opened this issue Dec 6, 2024 · 7 comments
Labels
question Further information is requested

Comments

@DawnJYe

DawnJYe commented Dec 6, 2024

Hi! Can this library convert any combination of SELFIES symbols into valid SMILES? I converted random SELFIES strings into SMILES and then parsed the SMILES with RDKit's Chem.MolFromSmiles, but the result was empty. Does this mean that the SMILES decoded from SELFIES are invalid?
Thank you!

@MarioKrenn6240
Collaborator

Hi @DawnJYe, try this code; I hope it works for you. Yes, arbitrary SELFIES strings can be decoded into a semantically and syntactically valid molecule.
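A minimal sketch of such a check, not the code linked above: it draws random symbols from `selfies.get_semantic_robust_alphabet()`, decodes them with `selfies.decoder`, and asks RDKit whether the result parses. The helper name `random_selfies`, the string length, and the seeds are assumptions made for illustration.

```python
import random


def random_selfies(alphabet, n_symbols, seed=None):
    """Concatenate n_symbols randomly chosen SELFIES symbols (hypothetical helper)."""
    rng = random.Random(seed)
    return "".join(rng.choice(alphabet) for _ in range(n_symbols))


def main():
    # Third-party imports are kept local so the helper above works without them.
    import selfies as sf
    from rdkit import Chem

    # Symbols that respect the currently active semantic constraints.
    alphabet = sorted(sf.get_semantic_robust_alphabet())

    for seed in range(5):
        selfies_str = random_selfies(alphabet, 20, seed=seed)
        smiles = sf.decoder(selfies_str)
        mol = Chem.MolFromSmiles(smiles)  # None if RDKit cannot parse the SMILES
        print(repr(smiles), "parsed by RDKit:", mol is not None)


if __name__ == "__main__":
    try:
        main()
    except ImportError as exc:
        print("selfies and rdkit are needed for the demo:", exc)
```

Sorting the alphabet only makes the seeded sampling reproducible; the set returned by the library has no fixed order.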

@DawnJYe
Author

DawnJYe commented Dec 12, 2024

> valid
Thank you for your response. If I build a SELFIES dictionary (alphabet) from my own SMILES, will any combination of its symbols still be valid? I tried forming SELFIES strings from a custom dictionary and encountered an error when passing them to selfies.decoder. Do I have to use the get_semantic_robust_alphabet() function instead?

@MarioKrenn6240
Collaborator

They should still be valid. If not, I guess there is a problem somewhere. Can you share cases that fail? (Note: We regard an empty string as valid; maybe that's the difference from RDKit's output?)

@DawnJYe
Author

DawnJYe commented Dec 19, 2024

Do you mean that after passing SELFIES through the decoder, the resulting SMILES may be an empty string? If so, then I think I'm encountering that situation. The error message I'm getting is “rdkit.Chem.rdmolfiles.MolToSmiles(NoneType)”. So is this situation considered valid in SELFIES, and it's just that RDKit doesn't accept empty strings?
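One way to tell the two situations apart is to check for an empty string before parsing and for `None` afterwards, so that `None` is never passed on to `MolToSmiles`. The sketch below is an assumption-laden illustration: `classify_decoded` is a hypothetical helper, and `stub_parse` only stands in for a real parser such as `rdkit.Chem.MolFromSmiles`, which returns `None` when it cannot parse its input.

```python
def classify_decoded(smiles, parse):
    """Return 'empty', 'rejected', or 'ok' for a decoded SMILES string.

    `parse` is any callable that returns None on failure,
    for example rdkit.Chem.MolFromSmiles.
    """
    if smiles == "":
        # SELFIES regards the empty string as a valid result.
        return "empty"
    return "rejected" if parse(smiles) is None else "ok"


def stub_parse(smiles):
    # Stand-in parser for this demo only: rejects anything containing "X".
    return None if "X" in smiles else object()


print(classify_decoded("", stub_parse))     # empty
print(classify_decoded("CCO", stub_parse))  # ok
print(classify_decoded("CX", stub_parse))   # rejected
```

With RDKit installed, `Chem.MolFromSmiles` can be passed as `parse` in place of the stub.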

@DawnJYe
Author

DawnJYe commented Dec 19, 2024

> They should still be valid. If not, I guess there is a problem somewhere. Can you share cases that fail? (Note: We regard an empty string as valid; maybe that's the difference from RDKit's output?)

Sorry for interrupting again. I have a case where the SELFIES string is '[C][C][Sn][Branch1][Ring1][C][C][Branch1][#Branch2][C][=C][C][=C][C][=C][Ring1][=Branch1][C][Ring1][#Branch1][C][Ring1][=Branch1][C]', and after passing it through selfies.decoder, the generated SMILES is 'C1C[Sn]1(CC)(C2=CC3=CC=C2C)C3C'. However, RDKit seems to treat this SMILES as invalid: Chem.MolFromSmiles returns None. Could you please tell me the reason for this?

@robpollice
Contributor

Are you using the default SELFIES constraints? If yes, the problem is that the default constraints contain no specific entry for Sn. Sn therefore falls back to the default behavior, which allows a valence of up to 8, and that is not meaningful for Sn. You need to create custom SELFIES constraints that limit Sn to a meaningful valence, say 4 for instance, and then use SELFIES with these custom constraints.
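The suggestion above could look roughly like this, assuming the selfies 2.x functions `get_preset_constraints` and `set_semantic_constraints`. The helper `with_atom_limit` is a hypothetical name, and 4 is only the example value mentioned above.

```python
def with_atom_limit(base_constraints, atom, max_bonds):
    """Return a copy of a constraints dict with one atom's bond limit set (hypothetical helper)."""
    updated = dict(base_constraints)
    updated[atom] = max_bonds
    return updated


def main():
    # Third-party import is kept local so the helper above works without it.
    import selfies as sf

    # Start from the default preset and cap Sn at 4 bonds.
    constraints = with_atom_limit(sf.get_preset_constraints("default"), "Sn", 4)
    sf.set_semantic_constraints(constraints)

    selfies_str = (
        "[C][C][Sn][Branch1][Ring1][C][C][Branch1][#Branch2][C][=C][C][=C]"
        "[C][=C][Ring1][=Branch1][C][Ring1][#Branch1][C][Ring1][=Branch1][C]"
    )
    # Decode again under the stricter constraints.
    print(sf.decoder(selfies_str))

    # Restore the default constraints so later calls are unaffected.
    sf.set_semantic_constraints("default")


if __name__ == "__main__":
    try:
        main()
    except ImportError as exc:
        print("selfies is needed for the demo:", exc)
```

The constraints are global state in the library, which is why the sketch resets them at the end.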

@MarioKrenn6240
Collaborator

A small addition: information on customizing SELFIES is here, and in chapter 4.3, "Customization functions", of our software paper.

MarioKrenn6240 added the question label Dec 29, 2024
3 participants