Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Obtaining polymer structures for hydroxyacid isomers with given distribution #12

Closed
eefm1 opened this issue Aug 1, 2024 · 5 comments
Closed

Comments

@eefm1
Copy link

eefm1 commented Aug 1, 2024

Hi makers from m2p,

Hopefully you can help me with an error I don't understand. I want to polymerize random co-polyesters with a given distribution. In order to do that, I used smiles from their corresponding hydroxyacid monomers and a given distribution. Thus, I model these polymers by monomer 1=A-B and monomer 2=A-C, where A is the diol and B & C are the diacids. Strangely, this gives an error if the length of the diol A is the same for both monomers. See sample code below:

data = pd.DataFrame({'smiles':[
'OCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCOC(C1=CC=CC(C(O)=O)=C1)=O', #works
'OCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCOC(C1=CC=CC(C(O)=O)=C1)=O', #ERROR:Ester_ReactionFailed
'OCCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCOC(C1=CC=CC(C(O)=O)=C1)=O', #works
'OCCCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCOC(C1=CC=CC(C(O)=O)=C1)=O' #works
],
'distribution':[[1,1],[1,1],[1,1],[1,1]]
})
data['monomers'] = data.smiles.apply(lambda s: pm.get_monomers(s))
data = pm.thermoplastic(data,DP=5,mechanism='ester',replicate_structures=1)

If I don't specify the distribution, it works fine. Do you have any idea where things go wrong?

Thank you for your response.

Kind regards,
Evelien

@jlaw9
Copy link
Collaborator

jlaw9 commented Aug 1, 2024

Hi Evelien,

I'm not sure I understand how the description of your goal matches your code example, but from the example, I can see that the code is having issues with this monomer OCCOC(C1=CC=C(C(O)=O)C=C1)=O. If you don't give a distribution, then m2p is using only the second monomer OCCOC(C1=CC=CC(C(O)=O)=C1)=O to build the polymer which you can see if you visualize the polymer smiles e.g.,
Chem.Draw.MolsToGridImage([Chem.MolFromSmiles(s) for s in data.sort_index().smiles_polymer],molsPerRow=3, subImgSize=(250, 100)).
But if you do give a non-zero distribution value for both monomers, the code is forced to use the first monomer which is resulting in a failed polymerization. I'm not sure why the polymerization is failing though. Must be something in the poly_ester function. Try stepping/following through that function if you are able. I'll take a closer look soon.

Jeff

@eefm1
Copy link
Author

eefm1 commented Aug 1, 2024

with distribution
without distribution

Hi Jeff,

Thanks for your swift reply. I am trying to make structures of co-polyesters. For example, in PET a little bit of isophtalic acid (I) is often added. Thus, this PET will have a repeating unit structure of (ET)x-(EI)y. In order to model this, I split the two repeating units into their corresponding hydroxyacids HO-(ET)-COOH and HO-EI)-COOH with distribution [x,y] and polymerize them.

I added the outcomes of my script as attachments (note that here I used para and meta substituted benzene rings). The top one is with the distribution specified and reports 3 polymers and 1 error, the bottom is one without distribution and reports 4 polymers (no errors).

If I don't specify the distribution, I get a mix of para and meta benzene rings. That does not match your statement that if you don't specify a distribution only the 2nd monomer is taken, right? If I do specify a distribution (in this case all [1,1]), I obtain polymers for all cases where the diol length was unequal. I.e. not ethylene glycol-based for the 2nd monomer, but a C3 or C4 diol. However, these cases are very rare in the real-world ;-)

So that's why I was really puzzled. I was expecting it to work in neither case, or in all cases, but not in some. I will try to find some time next week to debug step by step, but if you have any idea where to look, that would be very much appreciated,

@jlaw9
Copy link
Collaborator

jlaw9 commented Aug 2, 2024

Ok thanks for the additional details. I think I found the problem. If you look closely at the second example, only the meta benzene ring is present. Looks like if both monomers have the same length of diol (e.g., both are C2), then that's when it fails. If either side has a mismatch diol (e.g., C2 with C1), then they are fine. Here's some code where example 1 and 3 have the same length diol, and example 2 has a mismatch (C2 with C1 diol):

data = pd.DataFrame({'smiles':[
    'OCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCOC(C1=CC=CC(C(O)=O)=C1)=O', # original
    'OCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCOC(C1=CC=CC(C(O)=O)=C1)=O', # remove a C from para side
    'OCCCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCCCOC(C1=CC=CC(C(O)=O)=C1)=O', # add two C's to both sides
    # repeat to show with distribution
    'OCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCOC(C1=CC=CC(C(O)=O)=C1)=O', # original
    'OCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCOC(C1=CC=CC(C(O)=O)=C1)=O', # remove a C from para side
    'OCCCCOC(C1=CC=C(C(O)=O)C=C1)=O.OCCCCOC(C1=CC=CC(C(O)=O)=C1)=O', # add two C's to both sides
    ],
    'distribution':[[],[],[],[1,1],[1,1],[1,1]]
})
data['monomers'] = data.smiles.apply(lambda s: pm.get_monomers(s))
data = pm.thermoplastic(data,DP=5,mechanism='ester',replicate_structures=1)
Chem.Draw.MolsToGridImage([Chem.MolFromSmiles(s) for s in data.sort_index().smiles_polymer],molsPerRow=3, subImgSize=(300, 200))
image

@jlaw9 jlaw9 closed this as completed in 1854f77 Aug 6, 2024
@jlaw9
Copy link
Collaborator

jlaw9 commented Aug 6, 2024

Ok I found the issue. pm.get_monomers() was sorting by molecular weight which used a dictionary with the MW as the key. If multiple monomers had the same MW, only one was kept. Thanks for reporting that bug!

@jlaw9
Copy link
Collaborator

jlaw9 commented Aug 14, 2024

A workaround would be to set sort_by_mw=False for example:

data['monomers'] = data.smiles.apply(lambda s: pm.get_monomers(s, sort_by_mw=False))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants