Convert CAS to SMILES #333

Open
pstahlhofen opened this issue Nov 10, 2021 · 8 comments

@pstahlhofen

Dear webchem developers,
I am aware that converting CAS to SMILES is usually not that complicated. However, under the current circumstances I've spent quite some time on this problem without success, so I thought you might be able to help me out.

What I have tried

  1. Converting CAS to SMILES using the cts_convert function. This didn't return a result for any CAS I tried, so I visited the CTS Proxy, which showed me "Error 500 When calling /rest/to values".
  2. Converting CAS to SMILES using ChemSpider, i.e. the cs_convert function. At this ChemSpider Web Page I was able to turn a single CAS number into a molecule description including a SMILES code. I signed in to ChemSpider and created an API key to automate the process for multiple CAS numbers. However, cs_convert refused to accept the argument from="CAS", yielding Error in match.arg(from, choices=valid) : 'arg' should be one of "csid", "inchikey", "inchi", "smiles", "mol". If ChemSpider generally supports conversion of CAS registry numbers, it would be a nice feature to extend cs_convert to perform this conversion as well.
  3. I was able to convert CAS to SMILES using CACTUS, but I couldn't find support for this API in webchem (or any other R package), and I would rather not rely on shell scripts, as they are brittle and highly platform dependent. Did I perhaps overlook some existing support for CACTUS?
  4. I tried the ci_query function to retrieve the SMILES code, which ran into "Service not available. Returning NA".

Any help is much appreciated.

@stitam self-assigned this on Nov 10, 2021
@stitam
Contributor

stitam commented Nov 10, 2021

Hi @pstahlhofen, thanks for raising this issue.

  1. cts_convert() should be able to convert cas to smiles, yet it doesn't, and other examples seem to be failing as well. I will look into it, thanks for flagging.
  2. While the ChemSpider website supports many things, the APIs are more limited, and last time I checked they did not offer conversions from/to CAS.
  3. CACTUS is supported through cir_query() and it seems to work! Example with ethanol: cir_query("64-17-5", from = "cas", to = "smiles")
  4. Example with ethanol works on my end: ci_query("64-17-5", from = "rn") returns a list, and using sapply(<list>, function(x) x$smiles) returns the smiles for the compound.
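The extraction step in point 4 can be sketched offline. This is only an illustration: the list below is a hypothetical stand-in for what ci_query() returns (the real structure may differ between webchem versions), and extract_smiles() is a helper name made up for this example. It uses vapply() instead of sapply() so that failed lookups come back as NA rather than breaking the result type.

```r
# Hypothetical stand-in for the list returned by ci_query();
# the real structure may differ between webchem versions.
res <- list(
  "64-17-5" = list(name = "Ethanol", smiles = "CCO"),
  "not-a-cas" = NA  # a failed lookup, represented here as NA
)

# Return the smiles field of each entry, or NA when the lookup failed.
# extract_smiles() is an illustrative helper, not part of webchem.
extract_smiles <- function(x) {
  vapply(x, function(entry) {
    if (is.list(entry) && !is.null(entry$smiles)) {
      as.character(entry$smiles)[1]
    } else {
      NA_character_
    }
  }, character(1))
}

smiles <- extract_smiles(res)
```

The result is a character vector named by query, so entries that failed are easy to spot with is.na(smiles).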

You can also use pubchem to get smiles from cas. Again for ethanol, get_cid("64-17-5", from = "xref/rn") returns the CID of the compound, and then pc_sect(702, "canonical smiles") returns the section "canonical smiles" from ethanol's pubchem page, https://pubchem.ncbi.nlm.nih.gov/compound/702#section=Canonical-SMILES
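Since any one of these services can be down at a given time, one option is to try them in turn and keep the first answer. Below is a minimal sketch of that idea, not something webchem provides: first_smiles() is a hypothetical helper, and the resolvers are offline stubs that only imitate the behaviours seen in this thread (an error, an NA, a hit). It assumes each resolver takes a CAS number and returns a character vector.

```r
# Try each resolver in turn and return the first non-missing SMILES.
# first_smiles() is an illustrative helper, not part of webchem.
first_smiles <- function(cas, resolvers) {
  for (name in names(resolvers)) {
    # A resolver that throws is treated like one that returns NA.
    result <- tryCatch(
      resolvers[[name]](cas),
      error = function(e) NA_character_
    )
    result <- as.character(result)
    result <- result[!is.na(result) & nzchar(result)]
    if (length(result) > 0) {
      return(c(source = name, smiles = result[1]))
    }
  }
  c(source = NA_character_, smiles = NA_character_)
}

# Offline stubs that imitate the failures reported in this thread.
resolvers <- list(
  cts = function(cas) stop("Error 500"),
  chemidplus = function(cas) NA_character_,
  cir = function(cas) "CCO"
)

out <- first_smiles("64-17-5", resolvers)
```

In real use, each stub would be replaced by a small wrapper around the corresponding webchem call from the list above, reduced to a plain character vector before it is returned.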

Let me know if these answer your question.

Also I'll keep this issue open until cts_convert() is resolved.

@pstahlhofen
Author

Hi @stitam, thanks for the quick answer! cir_query solved my problem :) See below for details.

  1. Hmm, the HTTP-Status is OK but the strings in the result always seem to be empty.
  2. Alright
  3. cir_query works great!
  4. Aha, ci_query works with from="rn" but not with from="cas". Thanks for the example. If this is permanent, you might want to update the documentation on ci_query, where it says that cas is also supported.

get_cid("64-17-5", from = "xref/rn") ran into "Service not available", and so did get_cid("64-17-5", from = "xref/RN"), which is provided as an example in the docs.

@Aariq added and then removed the "data source" (New data source) label on Nov 10, 2021
@Aariq
Collaborator

Aariq commented Nov 10, 2021

It looks like CTS is down completely right now. Looks like someone has already opened an issue: https://bitbucket.org/fiehnlab/ctsproxy/issues/38/error-500

Looks like they haven't closed any issues in quite some time.

@Aariq
Collaborator

Aariq commented Nov 10, 2021

Related? #257

@pstahlhofen
Author

Yes, I think so

@Aariq
Collaborator

Aariq commented Nov 11, 2021

To clarify, if I remember correctly cts_convert() doesn't currently use CTS's REST API, because it was broken for some time. cts_convert() uses a more web-scraping type approach, but #257 was a reminder to switch to using the REST API if it ever started working again. (Edit: I just checked and it's still broken over a year later because of an expired SSL certificate)

CTS has had a lot of issues in the past, probably because of all the API dependencies it has, and it might be worthwhile contacting someone at the Fiehn Lab to get an idea of their long-term goals for the project before putting any effort into changing/fixing cts_convert(). If the Fiehn Lab isn't planning on maintaining CTS long term (e.g. because they don't have funding or staff), then it may be time to consider cts_convert() soft deprecated / superseded.

@stitam
Contributor

stitam commented Nov 12, 2021

Thanks @Aariq, that is correct, the CTS REST API is not yet implemented in webchem. I contacted them last time the service was down. I'll contact them again and ask about their long-term goals, and then we can decide.

@stitam
Contributor

stitam commented Nov 25, 2021

Hi All,

Update on this issue: the service is back online, but queries are still not working as they used to.

This one works:

webchem::cts_convert("3380-34-5", "cas", "inchikey")
#> $`3380-34-5`
#> [1] "XEFQLINVKFYRCS-UHFFFAOYSA-N" "ZRWRPGGXCSSBAO-UHFFFAOYSA-N"

Created on 2021-11-25 by the reprex package (v2.0.1)

This one doesn't:

webchem::cts_convert("triclosan", "chemical name", "inchikey")
#> $triclosan
#> [1] NA

Created on 2021-11-25 by the reprex package (v2.0.1)
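To spot queries like the second one in a larger batch, the named list can be flattened into a table. This is a rough sketch under assumptions: the list is retyped from the two outputs above rather than fetched live, and flatten_results() is a helper name made up for this example.

```r
# Results retyped from the two reprex outputs above (no network call).
res <- list(
  "3380-34-5" = c("XEFQLINVKFYRCS-UHFFFAOYSA-N", "ZRWRPGGXCSSBAO-UHFFFAOYSA-N"),
  "triclosan" = NA
)

# One row per (query, result) pair; a query with several hits gets several rows.
# flatten_results() is an illustrative helper, not part of webchem.
flatten_results <- function(x) {
  data.frame(
    query = rep(names(x), lengths(x)),
    result = as.character(unlist(x, use.names = FALSE)),
    stringsAsFactors = FALSE
  )
}

tab <- flatten_results(res)

# Queries that returned nothing usable
failed <- unique(tab$query[is.na(tab$result)])
```

Keeping one row per hit also makes it visible that a single CAS number can map to more than one InChIKey, as in the first example.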
