Many thanks @francoiskroll for raising this issue! You are not missing anything. You see this response because this data field is not handled appropriately by the pc_sect() function.
PubChem pages are quite complex in terms of data structure. We can resolve some fields, but we still need to figure out others. It seems that some of the data fields we see on the website do not "live" on the webpage itself: the website only points to another data source and renders the data from there. For the same reason, figures in general are also a problem for pc_sect().
I will mark this as "enhancement" and try to allocate time to fix it, but I cannot promise this will be resolved soon. If you can come up with a solution we would be more than happy to incorporate it in the package!
In case you want to work on a solution yourself: the pc_sect() function is a convenience wrapper around two other functions which are not exported: pc_page() downloads the section of the PubChem page in a convenient format and pc_extract() attempts to further extract the data from it.
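As a starting point for exploring this, the two steps can be called separately so that the raw page is available for inspection even when extraction fails. This is only a rough sketch: `inspect_pc_section()` is a hypothetical helper, and because pc_page() and pc_extract() are internal, the argument lists used below are assumptions that should be checked against the package source. pc_sect() may also normalise the section name before passing it on.

```r
# Sketch only: pc_page() and pc_extract() are internal functions, so the
# argument lists used here are assumptions and may differ between versions.
inspect_pc_section <- function(cid, section, domain = "compound") {
  if (!requireNamespace("webchem", quietly = TRUE)) {
    stop("Package 'webchem' is required for this sketch.")
  }
  # Step 1: download the raw section (needs internet access).
  page <- webchem:::pc_page(cid, section = section, domain = domain)
  # Step 2: attempt the extraction, keeping any error object for inspection.
  extracted <- tryCatch(
    webchem:::pc_extract(page, section),
    error = function(e) e
  )
  list(page = page, extracted = extracted)
}

# Example call (not run here; needs network access):
# res <- inspect_pc_section(2244, "Associated Disorders and Diseases")
# str(res$page, max.level = 2)
```

Looking at `res$page` should show which parts of the section are present in the download and which are only references to an external source.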
Thanks a lot for your answer. Ok great, I might look into it.
Not an answer in the context of your package, but in case it's useful for another user: for the data from the Therapeutic Target Database (TTD) specifically, a workaround is to download it directly from them: http://idrblab.net/web/full-data-download; Drug to disease mapping with ICD identifiers.
The file is small (a few MB) and the format is fairly simple. It uses TTD IDs though, so you will need to convert the PubChem CIDs (or whatever identifiers you use). Luckily, TTD provides the necessary data as well: http://idrblab.net/web/full-data-download; Cross-matching ID between TTD drugs and public databases.
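The conversion step could look roughly like the sketch below. The two data frames are small made-up stand-ins for the parsed TTD downloads: the column names (`ttd_id`, `pubchem_cid`, `disease`, `icd`), the TTD IDs and the disease entries are all placeholders, not the real file layout or real data, and `diseases_for_cid()` is a hypothetical helper.

```r
# Placeholder stand-in for the parsed "Drug to disease mapping" file.
drug_disease <- data.frame(
  ttd_id  = c("TTD_A", "TTD_A", "TTD_B"),
  disease = c("Disease A", "Disease B", "Disease C"),
  icd     = c("ICD_A", "ICD_B", "ICD_C"),
  stringsAsFactors = FALSE
)

# Placeholder stand-in for the parsed "Cross-matching ID" file.
crossmatch <- data.frame(
  ttd_id      = c("TTD_A", "TTD_B"),
  pubchem_cid = c("2244", "0000"),
  stringsAsFactors = FALSE
)

# Look up the TTD ID(s) for a PubChem CID, then return the matching diseases.
diseases_for_cid <- function(cid, crossmatch, drug_disease) {
  ids <- crossmatch$ttd_id[crossmatch$pubchem_cid == as.character(cid)]
  drug_disease[drug_disease$ttd_id %in% ids, c("disease", "icd"), drop = FALSE]
}

res <- diseases_for_cid(2244, crossmatch, drug_disease)
print(res)
```

With the real files, the two data frames would first have to be built by parsing the downloads; only the lookup logic carries over.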
Happy to share more details/code if it's useful to anyone. Get in touch.
Thanks for an amazing package! Incredibly helpful.
I tagged this as "bug" but it might be a "Database suggestion", depends on the answer...
As an example, let's take the PubChem page for aspirin (CID=2244), section "Associated Disorders and Diseases": https://pubchem.ncbi.nlm.nih.gov/compound/2244#section=Associated-Disorders-and-Diseases
Is there any way I can extract 'useful' data from this sort of section? Namely, the list of diseases here.
I tried
It does run, but it returns:
which is not really what I am interested in...
Am I missing something?
Thanks!