expanded functionality for biolink:publications edge-attribute #677
Issues we'll want to deal with at some point:
Known issue, set-aside and out-of-scope for now: The documentation says we "MUST report only one identifier per publication" and must report the CURIE (not the full URL) whenever we can. But right now, we won't follow this in these cases:
Finally: the spec says a second |
My current plan is to check whether the prefix (including the ":", so like "PMID:") is already there, in any casing, and if so, strip it. Then add the correctly-cased prefix back. Are there any other cases that should be handled?
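The strip-then-add plan above could be sketched roughly like this. It is a Python illustration only, not the project's code; the function name `normalize_prefix` and the whitespace trimming are assumptions made for the example.

```python
def normalize_prefix(raw_id, prefix="PMID"):
    """Return raw_id as a string carrying exactly one correctly-cased prefix.

    Hypothetical helper: the name and the trimming of surrounding
    whitespace are assumptions, not behavior taken from the issue.
    """
    text = str(raw_id).strip()
    marker = prefix + ":"
    # Case-insensitive check for an existing prefix such as "pmid:" or "Pmid:".
    if text.lower().startswith(marker.lower()):
        text = text[len(marker):].strip()
    # Always re-add the prefix in its canonical casing.
    return marker + text


print(normalize_prefix("pmid:12345"))
print(normalize_prefix(12345))
print(normalize_prefix("DOI:10.1000/xyz", prefix="doi"))
```

Stripping before adding means an ID that already has the prefix in the wrong casing and a bare ID both end up in the same form.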
It seems like the current code is attempting to filter out only PMID IDs. But if they are using the same/known prefixes for the other ID types, then we have two options
If we wanted to use the CURIEs when possible, we could parse the URL, looking for the base URLs that identify PMID/PMCID (i.e. http://www.ncbi.nlm.nih.gov/pmc/, http://www.ncbi.nlm.nih.gov/pubmed), and then translate it to a PMID/PMCID/etc |
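The URL-to-CURIE idea in the comment above could be sketched as follows. This is a Python illustration under stated assumptions: the two NCBI hosts come from the comment, but the exact path suffixes in `BASE_TO_PREFIX`, the function name `url_to_curie`, and the rule "only convert when the remainder looks like a bare ID" are all hypothetical.

```python
# Hypothetical lookup table: scheme-less base URL -> CURIE prefix.
# The path suffixes ("pubmed/", "pmc/articles/") are assumptions.
BASE_TO_PREFIX = {
    "www.ncbi.nlm.nih.gov/pubmed/": "PMID",
    "www.ncbi.nlm.nih.gov/pmc/articles/": "PMCID",
}


def url_to_curie(url):
    """Return a CURIE when the URL matches a known base, else the URL unchanged."""
    text = url.strip()
    lowered = text.lower()
    # Drop the scheme so http and https are treated the same way.
    for scheme in ("https://", "http://"):
        if lowered.startswith(scheme):
            rest = text[len(scheme):]
            break
    else:
        return text  # not an http(s) URL, leave it alone
    for base, prefix in BASE_TO_PREFIX.items():
        if rest.lower().startswith(base):
            local_id = rest[len(base):].strip("/")
            # Convert only when what remains looks like a single bare ID;
            # search pages, sub-paths and fragments stay as URLs.
            if local_id and not any(ch in local_id for ch in "/?#"):
                return prefix + ":" + local_id
    return text


print(url_to_curie("http://www.ncbi.nlm.nih.gov/pubmed/12345"))
print(url_to_curie("https://example.org/paper/1"))
```

Returning the original URL when nothing matches keeps the reference usable even when a conversion is not possible.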
Remember, you can search the PR / branch for SmartAPI yamls that contain the keywords you're testing.
Test for ref_isbn, ref_pmid, ref_url
Test for ref_doi, ref_clinicaltrials (and ref_pmid and ref_url)
Test for ref_pmc, (and ref_url)
Test using bindingdb: ref_pmid, ref_url, ref_doi
|
Converting urls to CURIEs may also not be possible in all cases:
Here are some lists of base URLs that we'd want to turn into CURIE prefixes. The stuff after the URL should be the ID. These are the exact base URLs I've found. Turn into Turn into
Turn into Turn into |
See PRs (description of behavior on api-response-transform PR) |
The corresponding SmartAPI updates have been done, and the registrations have been refreshed. NCATS-Tangerine/translator-api-registry#128 This means all instances with this code deployed (dev/ci/test) should begin working with this feature within minutes (after they pull the latest registry info). This update was needed for this code to work properly. The code isn't backward-compatible, so the old behavior (using the pubmed keyword in response-mapping) wasn't working on the instances that had a deployment with this code. EDIT: until the code from this issue is deployed on Prod, Prod will have wonkiness with how it handles publication info, since it doesn't have the code to process the new response-mapping keywords. Jackson has already made a post in Translator Slack (general channel) informing the consortium of this. |
And info from Aug 9-10th from UI team (Translator slack links):
|
@colleenXu can this be closed as completed? |
Yep let's close this as complete since it's been deployed. The limitations are:
|
The Translator UI is supposed to be able to handle more kinds of "references" (publications) for an edge - not just the PMIDs we provide in the biolink:publications edge-attribute right now. In Translator Slack comms, the UI team has confirmed that they plan to support the specification here. For now, we don't have to worry about "free-text description"-style references (we don't really have any of these).
And I'll explain the spec below...
Implementation
We'd like to adjust / expand our behavior to match this spec and provide more reference info to users, by taking the values from (sometimes multiple) fields, replacing/appending proper prefixes, and putting them into 1 edge-attribute.
Here's what's involved:
Response-mapping keys for biolink:publications's value, and how they should be processed:
ref_pmid (previously pubmed): we want the output-strings to have the prefix PMID
ref_url (previously biolink:source_web_page): no processing needed. The strings are urls
ref_pmcid: we want the output-strings to have the prefix PMCID (however, I made a biolink-model issue because which prefix to use was confusing: Confusion on PMC vs PMCID biolink/biolink-model#1366)
ref_clinicaltrials: we want the output-strings to have the prefix clinicaltrials. However, the spec said putting this data in this edge-attribute was temporary / in-flux...
ref_doi: we want the output-strings to have the prefix doi (biolink-model spelling ref)
ref_isbn: we want the output-strings to have the prefix isbn (biolink-model spelling ref)
SmartAPI overrides
Note: PharmGKB is excluded here because it isn't added to the config file yet. It can be added here if you want, but the stuff listed here should be plenty to test the 6 response-mapping keys above...
biolink:publications edge-attribute (many-to-1). It should have this format:
Potentially-helpful implementation notes:
null or an empty string (ignore, don't add to output?)
value array, after the array has been assembled (and after records are merged into edges...)
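As a rough illustration of the many-to-1 step, here is a Python sketch that collects the six response-mapping keys into a single attribute. Several things here are assumptions rather than facts from this issue: the function and helper names, the attribute shape (`attribute_type_id` plus `value`, since the format itself is not shown above), skipping null/empty values, and removing duplicates from the assembled array. The sketch also works on one record at a time, whereas the note above mentions the array being handled after records are merged into edges.

```python
# Prefix per response-mapping key, as listed in this issue.
# None means "no processing needed" (ref_url values are already URLs).
PREFIX_BY_KEY = {
    "ref_pmid": "PMID",
    "ref_pmcid": "PMCID",
    "ref_clinicaltrials": "clinicaltrials",
    "ref_doi": "doi",
    "ref_isbn": "isbn",
    "ref_url": None,
}


def _as_list(value):
    """Wrap a single value in a list; treat None as no values."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def _with_prefix(raw, prefix):
    """Return the reference string, or "" when there is nothing to report."""
    text = str(raw).strip()
    if not text or prefix is None:
        return text
    marker = prefix + ":"
    if text.lower().startswith(marker.lower()):
        text = text[len(marker):].strip()
    return marker + text if text else ""


def build_publications_attribute(record):
    """Hypothetical helper: merge all ref_* fields into one attribute dict."""
    values, seen = [], set()
    for key, prefix in PREFIX_BY_KEY.items():
        for raw in _as_list(record.get(key)):
            if raw is None:
                continue  # assumption: nulls are ignored, not emitted
            ref = _with_prefix(raw, prefix)
            if ref and ref not in seen:  # assumption: drop empties and duplicates
                seen.add(ref)
                values.append(ref)
    if not values:
        return None
    # Assumed shape; the format referred to in the issue is not shown.
    return {"attribute_type_id": "biolink:publications", "value": values}


record = {"ref_pmid": ["pmid:123", 123, None, ""], "ref_url": "https://example.org/a"}
print(build_publications_attribute(record))
```

Keeping the key-to-prefix mapping in one table means adding another reference type later is a one-line change.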