Standard/conventions for representing supporting publications #398
Comments
Proposal: Given the state of affairs described above (i.e. the current preference of KPs to capture lists of pub ids in a single Attribute object, and the availability of TMKP's publication metadata endpoint to provision publication info on demand), the simplest path forward may be to:
|
Is it the mere existence of the clinical trial that is providing the evidence, or is it the published outcome/report of the clinical trial that is providing the evidence? I would imagine the latter. Perhaps it is the final report from the trial (I assume there always is one? I don't actually know) that should be cited as a publication? Perhaps these identifiers identify studies, yes, but most crucially identify the published outcome? |
There are also considerations about the form/syntax of the value for a CT.gov record. It is currently referenced as a string representing the identifier, e.g. NCT00222573. Should we require that a CURIE or URL form of this be captured? Bioregistry.io already defines a prefix we could use, and an expansion. See https://bioregistry.io/registry/clinicaltrials. Assuming this means we can require CURIE/URL-based representation of clinical trials, consider the concrete proposals below for how we reference supporting clinical trials. Examples below show supporting clinical trials alongside references to publications referenced as CURIEs/URLs vs free text, per the decision to split these into separate Attribute objects here. Option 1: Consider clinical trial record ids to be Publications, and capture them using the
|
I don't have a strong opinion, but I think I'm liking option 1. It depends a little on my abstract question above to which there was no answer provided:
|
May 22 Update: Clinical trial records in ClinicalTrials.gov are often referenced as support for edges reporting a drug to treat a condition.
"value": "clinicaltrials:NCT00222573"
We also decided that any URLs that allow users to link directly to the ct.gov site to explore these clinical trial records should be
"value": "clinicaltrials:NCT00222573"
"value_url": "https://clinicaltrials.gov/search?id=%22NCT03074773%22"
We have yet to settle on what edge property will be the
Concrete examples of these competing proposals are presented below - showing how supporting trials would be represented alongside supporting publications for both approaches:
Option 1: Use the
|
Note the 'value_url' as shown ( https://clinicaltrials.gov/search?id=%22NCT03074773%22 ) doesn't work. To get to the trial directly, for the current version of the system, the URL would be: https://clinicaltrials.gov/ct2/show/NCT03074773 Looks like they're developing a new version, and under it, the URL seems to be: https://beta.clinicaltrials.gov/study/NCT03074773 ...and the search syntax: https://beta.clinicaltrials.gov/search?id=NCT03074773 |
Thanks Gustavo - I used the search syntax in my examples only because this is the format of the links we get from sources like ChEMBL as provided by MolePro. Ideally we would want to point people directly to individual trial pages. And good to know about the new version in development. I'd like to find out more about this from you or Kamileh. |
To clarify, the new version appears to be just for their UI. |
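As a rough illustration of the CURIE and direct-link forms discussed above, here is a minimal Python sketch that builds an Attribute-like dict for one trial. The function name trial_attribute, the default biolink:supporting_study property id, and the eight-digit NCT pattern check are assumptions made for illustration and are not part of any agreed specification. The clinicaltrials prefix is the Bioregistry one mentioned above, and the URL pattern is the direct trial link reported above for the current ct.gov site.

```python
import re

# Assumption: an NCT identifier is "NCT" followed by eight digits.
NCT_PATTERN = re.compile(r"NCT\d{8}")


def trial_attribute(nct_id, attribute_type_id="biolink:supporting_study"):
    """Build an Attribute-like dict referencing one ct.gov record.

    The default attribute_type_id is hypothetical; the edge property
    had not been settled at the time of this discussion.
    """
    nct_id = nct_id.strip().upper()
    if not NCT_PATTERN.fullmatch(nct_id):
        raise ValueError(f"not an NCT identifier: {nct_id!r}")
    return {
        "attribute_type_id": attribute_type_id,
        # CURIE form using the Bioregistry 'clinicaltrials' prefix
        "value": f"clinicaltrials:{nct_id}",
        # Direct link to the trial page rather than a search URL
        "value_url": f"https://clinicaltrials.gov/ct2/show/{nct_id}",
    }


attr = trial_attribute("NCT03074773")
print(attr["value"])
print(attr["value_url"])
```

Deriving both value and value_url from the same identifier keeps the two fields from drifting apart, which is one way to avoid a CURIE and a link that point at different trials.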
Outcome of 5-23-23 EPC Call - general preference for 'supporting_study' as the long-term solution, but we may have to wait to implement it until after September, as KPs may not be able to regenerate data to be compliant with this specification before then. In the meantime, the modeling team will get the required edge property into Biolink, and draft an initial specification for representing supporting clinical trials / studies in TRAPI. |
Closing with the creation of the supporting_publications_specification - but as noted in the spec, we need to return to this and modify the specification for how to reference supporting clinical trials. I created #447 as a separate ticket for this issue. |
Testing efforts have identified variability w.r.t. how publications and other documents are represented as support for Edges and Study Result objects in TRAPI messages. The Biolink Model provides two edge properties (supporting_document, and its child publications - which are the topic of the Biolink ticket here), as well as a Publication class (here). I would like to finalize and document conventions for how these elements are applied in Attributes in TRAPI messages to structure this type of information.
Questions/Issues:
1. When to use these different properties? (Ignore for now, as we are voting here on whether to have two properties at all.)
2. How to represent the value of these properties?
   a. Current convention is to use established document identifiers such as PMIDs, PMC ids, DOIs, etc. Should we set an order of preference here, or a requirement to use one over another (e.g. if a pub has a PMC id and a PMID, always use the PMID)? Do we have a resource that maps between different types of identifiers for documents/publications?
3. In cases where there are multiple documents/pubs supporting a given Statement or Study Result, should these be captured as a list/array of document identifiers in the Attribute.value field, or one at a time in separate Attribute objects?
   a. It was decided earlier that separate attributes were best, as this would allow for additional metadata about each publication to be captured in other Attribute fields (e.g. url, description/title, etc). But it wasn't clear whether this was a requirement or a recommendation.
   b. In practice, most KPs are providing lists of PMIDs in a single Attribute, especially in cases where there are very many supporting pubs (can be tens to hundreds in some cases). But the format/syntax used to represent such lists is variable (e.g. a single string vs. a formal array of object references; if a string, comma vs. pipe delimiters).
4. Where/how should we capture additional metadata about a publication (e.g. title, year, url, MeSH terms, journal, authors, etc)?
   a. Some of the use cases for applying such additional publication info are outlined in the ticket here - most relate to supporting O&O for ordering results, and which pubs to show first for a given result.
   b. Some of this info could be captured using Attribute fields such as value_url or description (see json example below) - but this doesn't work if we allow multiple pubs as a list in a single Attribute object.
   c. But TMKP already provides a publication metadata API that provides the types of info listed here - so representing any of this in the publications Attribute seems duplicative.
   d. @bill-baumgartner I assume here that Publications/Documents are represented as instances of Biolink:Publication, which has a collection of node properties for describing things like title, authors, etc?
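To make the list-vs-separate-Attributes question concrete, here is a minimal Python sketch that accepts the variable inputs described above (a formal array, or a single comma- or pipe-delimited string) and emits either one Attribute holding a list or one Attribute per publication. The function names, the one_per_attribute flag, and the sample identifiers are hypothetical and only for illustration. attribute_type_id and value are TRAPI Attribute fields, and the biolink:publications id is assumed from the property name used in this ticket.

```python
import re


def split_publication_ids(raw):
    """Return a de-duplicated, order-preserving list of publication ids.

    Accepts either a list of ids or a single string that uses commas
    or pipes as delimiters (both styles were seen in KP output).
    """
    parts = re.split(r"[,|]", raw) if isinstance(raw, str) else list(raw)
    ids = []
    for part in parts:
        pid = str(part).strip()
        if pid and pid not in ids:
            ids.append(pid)
    return ids


def publication_attributes(raw, one_per_attribute=True):
    """Build Attribute-like dicts for supporting publications."""
    ids = split_publication_ids(raw)
    if one_per_attribute:
        # One Attribute per publication leaves room for per-publication
        # fields such as value_url or description.
        return [{"attribute_type_id": "biolink:publications", "value": pid}
                for pid in ids]
    # A single Attribute holding a formal array of ids.
    return [{"attribute_type_id": "biolink:publications", "value": ids}]


# Made-up identifiers, used only to exercise the function.
for attribute in publication_attributes("PMID:1, PMID:2|PMID:1"):
    print(attribute)
```

Normalizing to a real array first means either convention can be produced from the same input, so the choice between the two stays a modeling decision rather than a parsing problem.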