We have observed that recent changes in the vocabulary affect the operating characteristics of phenotype algorithms, even within the same data source. Traditionally, the performance of a phenotype algorithm was considered dependent only on the cohort definition and the data source tested. However, it is becoming apparent that the vocabulary version also plays a crucial role.
To manage this better, we propose including additional metadata in the cohort definition itself:
Include Vocabulary Version in Cohort Metadata:
Implement metadata capture within the cohort JSON for the user to record the vocabulary version where the cohort definition was last updated and/or evaluated. This addition will help users track changes and understand the impact of vocabulary updates on their studies.
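As a rough illustration of how a recorded vocabulary version could be used, here is a minimal Python sketch. It assumes the cohort JSON gains a top-level `metadata` object (a hypothetical key, not part of the current Atlas schema) holding the proposed `vocabularyVersion` field. The function names and the version strings are placeholders invented for this example.

```python
import json


def record_vocabulary_version(cohort: dict, vocabulary_version: str) -> dict:
    """Return a copy of the cohort with the vocabulary version stored in its metadata.

    The "metadata" key is an assumption made for this sketch.
    """
    updated = dict(cohort)
    metadata = dict(updated.get("metadata", {}))
    metadata["vocabularyVersion"] = vocabulary_version
    updated["metadata"] = metadata
    return updated


def needs_reevaluation(cohort: dict, current_vocabulary_version: str) -> bool:
    """True when the cohort was last evaluated against a different (or unknown) vocabulary."""
    recorded = cohort.get("metadata", {}).get("vocabularyVersion")
    return recorded != current_vocabulary_version


# Placeholder cohort and version strings, for illustration only.
cohort = json.loads('{"name": "Example cohort"}')
stamped = record_vocabulary_version(cohort, "vocab-2023-A")

print(needs_reevaluation(stamped, "vocab-2023-A"))  # same version as recorded
print(needs_reevaluation(stamped, "vocab-2024-B"))  # vocabulary has moved on
```

A check like this would let a user see at a glance that a cohort definition was last evaluated against an older vocabulary and may behave differently now.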
Standardize Metadata Framework:
Develop a more generalized framework for metadata using name-value pairs in the JSON format. This should include:
Standard fields like vocabularyVersion, firstDevelopedDate, and lastUpdatedDate.
Extendable user-defined fields that can describe broader metadata aspects, such as:
Library cohort status (e.g., isLibraryCohort: true/false)
Peer review status (e.g., isPeerReviewed: true/false)
Approval status (e.g., isApproved: true/false)
Usage in specific studies (e.g., usedInStudy: Study A)
Descriptive text blobs providing additional context or notes.
Author(s) attribution
Add a global hash signature ID that uniquely identifies the cohort JSON across Atlas instances. This hash should update when changes are made to the core cohort definition logic.
Some of this metadata is already captured in public and private phenotype libraries. However, it is becoming an attribute of the cohort definition that is captured in the context of the library. If we extend these attributes to be part of the cohort JSON, then we can:
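One possible shape for the name-value metadata and the hash signature is sketched below in Python. Everything here is an assumption for discussion: the top-level `metadata` key, the `cohort_signature` helper, the choice of SHA-256 over a canonicalized JSON string, and the simplified cohort content are all illustrative, not an existing Atlas feature. The idea is that the signature covers only the definition logic, so editing metadata leaves it unchanged.

```python
import hashlib
import json

METADATA_KEY = "metadata"  # hypothetical key holding the name-value pairs


def cohort_signature(cohort: dict) -> str:
    """Hash the cohort JSON while ignoring the metadata block.

    Keys are sorted and whitespace is removed so the same logic produces
    the same signature regardless of key order or formatting.
    """
    core_logic = {key: value for key, value in cohort.items() if key != METADATA_KEY}
    canonical = json.dumps(core_logic, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Simplified stand-in for a cohort definition; all values are placeholders.
cohort = {
    "PrimaryCriteria": {"ObservationWindow": {"PriorDays": 0, "PostDays": 0}},
    METADATA_KEY: {
        "vocabularyVersion": "vocab-2023-A",
        "firstDevelopedDate": "2023-01-15",
        "lastUpdatedDate": "2023-09-01",
        "isLibraryCohort": True,
        "isPeerReviewed": False,
        "isApproved": False,
        "usedInStudy": "Study A",
        "authors": ["Author One"],
        "notes": "Free-text context about this cohort.",
    },
}

signature = cohort_signature(cohort)
print(signature)
```

With this design, changing `isApproved` or `usedInStudy` keeps the signature stable, while any change under `PrimaryCriteria` produces a new one. Whether the signature itself should be stored inside the metadata block is left open.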
Facilitate Metadata Transportability:
Ensure that this metadata is structured in a way that allows it to be easily transported with the cohort JSON across different systems and studies, enhancing reproducibility and transparency.
This structured approach to metadata management will not only improve the fidelity of cohort definitions in the face of vocabulary changes but also enhance the overall utility and governance of cohorts in Atlas.
This new metadata approach will make public and private libraries of cohort definitions easier to integrate. It also allows Atlas to take on a "librarian" role, curating definitions for reuse.
Note: a generalizable idea is that this "metadata" could replace other metadata-like elements in the cohort JSON, such as the description text box or tags.
Discussed this idea with @dimshitc, Azza Shoaibi