Merge pull request #1247 from CaseyTa/master
Add infores identifier for Columbia Clinical Data Warehouse
sierra-moxon authored Mar 24, 2023
2 parents 3886ae3 + 79dafeb commit a9fdc42
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions infores_catalog_nodes.tsv
@@ -99,6 +99,7 @@ released Clinical Profiles infores:clinical-profiles https://model.clinicalprofi
released ClinicalTrials.gov infores:clinicaltrials https://clinicaltrials.gov KP Information Resource
released ClinVar infores:clinvar https://www.ncbi.nlm.nih.gov/clinvar/ KP ClinVar is a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. Information Resource
released Connectivity Map infores:cmap https://clue.io/cmap CMAP KP genome-scale library of cellular signatures that catalogs transcriptional responses to chemical, genetic, and disease perturbation Information Resource
+ Columbia Clinical Data Warehouse for Health Patient EHR Data infores:columbia-cdw-ehr-data https://www.irvinginstitute.columbia.edu/services/clinical-data-warehouse-cdw-navigator-support KP The Columbia Clinical Data Warehouse (CDW) contains clinical information for over 4.5 million individuals treated at Columbia University Irving Medical Center (CUIMC) since the 1980s. Information Resource
released Columbia Open Health Data (COHD) infores:cohd https://cohd.io/about.html KP ['Clinical Data Provider'] The Columbia Open Health Data (COHD) API provides access to counts and frequencies (i.e., EHR prevalence) of conditions, procedures, drug exposures, and patient demographics, and the co-occurrence frequencies between them. Count and frequency data were derived from the [Columbia University Medical Center's](http://www.cumc.columbia.edu/) [OHDSI](https://www.ohdsi.org/) database including inpatient and outpatient data. Counts are the number of patients associated with the concept, e.g., diagnosed with a condition, exposed to a drug, or who had a procedure. Frequencies are the number of unique patients associated with the concept divided by the total number of patients in the dataset, i.e., prevalence in the electronic health records. To protect patient privacy, all concepts and pairs of concepts where the count <= 10 were excluded, and counts were randomized by the Poisson distribution. Four datasets are available: 1) 5-year non-hierarchical dataset: Includes clinical data from 2013-2017 2) lifetime non-hierarchical dataset: Includes clinical data from all dates 3) 5-year hierarchical dataset: Counts for each concept include patients from descendant concepts. Includes clinical data from 2013-2017. 4) BETA! Temporal co-occurrence data In the 5-year hierarchical data set, the counts for each concept include the patients from all descendant concepts. For example, the count for ibuprofen (ID 1177480) includes patients with Ibuprofen 600 MG Oral Tablet (ID 19019073 patients), Ibuprofen 400 MG Oral Tablet (ID 19019072), Ibuprofen 20 MG/ML Oral Suspension (ID 19019050), etc. While the lifetime dataset captures a larger patient population and range of concepts, the 5-year dataset has better underlying data consistency. 
Clinical concepts (e.g., conditions, procedures, drugs) are coded by their standard concept ID in the [OMOP Common Data Model](https://github.com/OHDSI/CommonDataModel/wiki). API methods are provided to map to/from other vocabularies supported in OMOP and other ontologies using the EMBL-EBI Ontology Xref Service (OxO). The following resources are available through this API: 1. Metadata: Metadata on the COHD database, including dataset descriptions, number of concepts, etc. 2. OMOP: Access to the common vocabulary for name and concept identifier mapping 3. Clinical Frequencies: Access to the counts and frequencies of conditions, procedures, and drug exposures, and the associations between them. Frequency was determined as the number of patients with the code(s) / total number of patients. 4. Concept Associations: Inferred associations between concepts using chi-square analysis, ratio between observed to expected frequency, and relative frequency. A [Python notebook](https://github.com/WengLab-InformaticsResearch/cohd_api/blob/master/notebooks/COHD_API_Example.ipynb) demonstrates simple examples of how to use the COHD API. COHD was developed at the [Columbia University Department of Biomedical Informatics](https://www.dbmi.columbia.edu/) as a collaboration between the [Weng Lab](http://people.dbmi.columbia.edu/~chw7007/), [Tatonetti Lab](http://tatonettilab.org/), and the [NCATS Biomedical Data Translator](https://ncats.nih.gov/translator) program (Red Team). This work was supported in part by grants: NCATS OT3TR002027, NLM R01LM009886-08A1, and NIGMS R01GM107145. The following external resources may be useful: [OHDSI](https://www.ohdsi.org/) [OMOP Common Data Model](https://github.com/OHDSI/CommonDataModel/wiki) [Athena](http://athena.ohdsi.org) (OMOP vocabularies, search, concept relationships, concept hierarchy) [Atlas](http://www.ohdsi.org/web/atlas/) (OMOP vocabularies, search, concept relationships, concept hierarchy, concept sets) Information Resource
released Columbia Open Health Data (COHD) for COVID-19 Research infores:cohd-covid https://research.columbia.edu/covid/devices/openhealth KP ['Clinical Data Provider'] The Columbia Open Health Data (COHD) for COVID-19 Research API provides access to counts and frequencies (i.e., EHR visit prevalence) of conditions, procedures, drug exposures, and the co-occurrence frequencies between them for a cohort of hospitalized COVID-19 patients and two comparator cohorts of hospitalized influenza patients and hospitalized patients. Count and frequency data were derived from the [Columbia University Medical Center's](http://www.cumc.columbia.edu/) [OHDSI](https://www.ohdsi.org/) database including inpatient. Counts are the number of inpatient visits associated with the concept, e.g., diagnosed with a condition, exposed to a drug, or a procedure was performed. Frequencies are the number of unique visits associated with the concept divided by the total number of visits in the dataset, i.e., prevalence in the electronic health records. To protect patient privacy, all concepts and pairs of concepts where the count <= 10 were excluded, and counts were randomized by the Poisson distribution. Datasets from three primary cohorts are available: 1) COVID-19: Hospitalized patients aged 18 or older with a COVID-19 related condition diagnosis and/or a confirmed positive COVID-19 test during their hospitalization period or within the prior 21 days. Date range: March 1, 2020 to September 1, 2020. This cohort is also further stratified by sex (male and female) and age (adult: 18-64, senior: 65+). 2) General inpatient: All hospitalized patients aged 18 or older. Date range: January 1, 2014 to December 31, 2019. 3) Influenza: Hospitalized patients aged 18 or older who had at least one occurrence of influenza conditions or pre-coordinated positive measurements or positive influenza testing in the prior 21 days or during their hospitalization period. 
Date range: January 1, 2014 to December 31, 2019. Both hierarchical and non-hierarchical datasets are available for each cohort. In the hierarchical datasets, the counts for each concept include the visits from all descendant concepts. For example, the count for ibuprofen (ID 1177480) includes visits with Ibuprofen 600 MG Oral Tablet (ID 19019073), Ibuprofen 400 MG Oral Tablet (ID 19019072), Ibuprofen 20 MG/ML Oral Suspension (ID 19019050), etc. Clinical concepts (e.g., conditions, procedures, drugs) are coded by their standard concept ID in the [OMOP Common Data Model](https://github.com/OHDSI/CommonDataModel/wiki). API methods are provided to map to/from other vocabularies supported in OMOP and other ontologies using the EMBL-EBI Ontology Xref Service (OxO). The following resources are available through this API: 1. Metadata: Metadata on the COHD database, including dataset descriptions, number of concepts, etc. 2. OMOP: Access to the common vocabulary for name and concept identifier mapping 3. Clinical Frequencies: Access to the counts and frequencies of conditions, procedures, and drug exposures, and the associations between them. Frequency was determined as the number of visits with the code(s) / total number of visits. 4. Concept Associations: Inferred associations between concepts using chi-square analysis, ratio between observed to expected frequency, and relative frequency. A [Python notebook](https://github.com/WengLab-InformaticsResearch/cohd_api/blob/master/notebooks/COHD_API_Example.ipynb) demonstrates simple examples of how to use the COHD API. COHD was developed at the [Columbia University Department of Biomedical Informatics](https://www.dbmi.columbia.edu/) as a collaboration between the [Weng Lab](http://people.dbmi.columbia.edu/~chw7007/), [Tatonetti Lab](http://tatonettilab.org/), and the [NCATS Biomedical Data Translator](https://ncats.nih.gov/translator) program (TReK Team). 
This work was supported in part by grants: NCATS 1OT2TR003434, NLM R01LM012895, NCATS OT3TR002027, NLM R01LM009886-08A1, and NIGMS R01GM107145. The following external resources may be useful: [OHDSI](https://www.ohdsi.org/) [OMOP Common Data Model](https://github.com/OHDSI/CommonDataModel/wiki) [Athena](http://athena.ohdsi.org) (OMOP vocabularies, search, concept relationships, concept hierarchy) [Atlas](http://www.ohdsi.org/web/atlas/) (OMOP vocabularies, search, concept relationships, concept hierarchy, concept sets) [NCATS Biomedical Data Translator](https://sites.google.com/ncats.nih.gov/translator-io/home) Information Resource
released Columbia Open Health Data for COVID-19 Research API infores:cohd-covid19-api https://covid.cohd.io/api COHD COVID-19 KP Information Resource
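
The COHD description in the context rows above defines a frequency as the number of unique patients associated with a concept divided by the total number of patients, states that hierarchical counts include patients from all descendant concepts, and says that concepts with a count <= 10 are excluded. A minimal sketch of that arithmetic follows. It is not the COHD implementation: the function names, the `children` mapping, and the input layout are hypothetical, and the Poisson randomization of counts mentioned in the description is deliberately left out.

```python
from typing import Dict, Set, Tuple

# Counts <= 10 are excluded, per the COHD description in the diff context.
SUPPRESSION_THRESHOLD = 10


def descendants(concept: int, children: Dict[int, Set[int]]) -> Set[int]:
    """Return the concept itself plus all of its descendant concepts."""
    seen: Set[int] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def hierarchical_frequencies(
    patients_by_concept: Dict[int, Set[int]],
    children: Dict[int, Set[int]],
    total_patients: int,
) -> Dict[int, Tuple[int, float]]:
    """Map each concept to (count, frequency), rolling up descendant patients.

    A patient is counted once per concept even if they appear under several
    descendants. Concepts whose count is <= SUPPRESSION_THRESHOLD are dropped.
    """
    results: Dict[int, Tuple[int, float]] = {}
    for concept in set(patients_by_concept) | set(children):
        patients: Set[int] = set()
        for member in descendants(concept, children):
            patients |= patients_by_concept.get(member, set())
        count = len(patients)
        if count > SUPPRESSION_THRESHOLD:
            results[concept] = (count, count / total_patients)
    return results
```

With the ibuprofen concept IDs quoted in the description (1177480 as the parent of 19019073, 19019072, and 19019050) and made-up patient IDs, the parent concept can clear the threshold through its descendants even when each descendant on its own is suppressed.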