[Tufts] Use case driven validation of GIS Vocabulary #374

Open
Tracked by #376
kzollove opened this issue Jan 24, 2025 · 1 comment

@kzollove
Collaborator

Using a simple cohort definition, are we able to adequately leverage the OMOP GIS Vocabulary?

(What sort of cohort definition can we use? One that leverages both structured data and ingested GIS data, or something simpler?)

(What sort of analysis will we need to assess?)

@p-talapova
Collaborator

p-talapova commented Feb 7, 2025

A simple cohort definition typically consists of individuals meeting specified inclusion criteria, such as a particular diagnosis, procedure, or medication exposure, within a defined time frame. To adequately leverage the OMOP GIS Vocabulary, we need a somewhat more complex cohort definition that integrates GIS-derived attributes, such as:
- Environmental exposures (e.g., air pollution levels, climate factors)
- Socioeconomic indicators (e.g., census tract-level income, education)
- Geospatial accessibility (e.g., distance to healthcare facilities, urban vs. rural classification)

Ideas for cohorts:

  1. Patients diagnosed with asthma within the past five years, residing in high-pollution areas (GIS-exposure data on PM2.5 concentration).
  2. Patients admitted to the ICU + diagnosed with ARDS or acute exacerbation of COPD + GIS-derived air quality index (AQI) exposure mapped to patient residential locations.
  3. Patients with multi-organ failure requiring ICU-level care + GIS-linked social vulnerability index (SVI) and income data + time-to-ICU-admission as the primary variable (calculated using timestamps from the visit_occurrence table).
  4. Patients diagnosed with chronic kidney disease with long-term exposure to lead-contaminated water supplies based on GIS-derived water quality indices.
  5. Individuals with neurodegenerative diseases (e.g., Parkinson’s disease) who reside in regions with high exposure to pesticides.
  6. Patients prescribed opioids for chronic pain + sub-cohort of patients with a history of opioid overdose or those receiving naloxone treatment + GIS-linked data on pharmacy density, opioid prescription rates, and socioeconomic risk factors (e.g., unemployment, income level).
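
To make cohort idea 1 more concrete, here is a minimal sketch of the selection logic in plain Python. It is only an illustration under several assumptions: the rows are made up and merely shaped like `condition_occurrence`, `pm25_by_person` is a hypothetical lookup of GIS-derived exposure per person, the concept ID for asthma should be verified against the vocabulary, and the PM2.5 cutoff is arbitrary. In practice this would more likely be expressed in ATLAS or SQL.

```python
from datetime import date, timedelta

# Assumption: 317009 is the standard OMOP concept_id for asthma; verify in your vocabulary.
ASTHMA_CONCEPT_ID = 317009
# Illustrative cutoff in ug/m3, not a recommended threshold.
PM25_THRESHOLD = 12.0

# Hypothetical rows shaped like condition_occurrence.
condition_rows = [
    {"person_id": 1, "condition_concept_id": 317009, "condition_start_date": date(2023, 3, 1)},
    {"person_id": 2, "condition_concept_id": 317009, "condition_start_date": date(2015, 6, 9)},
    {"person_id": 3, "condition_concept_id": 317009, "condition_start_date": date(2022, 11, 20)},
    {"person_id": 4, "condition_concept_id": 999999, "condition_start_date": date(2024, 1, 5)},
]

# Hypothetical annual mean PM2.5 at each person's residence (GIS-derived).
pm25_by_person = {1: 14.2, 2: 16.0, 3: 7.5, 4: 15.1}


def asthma_high_pm25_cohort(conditions, exposure, index_date, lookback_years=5):
    """Return sorted person_ids with an asthma record inside the lookback
    window and a residential PM2.5 value above the cutoff."""
    window_start = index_date - timedelta(days=round(365.25 * lookback_years))
    cohort = set()
    for row in conditions:
        if row["condition_concept_id"] != ASTHMA_CONCEPT_ID:
            continue
        if not (window_start <= row["condition_start_date"] <= index_date):
            continue
        # Persons without exposure data are treated as unexposed and excluded.
        if exposure.get(row["person_id"], 0.0) > PM25_THRESHOLD:
            cohort.add(row["person_id"])
    return sorted(cohort)


cohort = asthma_high_pm25_cohort(condition_rows, pm25_by_person, date(2025, 2, 7))
print(cohort)  # prints [1]
```

Person 2 is excluded because the diagnosis falls outside the five-year window, person 3 because the exposure is below the cutoff, and person 4 because the condition is not asthma.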

What sort of analysis will we need to assess?

To properly analyze the impact of GIS variables on patient outcomes, we will need to conduct a combination of cohort creation, baseline characterization, comparative analyses, and advanced statistical modeling. Work in ATLAS will focus on defining study populations, generating summary statistics, and performing initial comparisons, while external tools such as SQL, R, Python, and GIS platforms will be required for spatial mapping, epidemiological modeling, temporal assessments, and causal inference analyses.

To achieve a comprehensive understanding of GIS-related health effects, we will need to incorporate multiple analytical approaches, each addressing a different aspect of the relationship between environmental exposures and patient outcomes. These analyses will help identify spatial patterns of disease, quantify risk associations, evaluate long-term exposure effects, and establish causal relationships. Specifically, we will need:

  1. Spatial analysis:
    - Mapping patient distributions based on their residential locations and overlaying them with GIS-derived exposure datasets, such as pollution levels or temperature variations.
    - Creating heatmaps and conducting clustering analysis to identify geospatial patterns in disease prevalence and comorbidities.
    - Performing spatial accessibility analysis to determine proximity to healthcare resources, using GIS-based distance calculations and service area analyses.
  2. Epidemiological and statistical analysis:
    - Conducting comparative risk assessments by comparing disease prevalence in exposed versus unexposed regions to measure potential associations between environmental and social determinants and health outcomes.
    - Utilizing geospatial regression models to evaluate the effect of GIS-derived variables while adjusting for confounders like age, sex, and socioeconomic factors.
    - Applying propensity score matching to balance geographic and demographic characteristics between exposed and control groups.
  3. Temporal and longitudinal analysis:
    - Assessing exposure windows, considering that many GIS variables (e.g., pollution, climate, socioeconomic conditions) fluctuate over time. A time-series approach allows evaluation of how chronic exposure affects health outcomes.
    - Performing cohort survival analysis using Kaplan-Meier curves or a Cox proportional hazards model to investigate whether GIS-related variables influence disease progression or survival.
  4. Causal inference and public health impact:
    - Implementing instrumental variable analysis to address confounding and assess causal relationships between environmental exposures and health outcomes.
    - Assessing health disparities to identify whether there are systematic inequities in health outcomes based on spatial or environmental variables.
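
Two of the simpler pieces above, the distance calculation behind spatial accessibility and the exposed-versus-unexposed prevalence comparison, can be sketched in a few lines of Python. This is a rough illustration rather than a proposed implementation: the coordinates and counts are invented, the function names are hypothetical, and great-circle distance is only a stand-in for the road-network or service-area measures a GIS platform would provide.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_facility_km(residence, facilities):
    """Straight-line distance from a residence to the closest facility."""
    return min(haversine_km(*residence, *facility) for facility in facilities)


def prevalence_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Crude (unadjusted) ratio of disease prevalence, exposed vs. unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)


# Hypothetical (lat, lon) pairs for one residence and two facilities.
residence = (42.4075, -71.1190)
facilities = [(42.3496, -71.0636), (42.4184, -71.1062)]
print(round(nearest_facility_km(residence, facilities), 2))

# Hypothetical counts: 40 of 200 exposed vs. 20 of 200 unexposed have the condition.
print(prevalence_ratio(40, 200, 20, 200))  # prints 2.0
```

The ratio here is crude; adjusting for confounders such as age, sex, and socioeconomic factors would require the regression or matching approaches listed under item 2.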
