Map OMOP data to inpatient_microbiology table #29

Open
razekmh opened this issue Sep 26, 2024 · 6 comments

@razekmh

razekmh commented Sep 26, 2024

Extract data from the example OMOP data to fill the inpatient_microbiology table from the validate article. This is a split from #17

Please feel free to assign yourself to the issue. Please use the respective branch for development.

@razekmh razekmh added the design Design, scoping and ToR work label Sep 26, 2024
@AngharadGreen AngharadGreen self-assigned this Sep 27, 2024
@AngharadGreen

  • It has been difficult to map the simulated OMOP data I have been working with to the RAMSES inpatient_microbiology table.
  • There isn't a clear way of storing lab microbiology results in the OMOP CDM.
  • I have found this paper useful - https://pubmed.ncbi.nlm.nih.gov/35612079/
  • When we receive OMOP extract data from UCLH it will be useful to see how lab microbiology results are stored within the CDM.
  • I have been working on issue Map OMOP data to inpatient_episodes table  #28 instead

@zsenousy

zsenousy commented Oct 9, 2024

@AngharadGreen, Would you like to discuss this further in a call?

@AngharadGreen

Hi @zsenousy I have found some useful resources to help me with this mapping, I am just working through those today. I can meet tomorrow if there is a time that works for you?

@AngharadGreen

AngharadGreen commented Oct 9, 2024

Following this paper (https://pubmed.ncbi.nlm.nih.gov/35612079/), I will try to map the OMOP CDM to the Ramses inpatient_microbiology table, following this figure:
[image]

I have put together a summary presentation going over how the OMOP CDM can be mapped to the Ramses data frame:
Mapping of OMOP to RAMSES.pdf
[image]
[image]

@AngharadGreen

AngharadGreen commented Dec 12, 2024

I now want to just pull out the relevant columns from the Merged_df_1 table created in code from #45 that represent the Ramses Inpatient_microbiology table to see if I can use the OMOP data to create the data frames required by Ramses

| Ramses | OMOP |
| --- | --- |
| patient_id | person_id (Observation table) |
| specimen_id | specimen_id (Specimen table) |
| status | All entries are NA so will ignore |
| specimen_type_display | concept_name_specimen (Concept table) |
| specimen_datetime | specimen_datetime (Specimen table) |
| organism_display_name | concept_name_observation (Concept table) |
| isolate_id | observation_id (Observation table) |
| agent_display_name | Still need to look into this |
| rsi_code | Still need to look into this |
| isolation_datetime | observation_datetime (Observation table) |

## This follows on from the code in issue #45.
library(dplyr)  # provides %>% and rename()

## Keep only the columns needed for the Ramses inpatient_microbiology table
Merged_df_2 <- subset(Merged_df_1, select = c("person_id","specimen_id","specimen_type_display","specimen_datetime","organism_display_name","observation_id","observation_datetime"))

## Let's rename these columns to match the Ramses inpatient_microbiology table

Merged_df_3 <- Merged_df_2 %>%
  rename(patient_id = person_id,
         isolate_id = observation_id,
         isolation_datetime = observation_datetime)

## Let's add the columns status, agent_display_name and rsi_code and fill them with NA, as these have not been mapped to the OMOP data yet

Merged_df_3['status'] <- NA
Merged_df_3['agent_display_name'] <- NA
Merged_df_3['rsi_code'] <- NA

I will attempt to load this table into Ramses

library(Ramses)
library(ggplot2)
library(dplyr)

microbiology_data <- list()


> microbiology_data <- list()
> microbiology_data$raw <- Merged_df_3
> microbiology_data$raw <- microbiology_data$raw %>% 
+   mutate(
+     organism_code = AMR::as.mo(if_else(
+       organism_display_name == "No growth",
+       NA_character_,
+       organism_display_name)),
+     agent_code = AMR::as.ab(agent_display_name)) %>% 
+   mutate(organism_name = AMR::mo_name(organism_code),
+          agent_name = AMR::ab_name(agent_code))
i Microorganism translation was uncertain for "Anaerobic bacteria" (assumed
Anaerococcus lactolyticus) and "Mycobacterium, avium-intracellulare Group"
(assumed Mycobacterium avium-intracellulare complex). Run mo_uncertainties() to
review these uncertainties, or use add_custom_microorganisms() to add custom
entries.
i The following microorganisms were taxonomically renamed (use keep_synonyms =
TRUE to leave uncorrected):
  * Bacillus subtilis subtilis (Nakamura et al., 1999)  ->  Bacillus subtilis (Cohn, 1872)
  * Mycobacteroides abscessus (Gupta et al., 2018)  ->  Mycobacterium abscessus (Tortoli et al., 2016)
Error in `mutate()`:
ℹ In argument: `agent_code = AMR::as.ab(agent_display_name)`.
Caused by error:
! in AMR::as.ab(): argument x must be a text string, a (whole) number, or of
class 'factor', i.e. not be TRUE or FALSE
Run `rlang::last_trace()` to see where the error occurred.
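
The error above is consistent with agent_display_name being a logical column: assigning a bare NA, as in Merged_df_3['agent_display_name'] <- NA, creates a logical vector, and the message says AMR::as.ab() rejects TRUE/FALSE input. Below is a minimal base R sketch of the difference, using a hypothetical toy data frame rather than the real Merged_df_3. Whether AMR::as.ab() then accepts the all-missing character column has not been checked here, so treat that part as an assumption.

```r
# Toy stand-in for Merged_df_3 (hypothetical values, for illustration only)
toy_df <- data.frame(patient_id = c(1L, 2L))

# A bare NA is a logical value, so the new column is logical
toy_df["agent_display_name"] <- NA
class_before <- class(toy_df$agent_display_name)

# NA_character_ is a missing value of type character, so the column
# becomes character, which is the type the error message asks for
toy_df$agent_display_name <- NA_character_
toy_df$rsi_code <- NA_character_
class_after <- class(toy_df$agent_display_name)

print(c(before = class_before, after = class_after))
```

If this is the cause, filling the unmapped columns with NA_character_ instead of NA may be enough to get past this step.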


> microbiology_data$raw <- microbiology_data$raw %>% 
+   mutate(specimen_type_code = case_when(
+     specimen_type_display == "Blood Culture" ~ 
+       "446131002", # Blood specimen obtained for blood culture
+     specimen_type_display == "Faeces" ~ 
+       "119339001", # Stool specimen
+     specimen_type_display == "MRSA Screen" ~ 
+       "697989009", # Anterior nares swab 
+     specimen_type_display == "Urine" ~ 
+       "122575003", # Urine specimen
+     TRUE ~ NA_character_
+   )) %>% 
+   left_join(transmute(reference_specimen_type,
+                       specimen_type_code = conceptId,
+                       specimen_type_name = pt_term))
Joining with `by = join_by(specimen_type_code)`

## Let's try this again without the agent_name step, and use SNOMED_microorganism_code (which contains the relevant SNOMED codes) to get organism_name with the AMR package
## Note: this requires SNOMED_microorganism_code to be kept when subsetting Merged_df_1; it is not in the column list used to build Merged_df_2 above
microbiology_data$raw <- Merged_df_3
microbiology_data$raw <- microbiology_data$raw %>% 
  mutate(organism_name = AMR::mo_name(SNOMED_microorganism_code))

microbiology_data$raw <- microbiology_data$raw %>% 
  mutate(specimen_type_code = case_when(
    specimen_type_display == "Blood Culture" ~ 
      "446131002", # Blood specimen obtained for blood culture
    specimen_type_display == "Faeces" ~ 
      "119339001", # Stool specimen
    specimen_type_display == "MRSA Screen" ~ 
      "697989009", # Anterior nares swab 
    specimen_type_display == "Urine" ~ 
      "122575003", # Urine specimen
    TRUE ~ NA_character_
  )) %>% 
  left_join(transmute(reference_specimen_type,
                      specimen_type_code = conceptId,
                      specimen_type_name = pt_term))


microbiology_data$specimens <- microbiology_data$raw %>% 
  transmute(specimen_id,
            patient_id,
            status = "available",
            specimen_datetime,
            specimen_type_code,
            specimen_type_name,
            specimen_type_display) %>% 
  distinct() # Removing duplicates created by multiple isolates and susceptibility testing

microbiology_data$isolates <- microbiology_data$raw %>% 
  transmute(isolate_id,
            specimen_id,
            patient_id,
            organism_code = SNOMED_microorganism_code,
            organism_name,
            organism_display_name,
            isolation_datetime) %>% 
  distinct() # Removing duplicates created by susceptibility testing

The two data frames specimens = microbiology_data$specimens and isolates = microbiology_data$isolates have now been created from the OMOP mapping
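
One thing that may be worth checking before loading these: distinct() only drops rows that are identical in every column, so a specimen_id or isolate_id can still appear more than once if any other column differs between rows. The sketch below shows one way to list such repeated keys. The helper name duplicated_keys and the toy data are hypothetical, and the assumption that Ramses expects one row per specimen_id and per isolate_id has not been confirmed against the package.

```r
# Hypothetical helper: return the key values that occur more than once
duplicated_keys <- function(df, key) {
  values <- df[[key]]
  unique(values[duplicated(values)])
}

# Toy data standing in for microbiology_data$specimens and $isolates
toy_specimens <- data.frame(
  specimen_id = c("s1", "s2", "s2"),
  specimen_type_display = c("Urine", "Blood Culture", "Faeces"),
  stringsAsFactors = FALSE
)
toy_isolates <- data.frame(
  isolate_id = c("i1", "i2", "i3"),
  specimen_id = c("s1", "s2", "s2"),
  stringsAsFactors = FALSE
)

repeated_specimens <- duplicated_keys(toy_specimens, "specimen_id")
repeated_isolates <- duplicated_keys(toy_isolates, "isolate_id")

print(repeated_specimens)
print(length(repeated_isolates))
```

On the real data this would be called as duplicated_keys(microbiology_data$specimens, "specimen_id") and duplicated_keys(microbiology_data$isolates, "isolate_id").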

I now need to find the relevant data to create the data frame susceptibilities = microbiology_data$susceptibilities.

Information on RSI code:

S - Susceptible, standard dosing regimen: A microorganism is categorised as "Susceptible, standard dosing regimen", when there is a high likelihood of therapeutic success using a standard dosing regimen of the agent.
I - Susceptible, increased exposure*: A microorganism is categorised as "Susceptible, Increased exposure*" when there is a high likelihood of therapeutic success because exposure to the agent is increased by adjusting the dosing regimen or by its concentration at the site of infection.
R - Resistant: A microorganism is categorised as "Resistant" when there is a high likelihood of therapeutic failure even when there is increased exposure.
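
Once a source for rsi_code is found in the OMOP extract, the values will presumably need to be restricted to these three codes. A minimal base R sketch of such a check follows. The helper name check_rsi_codes and the toy values are hypothetical, and treating missing values as acceptable is an assumption rather than a documented Ramses rule.

```r
# The three categories described above
valid_rsi_codes <- c("S", "I", "R")

# Hypothetical helper: TRUE where a value is missing or one of S/I/R
check_rsi_codes <- function(x) {
  is.na(x) | x %in% valid_rsi_codes
}

# Toy values, including two that would need cleaning before use
toy_rsi <- c("S", "R", "I", NA, "s", "Resistant")

ok <- check_rsi_codes(toy_rsi)
bad_values <- unique(toy_rsi[!ok])

print(bad_values)
```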

@AngharadGreen

  • We now have a data frame from OMOP where we have mapped 7 of the 9 relevant columns for the Ramses inpatient_microbiology table.
  • We now need to find the relevant information in OMOP that relates to agent_display_name and rsi_code.
## I now want to identify where I can find information about agent display name and RSI code within the OMOP extract

## I will have a look at the drug_exposure table and identify how we can pull information about the drugs prescribed and map it to the information we have in the inpatient_microbiology table

## Let's read in the drug_exposure table
library(arrow)  # read_parquet() is assumed to come from the arrow package
df_drug_exposure <- read_parquet("~/Ramses_extract_Nov2024/DRUG_EXPOSURE.parquet")

## I have tried to merge the Inpatient_microbiology_OMOP table with the drug_exposure table via the person_id, but it keeps causing R to crash
Merge_6 <- merge(x=Inpatient_Microbiology_OMOP_2 , y=df_drug_exposure, by.x = "person_id.x", by.y = "person_id")

## I will look for a way to keep only the rows of the drug_exposure table whose person_id is present in the inpatient_microbiology table
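
One possible way to do this filtering, sketched in base R on hypothetical toy data: collect the distinct person IDs from the microbiology table, subset drug_exposure to those IDs, and only then merge. The column names person_id.x and person_id follow the merge call above, and drug_concept_id is a standard OMOP drug_exposure column, but all values here are made up. Note that the merge is still many-to-many per person, so the result can have more rows than either input.

```r
# Toy stand-ins for Inpatient_Microbiology_OMOP_2 and df_drug_exposure
toy_micro <- data.frame(
  person_id.x = c(101L, 102L, 102L),
  specimen_id = c("s1", "s2", "s3"),
  stringsAsFactors = FALSE
)
toy_drug_exposure <- data.frame(
  person_id = c(101L, 103L, 102L, 104L, 101L),
  drug_concept_id = c(1L, 2L, 3L, 4L, 5L)
)

# Keep only drug exposures for people who appear in the microbiology table
micro_person_ids <- unique(toy_micro$person_id.x)
drug_exposure_subset <-
  toy_drug_exposure[toy_drug_exposure$person_id %in% micro_person_ids, ]

# Merge the (much smaller) subset instead of the full table
merged <- merge(
  x = toy_micro, y = drug_exposure_subset,
  by.x = "person_id.x", by.y = "person_id"
)

print(nrow(drug_exposure_subset))
print(nrow(merged))
```

With dplyr already loaded, semi_join(df_drug_exposure, Inpatient_Microbiology_OMOP_2, by = c("person_id" = "person_id.x")) should give the same subset. Selecting only the drug_exposure columns that are actually needed before merging would reduce memory use further.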
