Use this ETL as a way to provide MIMIC in OMOP directly on the Physionet website #52
Comments
Great idea! |
Thank you for this great work! |
Thanks for the suggestion @vojtechhuser. The mapping needs some work, but sharing the transformed dataset is something that we'd like to do once we're happy with it. We haven't been able to give this project the time it needs just because of competing priorities (research tasks, rebuilding PhysioNet, preparing the next release of MIMIC, etc), but it's on our to-do list. |
Any updates on this? We would like to use MIMIC-III in a Data Quality tutorial at the OHDSI symposium and desperately need someone who has run the code from this repo and can collaborate with us. |
As soon as we publish something on PhysioNet we have to be able to support it and the ETL isn't ready. We are currently building ETLs for other ICU datasets so that our model doesn't overfit to MIMIC. If by data quality you mean running Achilles, then I have done that, but the results aren't that useful on MIMIC because of the unique data structure and deidentification approach (e.g. deidentified ages ~ 300). |
@alistairewj the use @vojtechhuser is referring to is for a tutorial on how to use Achilles and two other data quality tool sets designed for use with OMOP data sources. The version of MIMIC we use doesn't need to be free of defects. It just needs to be usable - i.e. it won't break the tools because there are empty or missing tables or missing required variables. To the extent that it will resemble a real world data set with typical data quality issues that the tools can identify, it will meet our needs. Before I spend the effort to get this to run, can you give your sense of how likely it is to meet those needs? |
MIMIC is a real world dataset, from a real hospital, but I don't know if I can fully answer your question without knowing the ins and outs of the tools you'll use. The ETL is incomplete; there are still a lot of unmapped concepts. I ran Achilles a few months ago and the output is hopefully informative for you (see below). You'll notice that there are a lot of reported "errors" around times/dates due to our deidentification approach (we randomly shift patient data into the future, therefore doing any analysis which aggregates distinct patients over time is flawed).
|
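To illustrate the date-shift point above, here is a minimal Python sketch. It is not the actual MIMIC deidentification code: the function name `shift_patient_dates`, the offset range, and the example dates are all assumptions made up for illustration. It shows that one random offset per patient preserves intervals within a patient (e.g. length of stay) while scrambling the calendar relationship between different patients, which is why tools that aggregate across patients over time report date "errors".

```python
import random
from datetime import date, timedelta


def shift_patient_dates(events, rng, min_days=365 * 100, max_days=365 * 200):
    """Shift all dates of ONE patient by a single random offset.

    The offset range (roughly 100 to 200 years) is a hypothetical choice
    for this sketch, not the documented MIMIC parameter.
    """
    offset = timedelta(days=rng.randint(min_days, max_days))
    return [d + offset for d in events]


rng = random.Random(0)

# Hypothetical (admission, discharge) dates for two distinct patients.
patient_a = [date(2010, 1, 1), date(2010, 1, 5)]
patient_b = [date(2012, 6, 1), date(2012, 6, 3)]

shifted_a = shift_patient_dates(patient_a, rng)
shifted_b = shift_patient_dates(patient_b, rng)

# Within a patient, intervals survive the shift.
print("Stay A:", patient_a[1] - patient_a[0], "->", shifted_a[1] - shifted_a[0])
print("Stay B:", patient_b[1] - patient_b[0], "->", shifted_b[1] - shifted_b[0])

# Across patients, the gap between admissions is no longer meaningful,
# because each patient received an independent offset.
print("Gap A->B before:", patient_b[0] - patient_a[0])
print("Gap A->B after: ", shifted_b[0] - shifted_a[0])
```

Any per-month or per-year count computed over the shifted dates mixes patients whose true admission dates may be years apart, so such aggregates should not be read as real trends.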
@alistairewj this is helpful. Thanks. |
Any updates on sharing a complete version of MIMIC in OMOP on PhysioNet? Especially now in COVID-19 times, I would very much like to work with a proper CDM at home, as I can't access my organisation's CDM. Thank you, Tom |
We would be happy to share an OMOP version of MIMIC-III on PhysioNet. See also MIT-LCP/mimic-code#725. I suggest that someone from the OMOP community takes responsibility for putting together a submission to PhysioNet. The person should:
Once we receive a well described version of the dataset, we can move forward with publication. For instructions on submitting the project, see: https://physionet.org/about/publish/#sharing |
That is great. I will work on a revised proposal that I am happy to revise multiple times until I hit all your requirements to the satisfaction of the PhysioNet reviewing team. (tagging @parisni ) |
Hi all. Good news. I would be pleased to give some help to make this possible. |
Today - I started a draft. I will add @parisni and other important people. |
@vojtechhuser those access settings are correct. Not sure about "OMOP shaped data" as the title of the dataset, but presumably this is a placeholder! |
I would like an invite! I would love to be able to skip ETLing the data and get it in the OMOP format from the source. |
If published as a credentialed project then it would be accessible to MIMIC users. The invite mentioned is for the authors of the project, i.e. those who helped create the ETL. |
Yes, I think separate projects for each dataset is best. One of the benefits is that the MIMIC demo is open access (https://physionet.org/content/mimiciii-demo/1.4/), so the same permissions could be applied to the OMOP version. |
Excellent point Tom. |
I'm seeing whether the N3C project can support some of this work: pay for some of people's time and get more hands on deck. Who has a guess at the amount of work involved? |
Folks leading that seem to have some leeway with unspecified cash allocations to fund it - it being the National Covid Cohort Collaborative (N3C) - and indicate potential interest in supporting this. So I'm eager to respond to their question about the amount of work. I'd take a guess myself but I'm the least fit amongst this group to do so. |
Interesting, thanks Andrew. @parisni @alistairewj @aparrot89 any thoughts on whether we should be putting in additional work to improve the mapping before the dataset is shared? |
Hi, I am interested in being part of this project and am already a registered user of PhysioNet. |
Formal funding would be great. See notes in this shared folder: https://drive.google.com/open?id=1j-x-rwuYJr2nIs5zxCW6ST_Q-vPc1tfN For folks willing to help, please put your name next to a table that you volunteer to tackle (port to GBQ or improve) |
I propose a plan where multiple versions are released. We need initial versions to make people aware of it, e.g. v0.1 with some tables. After that, v1.0 can use the existing mapping and v2.0 can have an improved mapping. Perfect should not be the enemy of the good enough. |
I can't say I agree with releasing an incomplete dataset on PhysioNet and justifying the lack of completeness with a "v0.1" tag. |
The Google link permission was fixed. You can sign up for individual tables again here: |
@vojtechhuser, I'd be happy to help join this effort! I put myself on the measurement table. |
The project description is now also in Central Notes. At this link (pick file central notes) |
The project is from now on called Argos. This OHDSI forum thread is used for major updates: https://forums.ohdsi.org/t/argos-project-2020-omoped-mimic-project/10926 Technical items will still be posted here. |
What is the need for the codename? MIMIC-OMOP seems clearer. |
This ETL lets local users download the data and convert it themselves, so the conversion is repeated at many sites.
How about converting once and letting sites download the converted dataset?
This would save MIMIC users some effort and make MIMIC more widely used (and published about, so the maintainers get credit).