I think we are at risk of designing something that only handles our legacy Cerner dataset here, which is likely to cause problems down the line.
This is somewhat related to #205. What would be more reliable long term is a set of rules for 'what do we do when dates are missing in way X', and it would be desirable to do this in a holistic way that scales rather than targeting a specific field.
So if I could get a requirements writeup, on a per-resource basis:
Which dates are substitutable in the context of a resource? (example: if period is missing in encounter, can/should we substitute participant.period?)
For periods, do we always attempt to populate a null value in start/end if the other is missing?
If multiple date types are allowed in a field (think lab observations), should we attempt to be clever about inspecting all of them?
With this we can at least document expectations, which would help to head off certain kinds of questions (e.g. quality metrics failed on a bunch of dates, but I still see the dates in the core tables, what gives?)
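One way such a writeup could be captured is as a declarative fallback table rather than per-field special cases. The sketch below is only an illustration: the `FALLBACKS` entries, the flattened column names, and the `resolve_date` helper are all hypothetical, and the actual substitution rules are exactly what the requirements writeup would need to decide.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical rule table: for each (resource, field), the ordered list of
# other fields to try when the field itself is null. These entries are
# placeholders for discussion, not agreed requirements.
FALLBACKS: Dict[Tuple[str, str], List[str]] = {
    ("Encounter", "period_start"): ["participant_period_start", "period_end"],
    ("Encounter", "period_end"): [],
}


def resolve_date(
    resource: str, field: str, row: Dict[str, Optional[str]]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (value, source_field) for the first populated candidate.

    source_field records where the value came from, so a substituted date
    can be told apart from one that was present in the source data.
    """
    candidates = [field] + FALLBACKS.get((resource, field), [])
    for candidate in candidates:
        value = row.get(candidate)
        if value:
            return value, candidate
    return None, None


row = {"period_start": None, "period_end": "2020-01-02"}
print(resolve_date("Encounter", "period_start", row))
```

Returning the source field alongside the value is one way to keep quality metrics and core tables consistent: a metric can count substituted dates separately instead of disagreeing with what the table shows.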
While we're here, it might also be worth talking about our date bloat. We might be able to slim down some tables by including only the highest-resolution field and having helpers for getting dates of a certain type.
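As a rough illustration of the helper idea, a single stored date could be truncated on demand instead of storing one column per resolution (such as `period_start_day`). The `truncate_date` function and its unit names are assumptions made for this sketch, not an existing API in the project.

```python
from datetime import date, timedelta


def truncate_date(d: date, unit: str) -> date:
    """Truncate a date to the start of its day, week, month, or year.

    Hypothetical helper: weeks are assumed to start on Monday.
    """
    if unit == "day":
        return d
    if unit == "week":
        # date.weekday() is 0 for Monday, so this steps back to Monday.
        return d - timedelta(days=d.weekday())
    if unit == "month":
        return d.replace(day=1)
    if unit == "year":
        return d.replace(month=1, day=1)
    raise ValueError(f"unsupported unit: {unit}")


print(truncate_date(date(2024, 3, 14), "month"))
```

In the SQL layer the same effect would come from whatever date truncation function the query engine provides; the point is only that coarser resolutions are derivable from the finest one.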
select count(distinct encounter_ref) from core__encounter where period_start_day is null
2,147,614
That's about 10.6% of 20M visits (20,210,548), which is A LOT.
Workarounds:
(A) Do nothing.
Researchers may be very surprised that 10% of visits are missing a start date, and might have preferred ANY encounter date to be present.
(B) When START date is missing, set START date = END date, which is almost always present.
For every encounter class other than IMP or OBSENC, this is a reasonable assumption.
This would settle ALL but 9,582 encounters, which is a dramatic improvement.
class_code cnt
AMB 2137359
IMP 9582
NULL 588
EMER 85
(C) Something else.
@comorbidity's vote is for (B), as it fixes nearly all scenarios.
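For discussion, option (B) can be sketched as a small rule. This is a minimal sketch under stated assumptions, not the project's implementation: the `fill_period_start` name is hypothetical, and it assumes encounters with a NULL class are eligible for the substitution, which is what the "all but 9,582" figure implies.

```python
from typing import Optional

# Classes where START = END is not a safe assumption, per option (B).
EXCLUDED_CLASSES = {"IMP", "OBSENC"}


def fill_period_start(
    class_code: Optional[str],
    period_start: Optional[str],
    period_end: Optional[str],
) -> Optional[str]:
    """Return the start date, falling back to the end date when allowed."""
    if period_start is not None:
        # Never overwrite a start date that is actually present.
        return period_start
    if class_code in EXCLUDED_CLASSES:
        # Multi-day stays: leave the gap rather than guess.
        return None
    # Assumption: a NULL class_code is treated like AMB/EMER here.
    return period_end


print(fill_period_start("AMB", None, "2020-01-02"))
```

Applied to the counts above, only the 9,582 IMP encounters would remain without a start date.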