We haven’t placed much priority on carefully handling location data, since the main useful product we provide has been availability data. For most clients, either the location information we have is only used to match our availability data to another database on their end, or the amount of location info they show from us is pretty minimal.
However, I’ve recently had to fix some issues in our location data (e.g. bad zip codes), and doing so was not straightforward because we have multiple sources live-updating location information. It’s hard to:

- Know what source bad data is coming from.
- Override that data manually, since it might get reset in a few minutes when that bad source next sends an update.
It might be useful to have a separate `provider_locations_raw` table or something that stores the location data received from posts to `/update`, and stores that data by `location_id` + `source` (we could have a special source name, or `NULL` source for manual updates). Then we could store merged results from all the sources in the existing `provider_locations` table. (This is sort of like how we handle availability data by source.)
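As a rough illustration of the idea, here is a minimal in-memory sketch of raw storage keyed by location ID and source. The `RawLocationStore` class, its method names, the source name, and the sample field values are all hypothetical stand-ins, not existing UNIVAF code; a real version would be a database table.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for a provider_locations_raw table.
# Keys are (location_id, source); a source of None marks a manual update.
RawKey = Tuple[str, Optional[str]]


class RawLocationStore:
    def __init__(self) -> None:
        self._rows: Dict[RawKey, dict] = {}

    def record_update(self, location_id: str, source: Optional[str], data: dict) -> None:
        """Store the fields one source sent for one location.

        Only that source's row changes; rows from other sources are untouched.
        """
        row = self._rows.setdefault((location_id, source), {})
        row.update(data)

    def rows_for(self, location_id: str) -> Dict[Optional[str], dict]:
        """Return a copy of every source's raw row for a location, keyed by source."""
        return {
            source: dict(row)
            for (loc_id, source), row in self._rows.items()
            if loc_id == location_id
        }


store = RawLocationStore()
store.record_update("loc-1", "automated_source_a", {"postal_code": "0000", "city": "Newark"})
store.record_update("loc-1", None, {"postal_code": "07102"})  # manual fix
# A later update from the automated source only rewrites its own row.
store.record_update("loc-1", "automated_source_a", {"postal_code": "0000"})
```

Because each source writes to its own row, the manual fix survives the next automated update, and looking at `rows_for("loc-1")` shows which source supplied the bad zip code.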
We might also prioritize sources differently when merging. For example, values from the manual/NULL source would win out over values from a state authority (#198), which would win out over Vaccinate the States (#189), which would win out over other more automated sources.
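The priority ordering could be sketched roughly like this. It assumes merging happens field by field, that `None` values are treated as missing, and that ties between unranked sources fall back to input order; the source names, the `merge_location` function, and the sample data are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical source names; a lower number means higher priority.
# None stands for the manual source.
SOURCE_PRIORITY: Dict[Optional[str], int] = {
    None: 0,
    "state_authority": 1,
    "vaccinate_the_states": 2,
}
DEFAULT_PRIORITY = 3  # all other, more automated sources


def merge_location(rows: Dict[Optional[str], dict]) -> Tuple[dict, Dict[str, Optional[str]]]:
    """Merge per-source rows field by field.

    For each field, the highest-priority source with a non-None value wins.
    Returns the merged record and a map of field -> source that supplied it.
    """
    merged: dict = {}
    provenance: Dict[str, Optional[str]] = {}
    ordered = sorted(
        rows.items(),
        key=lambda item: SOURCE_PRIORITY.get(item[0], DEFAULT_PRIORITY),
    )
    for source, row in ordered:
        for field, value in row.items():
            if value is not None and field not in merged:
                merged[field] = value
                provenance[field] = source
    return merged, provenance


rows = {
    "some_automated_source": {"postal_code": "0000", "city": "Newark", "state": "NJ"},
    "vaccinate_the_states": {"postal_code": "07102", "city": None},
    None: {"city": "Newark City"},
}
merged, provenance = merge_location(rows)
```

Returning the provenance map alongside the merged record is one way to answer "what source is this value coming from?" without re-querying the raw rows.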
Established in this week’s meeting that this sounds like a good idea, but not top priority. However, we may not want to use this mechanism for storing manual overrides: #231.
This project has been shut down. This issue has been left open as a guide for anyone forking this project — addressing this issue (or at least knowing about it!) is likely to be worthwhile for you if you are maintaining a running copy or fork of UNIVAF.
/cc @astonm