Closed
Labels
CTIS: Improvements and reporting for CTIS
Description
Currently, covid-19/facebook/prepare-extracts/covidalert-io-funs.R contains a validation pipeline for Facebook. As I understand it, it performs the following checks when we prepare a new day of data to upload:
- Ensure that the old data in the API matches the newly generated old data; that is, if it's currently June 1, make sure we didn't unexpectedly change data for May 25th.
- Do sanity checks on the new data: the geography types are reasonable, the geo_ids have the right format, the values and SEs are in the correct range, sample sizes are present, no dates are missing, etc.
- Verify the number of geographical regions reporting hasn't suddenly changed.
- Verify the average variable values haven't suddenly changed.
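To make the checks above concrete, here is a minimal sketch in Python. It is not the code in covidalert-io-funs.R; the row layout (dicts with geo_type, geo_id, date, value, se, sample_size), the geo_id patterns, the 0 to 100 value range, and the 50% change threshold are all assumptions chosen for illustration.

```python
import re

# Assumed geo_id formats, for illustration only; real sources may differ.
GEO_ID_PATTERNS = {
    "county": re.compile(r"^\d{5}$"),
    "state": re.compile(r"^[a-z]{2}$"),
}


def sanity_check_row(row):
    """Return a list of problems found in one row (empty if it looks fine)."""
    problems = []
    pattern = GEO_ID_PATTERNS.get(row.get("geo_type"))
    if pattern is None:
        problems.append(f"unknown geo_type {row.get('geo_type')!r}")
    elif not pattern.match(str(row.get("geo_id", ""))):
        problems.append(f"malformed geo_id {row.get('geo_id')!r}")
    if not row.get("date"):
        problems.append("missing date")
    value = row.get("value")
    # Assumption: values are percentages, so they must lie in [0, 100].
    if value is None or not 0 <= value <= 100:
        problems.append(f"value out of range: {value!r}")
    se = row.get("se")
    if se is None or se < 0:
        problems.append(f"se out of range: {se!r}")
    if row.get("sample_size") is None:
        problems.append("missing sample_size")
    return problems


def old_data_changed(api_rows, regenerated_rows, tol=1e-8):
    """Return the (geo_id, date) keys whose value differs between the data
    already in the API and the freshly regenerated copy of the same days."""
    api = {(r["geo_id"], r["date"]): r["value"] for r in api_rows}
    new = {(r["geo_id"], r["date"]): r["value"] for r in regenerated_rows}
    return sorted(k for k in api if k not in new or abs(api[k] - new[k]) > tol)


def sudden_change(previous, current, max_rel_change=0.5):
    """True if current differs from previous by more than max_rel_change.

    Works for both the number of reporting regions and an average value.
    """
    if previous == 0:
        return current != 0
    return abs(current - previous) / abs(previous) > max_rel_change
```

The same sudden_change helper covers the last two bullets: pass it yesterday's and today's region counts, or yesterday's and today's mean value for a variable.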
Many of these checks can be made generic across data sources and applied to our new pipeline. This would require us to:
- adapt the script to work on Taylor's directory structure where data files are placed
- provide for configuration files that specify the checks that apply to each data source
- adapt the script to report all errors, rather than dying on the first one
- make it easy to run automatically for each data source as part of its automation job
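One possible shape for the configuration-driven, report-everything design is sketched below in Python. The check names, the config layout, and the run_checks function are hypothetical and not part of any existing script; the point is only that each check returns a list of messages instead of raising, so one failure does not hide the others.

```python
import json


def check_value_range(rows, lo, hi):
    """Report every row whose value falls outside [lo, hi]."""
    return [
        f"row {i}: value {r['value']} outside [{lo}, {hi}]"
        for i, r in enumerate(rows)
        if not lo <= r["value"] <= hi
    ]


def check_min_regions(rows, minimum):
    """Report a problem if fewer than `minimum` distinct regions are present."""
    n = len({r["geo_id"] for r in rows})
    if n >= minimum:
        return []
    return [f"only {n} regions reporting, expected at least {minimum}"]


# Hypothetical registry mapping config names to check functions.
CHECKS = {
    "value_range": check_value_range,
    "min_regions": check_min_regions,
}


def run_checks(rows, config):
    """Run every check listed in config and collect all error messages."""
    errors = []
    for spec in config["checks"]:
        check = CHECKS.get(spec["name"])
        if check is None:
            errors.append(f"unknown check {spec['name']!r}")
            continue
        try:
            errors.extend(check(rows, **spec.get("params", {})))
        except Exception as exc:  # a crashing check is reported, not fatal
            errors.append(f"{spec['name']} crashed: {exc}")
    return errors


# Example per-source configuration (assumed layout), e.g. loaded from a file.
EXAMPLE_CONFIG = json.loads("""
{
  "checks": [
    {"name": "value_range", "params": {"lo": 0, "hi": 100}},
    {"name": "min_regions", "params": {"minimum": 2}}
  ]
}
""")
```

An automation job could call run_checks on each new day of data, print every message, and exit with a nonzero status if the returned list is non-empty.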