Background
The Data Management System is a package of Stata commands for performing high-frequency checks (HFCs). An HFC is a check of some element of the data collection process, run on a regular basis as new data comes in. At IPA/J-PAL, high-frequency checks are typically implemented in Stata, after the data flow is complete. High-frequency checks can provide information about any of the following elements of data collection:
- The quality of the data
- Enumerator performance
- Errors in the electronic survey program
- Other systematic flaws in the data flow
Given how much information they provide about the quality of data collection, high-frequency checks are one of the major benefits of CAI. It’s hard to overstate how important these checks are. High-frequency checks differ from CAI logic checks, which are programmed into the CAI survey program rather than in Stata. Ideally, high-frequency checks complement the checks built into the survey program and cover what cannot be implemented effectively in a CAI program. For instance, CAI logic checks restrict a field, or the relationship between fields, within a single survey, whereas high-frequency checks often examine trends across surveys.
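To illustrate the kind of cross-survey trend that a within-survey constraint cannot catch, here is a minimal sketch using only built-in Stata commands. It is not the package's own syntax. The toy data, the variable names (`enum_id`, `duration_min`), and the 0.5 threshold are all assumptions made for illustration.

```stata
* Hypothetical toy data: one row per completed survey
clear
input str8 enum_id duration_min
"E01" 42
"E01" 45
"E01" 40
"E02" 12
"E02" 15
"E02" 11
end

* Mean interview duration across all surveys
quietly summarize duration_min
scalar overall_mean = r(mean)

* Mean duration per enumerator: a pattern that only appears across surveys
bysort enum_id: egen enum_mean = mean(duration_min)

* Flag enumerators whose average is under half the overall average
* (the 0.5 threshold is arbitrary and chosen only for this example)
generate byte flag_short = enum_mean < 0.5 * scalar(overall_mean)
tabulate enum_id flag_short
```

Each individual duration here could pass a field-level range constraint in the survey program. Only the comparison across surveys shows that one enumerator's interviews are consistently short.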
- Check the survey form version
- Check that there are no duplicate observations
- Check that there are no duplicates in other variables expected to be unique, e.g., GPS coordinates or phone number
- Check that certain critical variables have no missing values
- Check that no variable has all missing values
- Check for "specify other" values in the survey
- Check for outliers in numeric variables
- Check for field comments
- Track survey progress
- Check the percentage of “don’t know” and “refuse to answer” responses
- Check the "yes" percentage for filter questions
- Check for enumerator productivity
- Check average interview duration
- Check active hours
- Check statistics for numeric variables
- Check survey consent rate
- Check the percentage of survey values missing
- Check the percentage of "don't know" and "refuse to answer" responses
- Check the number and percentage of "specify other" values
- Check the number of variables with all values missing and with at least one missing value
- Check survey productivity
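
Several of the checks listed above can be sketched with built-in Stata commands alone. The example below is a minimal illustration, not the package's own commands or syntax. The toy data, the variable names (`survey_id`, `enum_id`, `consent`, `income`, `notes`), and the 1.5 × IQR outlier rule are assumptions.

```stata
* Hypothetical toy data standing in for incoming survey submissions
clear
input str4 survey_id str3 enum_id consent income
"S001" "E01" 1 1200
"S002" "E01" 1 1350
"S002" "E02" 1 1100
"S003" "E02" . 98000
"S004" "E02" 1 1250
end
* A variable that is missing for every observation
generate notes = .

* Duplicate observations: dup_id counts the other rows sharing the same ID
duplicates tag survey_id, generate(dup_id)
list survey_id enum_id if dup_id > 0

* Missing values in variables treated here as critical
foreach var of varlist survey_id consent {
    quietly count if missing(`var')
    display "`var': " r(N) " missing value(s)"
}

* Variables with all values missing
foreach var of varlist _all {
    quietly count if !missing(`var')
    if r(N) == 0 {
        display "`var' is missing for all observations"
    }
}

* Outliers in a numeric variable, using 1.5 x IQR fences (an assumed rule)
quietly summarize income, detail
scalar iqr  = r(p75) - r(p25)
scalar low  = r(p25) - 1.5 * scalar(iqr)
scalar high = r(p75) + 1.5 * scalar(iqr)
generate byte outlier_income = !missing(income) & ///
    (income < scalar(low) | income > scalar(high))
list survey_id income if outlier_income
```

On this toy data the sketch flags the repeated `S002` ID, the missing `consent` value, the all-missing `notes` variable, and the `income` value of 98000. In practice the list of critical variables and the outlier rule would be set per project.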