Dataset Assessment Template

Billy Charlton edited this page Mar 20, 2017 · 3 revisions

This is the minimum information we need recorded for each dataset. Add more detail as needed! Much of it can be gathered in one quick interview; some of the dataset details may take more time.

Dataset: xxxxx

Source of Data

  • Who owns it?
  • Where does it come from?

Current data handling

  • Where does it live now (i.e., where on the network drive)?
  • How is it managed (subfolders, file naming, etc.)?
  • Who at SFCTA is in charge of it?
  • What are the QA/QC steps involved when it is received?

Data Records & Format

  • How large is one set of the data?
  • Column detail: what is the name and data type of each column? A database requires every column to have a specifically defined TYPE.
  • Missing data: how are missing cells recorded? (By column, if need be)
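
For the column and missing-data questions above, a quick profiling pass over a sample file can give a first draft of the answers before the interview. The following is a minimal sketch, not an agreed procedure: it assumes the data arrives as CSV, and the names `MISSING_MARKERS`, `guess_type`, `profile_columns`, and the sample rows are all hypothetical. The list of missing-value markers in particular is a guess that should be replaced with whatever the data owner reports.

```python
import csv
import io

# Hypothetical markers for missing cells; replace with what the dataset actually uses.
MISSING_MARKERS = {"", "NA", "N/A", "null", "-999"}


def guess_type(values):
    """Return the narrowest of INTEGER, REAL, TEXT that fits every value given."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if not values:
        return "UNKNOWN"  # every cell in the column was missing
    if all_parse(int):
        return "INTEGER"
    if all_parse(float):
        return "REAL"
    return "TEXT"


def profile_columns(text):
    """Report a guessed type and a missing-cell count for each CSV column."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    report = {}
    for name in reader.fieldnames or []:
        # Short rows yield None for absent fields; treat those as empty cells.
        cells = [(row.get(name) or "").strip() for row in rows]
        present = [cell for cell in cells if cell not in MISSING_MARKERS]
        report[name] = {
            "type": guess_type(present),
            "missing": len(cells) - len(present),
        }
    return report


# Made-up sample rows, for illustration only.
SAMPLE = "station,count,speed\nA1,12,31.5\nB2,NA,28\nC3,7,\n"

if __name__ == "__main__":
    for column, summary in profile_columns(SAMPLE).items():
        print(column, summary)
```

The guessed types are only a starting point. Identifier columns that happen to look numeric (zip codes, station IDs) should usually still be recorded as text, so confirm each column with the person in charge of the data.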

Updates & Upkeep

  • How often is new data delivered & available?
  • Does existing data get corrected/updated after arrival?

Privacy

  • Does any aspect of this dataset have potential privacy issues?
  • If so, are there current practices for hiding records, merging cells, etc.?

Meta Stuff

  • Are there any issues with the current setup that need attention and might be in scope, as long as we're in there anyway?
  • Other notes & findings