This data comes from PDF reports released by the Michigan Department of Health & Human Services. The files are hosted on their Data and Research page. My Spring 2018 Data Journalism class used Tabula to extract the tables from the PDFs and convert them to CSV data files.
The PDFs in question are:
- 2012 Annual Data Report on Blood Lead Levels of Children in Michigan
- 2013 Data Report on Childhood Lead Testing and Elevated Levels
- 2014 Data Report on Childhood Lead Testing and Elevated Levels: Michigan
- 2015 Data Report on Childhood Lead Testing and Elevated Blood Lead Levels: Michigan
- 2016 Provisional Data Report on Childhood Lead Testing and Elevated Levels: Michigan
Each table from a PDF was extracted into its own CSV file. The file naming convention is as follows:
BLL_[age group of children]_[geographic area]_[year]
For example, BLL_under6_zip_2016 contains the data on children under six, by zip code, in 2016.
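
If you want to sort or filter the files programmatically, the naming convention can be split back into its parts. The following is a minimal Python sketch, not part of the original workflow. The function name `parse_table_name` and the `TableName` type are hypothetical, and the sketch assumes that the age group and geographic area tokens contain no underscores and that a file may or may not carry a `.csv` extension. Only the `under6` and `zip` tokens are confirmed by the example above.

```python
import re
from typing import NamedTuple


class TableName(NamedTuple):
    """The three parts encoded in a file name (hypothetical helper type)."""
    age_group: str
    area: str
    year: int


# Assumption: tokens contain no underscores; the ".csv" suffix is optional.
_PATTERN = re.compile(
    r"^BLL_(?P<age>[^_]+)_(?P<area>[^_]+)_(?P<year>\d{4})(?:\.csv)?$"
)


def parse_table_name(filename: str) -> TableName:
    """Split a name like BLL_under6_zip_2016 into age group, area, and year."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"File name does not follow the convention: {filename!r}")
    return TableName(match["age"], match["area"], int(match["year"]))


if __name__ == "__main__":
    print(parse_table_name("BLL_under6_zip_2016"))
```

A name that does not match the pattern raises `ValueError`, which makes files that drifted from the convention easy to spot.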
The common age groups of children are:
- under 6
- ages 1 and 2
- all ages
Common geographic areas are:
- county
- zip code
- community (receiving funding for lead poisoning prevention)
We are also attempting to standardize the variable names, although it appears that slightly different measurements and cutoffs were used across years. The Data Dictionary will evolve as we work through the tables.