As an agency partner, I want to be able to access the data of % DACs per county and zip code #1803

Open
lucasmbrown-usds opened this issue Aug 8, 2022 · 4 comments

@lucasmbrown-usds
Contributor

Description
Many program officers distribute funding not at the census tract level but using other geographies such as county or zip code. To use the definition of disadvantaged communities (DACs), they need to translate the census tract data into those other geographic units.

Solution
Modify the ETL pipeline to create two separate files:

  1. Counties, with the percent of each county's population that lives in a disadvantaged census tract
  2. Zip codes, with the percent of each zip code's population that lives in a disadvantaged census tract

Make these files available in S3 with a link from GitHub or another location.
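As a rough illustration of the county file, here is a minimal sketch of the roll-up. It relies on the fact that the first five digits of an 11-digit tract GEOID are the state and county FIPS codes. The function name, the input layout, and the sample tract rows are hypothetical and not taken from the ETL pipeline.

```python
from collections import defaultdict


def percent_dac_by_county(tracts):
    """Return {county_fips: percent of population living in DAC tracts}.

    `tracts` is an iterable of (tract_geoid, population, is_dac) tuples.
    This input layout is an assumption made for the sketch.
    """
    total = defaultdict(float)
    dac = defaultdict(float)
    for geoid, population, is_dac in tracts:
        county = geoid[:5]  # 2-digit state FIPS + 3-digit county FIPS
        total[county] += population
        if is_dac:
            dac[county] += population
    # Skip counties with zero population to avoid dividing by zero.
    return {c: 100.0 * dac[c] / total[c] for c in total if total[c] > 0}


# Made-up sample rows, for illustration only.
sample_tracts = [
    ("01001020100", 2000, True),
    ("01001020200", 6000, False),
    ("01003010100", 1500, True),
]
county_percent = percent_dac_by_county(sample_tracts)
```

The zip code file cannot be derived from the GEOID in the same way, because tracts do not nest inside zip codes; it needs a tract-to-zip correspondence.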

@lucasmbrown-usds lucasmbrown-usds self-assigned this Aug 8, 2022
@lucasmbrown-usds
Contributor Author

@BethMattern - can you confirm that these files should be created as both CSV and Excel files?

And also, where would you like these files to live? Here are some options:

  1. As standalone files that get linked from the https://screeningtool.geoplatform.gov/en/downloads page, for example:

    • "Zip codes by percent of tracts within the zip code that are identified as disadvantaged.csv"
    • "Zip codes by percent of tracts within the zip code that are identified as disadvantaged.xlsx"
    • "Counties by percent of tracts within the county that are identified as disadvantaged.csv"
    • "Counties by percent of tracts within the county that are identified as disadvantaged.xlsx"
  2. Combined into one big zip file that can be downloaded

    • "Zip codes and counties by percent of tracts that are identified as disadvantaged.zip" (contains 4 files)

@lucasmbrown-usds
Contributor Author

lucasmbrown-usds commented Sep 28, 2022

We have a bit of a sticking point here. There's no tool out there to map from 2010 Census Tracts to 2020 Zip Codes by weighted population. Geocorr does not support it.

Ideally, we would use weighted population to represent the % of people inside a zip code who live in DACs. This would be more accurate than using geographic area alone, since a zip code may have most of its population concentrated in a small number of tracts.
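To make the difference concrete, here is a small sketch with made-up numbers: one zip code overlapped by a small, dense DAC tract and a large, sparse non-DAC tract. The row layout and function names are hypothetical.

```python
# Each row: (tract_id, is_dac, share_of_zip_area, tract_population_inside_zip)
# All values are invented for illustration.
overlaps = [
    ("tract_a", True, 0.10, 9000),   # small but densely populated DAC tract
    ("tract_b", False, 0.90, 1000),  # large but sparsely populated tract
]


def pct_dac_by_area(rows):
    """Percent of the zip code's area that falls inside DAC tracts."""
    total = sum(area for _, _, area, _ in rows)
    dac = sum(area for _, is_dac, area, _ in rows if is_dac)
    return 100.0 * dac / total


def pct_dac_by_population(rows):
    """Percent of the zip code's population that lives inside DAC tracts."""
    total = sum(pop for _, _, _, pop in rows)
    dac = sum(pop for _, is_dac, _, pop in rows if is_dac)
    return 100.0 * dac / total


area_pct = pct_dac_by_area(overlaps)
population_pct = pct_dac_by_population(overlaps)
```

With these numbers the area-based figure is about 10% while the population-based figure is 90%, which is the kind of gap the weighted approach is meant to avoid.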

So we have only a few options I'm aware of:

  1. We could convert 2010 Census Tracts to 2020 ZCTA by geographic overlap only, not using population at all. This is already implemented.

  2. We could convert from 2010 Census Tracts to 2010 ZCTA by weighted population using Geocorr, and then convert from 2010 ZCTA to 2020 ZCTA using geographic overlap only.

  3. We could convert from 2010 Census Tracts to 2010 Census Blocks by weighted population using Geocorr, then use geographic overlap to convert from 2010 Census Blocks to 2020 Census Blocks using census relationship files, and then convert 2020 Census Blocks to 2020 ZCTA by weighted population using Geocorr.

  4. Very similar to Option 3: We could convert from 2010 Census Tracts to 2010 Census Blocks by weighted population using Geocorr, then use NHGIS's crosswalk files to convert from 2010 Census Blocks to 2020 Census Blocks by weighted population, and then convert 2020 Census Blocks to 2020 ZCTA by weighted population using Geocorr.
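Options 2 through 4 all chain several crosswalks together. One way to sketch that chaining, assuming each crosswalk can be expressed as allocation factors (the share of a source unit assigned to a target unit), is to multiply the factors along each path and sum over the intermediate units. The function name, the dictionary layout, and the sample identifiers below are hypothetical and do not reflect the actual Geocorr or NHGIS file formats.

```python
from collections import defaultdict


def compose(first, second):
    """Chain two crosswalks given as {(source, target): allocation_factor}.

    Returns {(a, c): factor}, where each factor is the sum over every
    intermediate unit b of first[(a, b)] * second[(b, c)].
    """
    targets_by_source = defaultdict(list)
    for (b, c), weight in second.items():
        targets_by_source[b].append((c, weight))

    combined = defaultdict(float)
    for (a, b), w1 in first.items():
        for c, w2 in targets_by_source.get(b, []):
            combined[(a, c)] += w1 * w2
    return dict(combined)


# Made-up factors in the shape of Option 2: population-weighted for the
# first step, area-weighted for the second.
tract_to_zcta10 = {("T1", "Z10_A"): 0.75, ("T1", "Z10_B"): 0.25}
zcta10_to_zcta20 = {
    ("Z10_A", "Z20_X"): 1.0,
    ("Z10_B", "Z20_X"): 0.5,
    ("Z10_B", "Z20_Y"): 0.5,
}
tract_to_zcta20 = compose(tract_to_zcta10, zcta10_to_zcta20)
```

Options 3 and 4 would apply the same composition twice (tract to 2010 block, 2010 block to 2020 block, 2020 block to 2020 ZCTA). Any step that uses area weights rather than population weights carries its approximation into the final factors.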

@lucasmbrown-usds
Contributor Author

lucasmbrown-usds commented Sep 28, 2022

Finally, after all the above is completed, we need to map 2020 ZCTAs to 2020 zip codes, which are not the same.

ZCTAs can include one or more zip codes. See explanation here.

To convert from 2020 ZCTA to 2020 Zips, we have a couple of options:

https://github.com/censusreporter/acs-aggregate/blob/master/crosswalks/zip_to_zcta/ZIP_ZCTA_README.md

and

https://udsmapper.org/zip-code-to-zcta-crosswalk/

The latter seems to be more actively maintained.
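As a sketch of this final step, assume a crosswalk that lists one ZCTA for each zip code, and assume each zip code simply inherits the figure computed for its ZCTA. Both are assumptions made for illustration; the dictionary layout and the sample codes are hypothetical and do not reflect the column layout of either crosswalk linked above.

```python
def percent_dac_by_zip(zip_to_zcta, zcta_percent_dac):
    """Give each zip code the percent-DAC value of the ZCTA it belongs to.

    Zip codes whose ZCTA has no computed value are left out rather than
    guessed at.
    """
    return {
        zip_code: zcta_percent_dac[zcta]
        for zip_code, zcta in zip_to_zcta.items()
        if zcta in zcta_percent_dac
    }


# Placeholder codes: two zip codes share one ZCTA, one zip has no ZCTA value.
zip_to_zcta = {"00001": "00001", "00002": "00001", "00003": "00003", "00004": "00009"}
zcta_percent_dac = {"00001": 40.0, "00003": 0.0}
zip_percent = percent_dac_by_zip(zip_to_zcta, zcta_percent_dac)
```

Because several zip codes can map to one ZCTA, this approach reports the same value for each of them; it cannot recover differences between zip codes inside a single ZCTA.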

Many thanks to @JoeGermuska for his help assembling this information! Joe also recommends posting to https://acsdatacommunity.prb.org/discussion-forum/ with this question.

@lucasmbrown-usds
Contributor Author

lucasmbrown-usds commented Oct 4, 2022

Report ranges instead of a single number.

Another option:

Produce spreadsheet with:

  1. 2020(ish) Zip
  2. % of Zip geographically within a tract
  3. 2019 tract population that would fall inside that zip code if population were evenly distributed geographically within tracts
  4. % of Zip population within that tract (calculated from field 3)
  5. % of zip geographically within DAC
  6. % of zip population* within DAC
  7. Range of the two estimates (smallest to largest)

Fields 1-2 are already implemented.
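The fields above could be derived for a single zip code roughly as follows. This is a sketch under stated assumptions: the row keys, the function name, and the sample numbers are all hypothetical, and field 3 is computed as the tract's area share inside the zip times the tract population, which is what an even geographic distribution implies.

```python
def zip_summary(rows):
    """Summarize one zip code from its overlapping tracts.

    Each row is a dict with these (hypothetical) keys:
      is_dac           -- whether the tract is disadvantaged
      zip_area_share   -- fraction of the zip's area inside the tract (field 2)
      tract_area_share -- fraction of the tract's area inside the zip
      tract_pop        -- 2019 tract population
    """
    # Field 3: tract population inside the zip, assuming even distribution.
    est_pop = [r["tract_area_share"] * r["tract_pop"] for r in rows]
    zip_pop = sum(est_pop)
    # Field 4: share of the zip's estimated population in each tract.
    pop_share = [p / zip_pop if zip_pop else 0.0 for p in est_pop]
    # Field 5: percent of the zip's area inside DAC tracts.
    pct_area = 100.0 * sum(r["zip_area_share"] for r in rows if r["is_dac"])
    # Field 6: percent of the zip's estimated population inside DAC tracts.
    pct_pop = 100.0 * sum(s for r, s in zip(rows, pop_share) if r["is_dac"])
    # Field 7: range of the two estimates, smallest to largest.
    low, high = sorted((pct_area, pct_pop))
    return {"pct_area_dac": pct_area, "pct_pop_dac": pct_pop, "range": (low, high)}


# Made-up rows for one zip code.
rows = [
    {"is_dac": True, "zip_area_share": 0.25, "tract_area_share": 0.5, "tract_pop": 4000},
    {"is_dac": False, "zip_area_share": 0.75, "tract_area_share": 1.0, "tract_pop": 2000},
]
summary = zip_summary(rows)
```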

An idea we've eliminated: randomly sample ways of dividing 2010 tracts into 2010 blocks and take the range of the resulting distribution. This would generate a parameter q*x where the population of a block can be divided into tracts within the range q and (1-q).

Field 3 can be implemented by:

  1. Assume that when a tract crosses a ZCTA boundary, its population falls inside the ZCTA at one of three rates relative to area: .5 per unit of area, evenly per unit of area (1?), or 1.5 per unit of area.
  2. Load the 2019 tract population.
  3. Take % of the tract in the ZCTA and multiply it by population to get estimate A of tract population in ZCTA.
  4. Calculate lower and upper estimates of the population in a DAC: take each DAC tract that spans the ZCTA border, multiply its population field (field 3) by its % spatially in the ZCTA, and then multiply by .5 for the lower estimate or by 1.5 for the upper estimate. Then divide the DAC estimated population by the zip estimated population.
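One possible reading of these steps, sketched in code. Several details are assumptions not stated above: only DAC tracts that span the border are scaled by .5 or 1.5, non-DAC tracts stay at the even-distribution estimate, the zip estimated population is the sum of both, and a tract's estimate is capped at its full population. The function names, the row layout, and the sample numbers are hypothetical.

```python
def tract_pop_in_zcta(tract_pop, area_share, density):
    """Estimate how much of a tract's population falls inside the ZCTA.

    `density` is the multiplier from step 1 (.5, 1, or 1.5). Tracts that lie
    entirely inside the ZCTA are not scaled, and the estimate is capped at the
    tract's total population (the cap is an added assumption).
    """
    if area_share >= 1.0:
        return float(tract_pop)
    return min(float(tract_pop), tract_pop * area_share * density)


def dac_share_bounds(rows, low=0.5, high=1.5):
    """Return (lower %, upper %) of the ZCTA's population living in DAC tracts.

    `rows` is a list of (tract_pop, area_share, is_dac) tuples, where
    area_share is the fraction of the tract's area inside the ZCTA.
    """
    bounds = []
    for density in (low, high):
        dac = sum(tract_pop_in_zcta(p, s, density) for p, s, d in rows if d)
        other = sum(tract_pop_in_zcta(p, s, 1.0) for p, s, d in rows if not d)
        total = dac + other
        bounds.append(100.0 * dac / total if total else 0.0)
    return tuple(bounds)


# Made-up rows: one DAC tract half inside the ZCTA, one non-DAC tract fully inside.
zcta_rows = [(4000, 0.5, True), (3000, 1.0, False)]
lower_pct, upper_pct = dac_share_bounds(zcta_rows)
```

With these numbers the DAC tract contributes between 1000 and 3000 people, so the DAC share ranges from 25% to 50%. A different choice of denominator, such as scaling non-DAC border tracts as well, would give different bounds.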
