To run puf_stage2/stage2.py, you must first install the CyLP package.
Installation instructions are provided by the CyLP project.
This repository prepares data used in the Tax-Calculator repository.
The data produced here, all in CSV format, form two different sets of data files for Tax-Calculator:
- A set based on a recent IRS-SOI Public Use File (PUF)
- A set based on recent Census Current Population Survey (CPS) data
Because use of the PUF data is restricted, the IRS-SOI-supplied PUF file
and the puf.csv data file produced here are not part of either the
taxdata or the Tax-Calculator repository.
Each of these two sets of data files contains four files:
- a sample data file containing variables for each tax filing unit;
- a factors file containing annual variable extrapolation factors;
- a weights file containing annual weights for each filing unit;
- a ratios file containing annual adjustment ratios for some variables.
Note that the factors file is the same in both sets of data files because the variable extrapolation factors are independent of the sample data being used. But the weights and ratios files do depend on the data file, so they are different in the two sets of data files.
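The relationship between the four files can be illustrated with a toy sketch. Everything in it is hypothetical: the names (DataSet, FACTORS, weighted_total), the variable name "wages", the year, and all numbers are invented for illustration, and the sketch assumes that factors and ratios are applied as simple multipliers, which may not match how Tax-Calculator actually uses them. It only shows that one factors table can be shared while each set carries its own sample, weights, and ratios.

```python
from dataclasses import dataclass

# Toy illustration only: all names and numbers here are hypothetical.


@dataclass
class DataSet:
    sample: list[dict[str, float]]        # one record per tax filing unit
    weights: dict[int, list[float]]       # year -> one weight per filing unit
    ratios: dict[int, dict[str, float]]   # year -> variable -> adjustment ratio


# One factors table shared by both sets (year -> variable -> factor).
FACTORS = {2030: {"wages": 1.5}}

# Two sets with their own sample data, weights, and ratios.
puf_like = DataSet(
    sample=[{"wages": 100.0}, {"wages": 200.0}],
    weights={2030: [2.0, 1.0]},
    ratios={2030: {"wages": 1.0}},
)
cps_like = DataSet(
    sample=[{"wages": 50.0}],
    weights={2030: [4.0]},
    ratios={2030: {}},  # no adjustment ratio for wages in this toy set
)


def weighted_total(ds: DataSet, var: str, year: int) -> float:
    """Weighted sum of one variable, assuming multiplicative factors and ratios."""
    factor = FACTORS[year].get(var, 1.0)
    ratio = ds.ratios.get(year, {}).get(var, 1.0)
    return sum(
        unit[var] * factor * ratio * weight
        for unit, weight in zip(ds.sample, ds.weights[year])
    )


puf_total = weighted_total(puf_like, "wages", 2030)
cps_total = weighted_total(cps_like, "wages", 2030)
```

Both calls read the same FACTORS table, while the weights and ratios come from the set being used.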
IRS-SOI Public Use File (PUF) documentation:
Census Current Population Survey (CPS) documentation is available here:
Before opening a pull request, run pytest to ensure all tests pass.
The sequence of operations required to make the two sets of data files
is contained in the csvmake bash script, which also automates the
preparation workflow (except on Windows).
The sequence of operations required to install the two sets of data
files in the Tax-Calculator repository is contained in the csvcopy
bash script, which also automates the installation workflow
(except on Windows).
- John O'Hare
- Amy Xu
- Anderson Frailey
- Martin Holmer