Rename dropq directory/files as tbi and refactor run_nth_year_*_model functions #1577
Conversation
Codecov Report
@@           Coverage Diff           @@
##           master   #1577   +/-   ##
=======================================
  Coverage     100%     100%
=======================================
  Files          37       37
  Lines        2731     2731
=======================================
  Hits         2731     2731
Continue to review full report at Codecov.
@martinholmer I just briefly looked through this PR. This looks great. Thanks for knocking this out quickly. I'll take a closer look tomorrow.
Timing of the refactored code suggests that the speed-up benefits of maintaining a cache might not exceed the complexity costs of doing so. This issue can be reconsidered after pull request #1577 has been tested with TaxBrain.

@MattHJensen @feenberg @Amy-Xu @andersonfrailey @hdoupe @GoFroggyRun @brittainhard

P.S. The timing statistics above were generated for the second call to the refactored function.
input_path = 'puf.csv.gz'
if not os.path.isfile(input_path):
    # otherwise try local Tax-Calculator deployment path
    input_path = os.path.join(tbi_path, '..', '..', 'puf.csv')
@martinholmer Currently, the PUF is sent to taxcalc as a pandas dataframe. I'm not sure how to generally specify the PUF path relative to this file. It may be easier for use_puf_not_cps to be a keyword argument that holds the PUF data frame if the user chooses the PUF file option and None or False otherwise.
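For illustration only, here is a minimal sketch of that keyword-argument idea; the run_model name and its signature are hypothetical stand-ins, not the actual Tax-Calculator API:

import pandas as pd

def run_model(user_mods, use_puf_not_cps=None):
    # Hypothetical sketch of the suggestion above: pass the PUF dataframe
    # when the user chose the PUF option; pass None (or False) to fall
    # back to CPS-based input instead.
    if isinstance(use_puf_not_cps, pd.DataFrame):
        input_data = use_puf_not_cps
    else:
        input_data = pd.read_csv('cps.csv')
    # ... run the tax calculations on input_data ...
    return input_data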
@hdoupe said:
Currently, the PUF is sent to taxcalc as a pandas dataframe. I'm not sure how to generally specify the PUF path relative to this file.
The newest tbi_utils.py code tries to anticipate where the input files will be located. The following code fragment from PolicyBrain/webapp/apps/taxbrain/tasks.py suggests to me that puf.csv.gz will be in the current working directory:
def get_tax_results_async(mods, inputs_pk):
    print "mods is ", mods
    user_mods = package_up_vars(mods)
    print "user_mods is ", user_mods
    print "begin work"
    cur_path = os.path.abspath(os.path.dirname(__file__))
    tax_dta = pd.read_csv("puf.csv.gz", compression='gzip')
    mY_dec, mX_dec, df_dec, mY_bin, mX_bin, df_bin, fiscal_tots = dropq.run_models(tax_dta,
        num_years=NUM_BUDGET_YEARS, user_mods={START_YEAR:user_mods})
And that is where the tbi_utils.py code looks for it. Does this make sense?
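Restating that lookup order as a standalone sketch (the find_puf_input helper is illustrative, not code from this pull request):

import os

def find_puf_input(tbi_path):
    # First look for puf.csv.gz in the current working directory, which is
    # where TaxBrain's tasks.py reads it from; otherwise fall back to the
    # local Tax-Calculator deployment path, as in the diff hunk above.
    input_path = 'puf.csv.gz'
    if not os.path.isfile(input_path):
        input_path = os.path.join(tbi_path, '..', '..', 'puf.csv')
    return input_path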
The philosophy guiding the refactoring in pull request #1577 is that TaxBrain should be able to simply tell Tax-Calculator what it wants done and then have Tax-Calculator do it and return the results. Expecting TaxBrain to read input files and draw quick-calculation subsamples (which TaxBrain has been doing incorrectly for many months; see unresolved PolicyBrain issue 574, open since June 28th) does not seem like a sensible strategy.
The goal of this pull request is to provide TaxBrain with a simple interface to Tax-Calculator that can handle cps.csv input as well as puf.csv input. So, this pull request is an attempt to resolve PolicyBrain issue 668. While this pull request does not attempt to deal with CPS benefits information, it should provide a foundation for doing that, which means that pending pull request #1500 needs to be coordinated with the changes in this pull request. Note that this pull request changes the public API of Tax-Calculator (from the point of view of TaxBrain only).
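To make the intended division of labor concrete, here is a hypothetical caller-side sketch; run_reform and its parameters are illustrative stand-ins, not the signatures actually introduced by this pull request:

def run_reform(start_year, use_puf_not_cps, user_mods):
    # Hypothetical dispatcher: Tax-Calculator, not TaxBrain, decides which
    # input file to read and how to draw any quick-calculation subsample.
    input_file = 'puf.csv' if use_puf_not_cps else 'cps.csv'
    # ... read input_file, apply user_mods, run the calculations ...
    return {'input_file': input_file, 'start_year': start_year}

# TaxBrain only says what it wants done and receives the results:
results = run_reform(2018, use_puf_not_cps=True, user_mods={2018: {}})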
In addition to the changes in the two run_nth_year_*_model functions, the create_json_table function has been renamed create_dict_table because the function converts a dataframe table into a dictionary. It has never created a JSON table, so the old name was highly misleading. Despite the changes in the public API, there are no changes in tax-calculating logic or in tax results.
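As an aside, the new name matches what pandas itself calls this kind of conversion; here is a minimal, illustrative example (not the create_dict_table implementation) of turning a dataframe table into a dictionary:

import pandas as pd

# A tiny stand-in for a results table.
table = pd.DataFrame({'expanded_income': [50000, 75000],
                      'combined_tax': [6000, 11000]})

# The result is a plain Python dictionary per row; producing JSON text
# would be a separate, later step.
table_as_dict = table.to_dict(orient='records')
print(table_as_dict)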
@MattHJensen @feenberg @Amy-Xu @andersonfrailey @hdoupe @GoFroggyRun @brittainhard