Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade to Tax-Calculator 0.15.0 #808

Merged
merged 18 commits into from
Jan 24, 2018
Merged

Conversation

hdoupe
Copy link
Collaborator

@hdoupe hdoupe commented Jan 18, 2018

No description provided.

@hdoupe hdoupe added the WIP label Jan 18, 2018
@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 19, 2018

On my local machine, I got the following error when I tried to change the start year to 2018:
screen shot 2018-01-19 at 11 42 42 am

It looks like I forgot to add a cpi field for this parameter. This is simple to fix. However, there is a bigger problem with the PolicyBrain cpi button logic. The following code snippet is where PolicyBrain determines if a parameter should have a cpi button or not:

      SPECIAL_INFLATABLE_PARAMS = {'_II_credit', '_II_credit_ps'}
      SPECIAL_NON_INFLATABLE_PARAMS = {'_ACTC_ChildNum', '_EITC_MinEligAge',
                                                                          '_EITC_MaxEligAge'}
        ....
        first_value = self.col_fields[0].values[0]
        inflatable_params = SPECIAL_INFLATABLE_PARAMS
        non_inflatable_params = SPECIAL_NON_INFLATABLE_PARAMS
        if 'inflatable' in attributes:
            self.inflatable = attributes['inflatable']
        elif self.tc_id in inflatable_params:
            self.inflatable = True
        elif (self.tc_id in non_inflatable_params or
              self.nice_id in BOOL_PARAMS):
            self.inflatable = False
        else:
            self.inflatable = first_value > 1 <-------------- Most parameters marked here

        if self.inflatable:
            cpi_flag = attributes['cpi_inflated']
            self.cpi_field = TaxCalcField(self.nice_id + "_cpi", "CPI",
                                          [cpi_flag], self, first_budget_year)

The problem is that most of the parameters are marked as inflatable if their value is greater than one. This is a heuristic for determining if a parameter is inflatable. Rates are not inflatable and are lower than one. Bracket ceilings for example are greater than one and are inflatable. This fails for parameters like DependentCredit_Child_c since it starts at zero and is raised to 600 a few years later. Thus, it is marked as inflatable in later years but not earlier years.

Is there a better way to determine from the current_law_policy.json document whether a parameter is inflatable or not?

cc @martinholmer

@martinholmer
Copy link
Contributor

@hdoupe asked in #808:

Is there a better way to determine from the current_law_policy.json document whether a parameter is inflatable or not?

Yes, there is a better way. Consult the boolean value of the cpi_inflated field for each parameter.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 19, 2018

Thanks @martinholmer. My interpretation of that attribute was that all parameters where cpi_inflated is true are inflatable. However, there could be fields with cpi_inflated as false but they could be switched with a reform. For example, SS_thd50 has cpi_inflated set to false, but maybe there is some reform that wants to index this parameter to inflation. This is not the case, correct? That is, only parameters with cpi_inflated set to True can be inflated.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 19, 2018

@martinholmer I've been updating test data and I just got an interesting result with the following script:

import taxcalc
print('taxcalc version:', taxcalc.__version__)

json_reform = """{
    "policy": {
        "_II_brk4_single": {"2017": [500.0]}
    }
}
"""

policy_dict = taxcalc.Calculator.read_json_param_objects(json_reform, None)

pol = taxcalc.Policy()
pol.implement_reform(policy_dict['policy'])
print('\nerrors')
print(pol.reform_errors)
print('\nwarnings')
print(pol.reform_warnings)

and output:

(aei_dropq) HDoupe-MacBook-Pro:test_assets henrydoupe$ python strange_errors.py 
('taxcalc version:', '0.15.0')

errors
ERROR: 2017 _II_brk4_0 value 500.0 < min value 91900.0 for _II_brk3_0
ERROR: 2018 _II_brk4_0 value 510.0 < min value 82500.0 for _II_brk3_0
ERROR: 2019 _II_brk4_0 value 520.15 < min value 84141.75 for _II_brk3_0
ERROR: 2020 _II_brk4_0 value 530.87 < min value 85875.07 for _II_brk3_0
ERROR: 2021 _II_brk4_0 value 542.18 < min value 87704.21 for _II_brk3_0
ERROR: 2022 _II_brk4_0 value 554.0 < min value 89616.16 for _II_brk3_0
ERROR: 2023 _II_brk4_0 value 565.74 < min value 91516.02 for _II_brk3_0
ERROR: 2024 _II_brk4_0 value 577.9 < min value 93483.61 for _II_brk3_0
ERROR: 2025 _II_brk4_0 value 590.21 < min value 95474.81 for _II_brk3_0
ERROR: 2026 _II_brk4_0 value 602.78 < min value 110791.0 for _II_brk3_0
ERROR: 2027 _II_brk4_0 value 615.74 < min value 113173.01 for _II_brk3_0
ERROR: 2018 _II_brk4_3 value 216750.0 > max value 200000.0 for _II_brk5_3
ERROR: 2019 _II_brk4_3 value 221063.33 > max value 203980.0 for _II_brk5_3
ERROR: 2020 _II_brk4_3 value 225617.23 > max value 208181.99 for _II_brk5_3
ERROR: 2021 _II_brk4_3 value 230422.88 > max value 212616.27 for _II_brk5_3
ERROR: 2022 _II_brk4_3 value 235446.1 > max value 217251.3 for _II_brk5_3
ERROR: 2023 _II_brk4_3 value 240437.56 > max value 221857.03 for _II_brk5_3
ERROR: 2024 _II_brk4_3 value 245606.97 > max value 226626.96 for _II_brk5_3
ERROR: 2025 _II_brk4_3 value 250838.4 > max value 231454.11 for _II_brk5_3


warnings

(aei_dropq) HDoupe-MacBook-Pro:test_assets henrydoupe$ 

Why do you think that I'm getting errors for II_brk4_3 when it is not changed?

@martinholmer
Copy link
Contributor

@hdoupe said:

My interpretation of that attribute was that all parameters where cpi_inflated is true are inflatable. However, there could be fields with cpi_inflated as false but they could be switched with a reform. For example, SS_thd50 has cpi_inflated set to false, but maybe there is some reform that wants to index this parameter to inflation.

You are correct. In fact, as far as I can see, in Tax-Calculator any policy parameter that has cpi_inflated equal to false could in a reform have its cpi_inflated value set to true. In other words, we have never had what you might call a cpi_inflatable attribute for each parameter that indicates whether or not that parameter's cpi_inflated attribute can be changed.

When a user of the TaxBrain GUI form changes the value of cpi_inflated in a future year, TaxBrain has to prepare a JSON reform file that reflects that reform provision. So, TaxBrain has all the indexing information both under current-law policy (from the current_law_policy.json file) and under any reform (from the TaxBrain user's settings in the GUI form). RIght? So, what's missing? Or maybe I don't understand your question.

@martinholmer
Copy link
Contributor

martinholmer commented Jan 20, 2018

@hdoupe specified this reform:

json_reform = """{
    "policy": {
        "_II_brk4_single": {"2017": [500.0]}
    }
}
"""

and when he implemented it, he got these error messages:

ERROR: 2017 _II_brk4_0 value 500.0 < min value 91900.0 for _II_brk3_0
ERROR: 2018 _II_brk4_0 value 510.0 < min value 82500.0 for _II_brk3_0
ERROR: 2019 _II_brk4_0 value 520.15 < min value 84141.75 for _II_brk3_0
ERROR: 2020 _II_brk4_0 value 530.87 < min value 85875.07 for _II_brk3_0
ERROR: 2021 _II_brk4_0 value 542.18 < min value 87704.21 for _II_brk3_0
ERROR: 2022 _II_brk4_0 value 554.0 < min value 89616.16 for _II_brk3_0
ERROR: 2023 _II_brk4_0 value 565.74 < min value 91516.02 for _II_brk3_0
ERROR: 2024 _II_brk4_0 value 577.9 < min value 93483.61 for _II_brk3_0
ERROR: 2025 _II_brk4_0 value 590.21 < min value 95474.81 for _II_brk3_0
ERROR: 2026 _II_brk4_0 value 602.78 < min value 110791.0 for _II_brk3_0
ERROR: 2027 _II_brk4_0 value 615.74 < min value 113173.01 for _II_brk3_0
ERROR: 2018 _II_brk4_3 value 216750.0 > max value 200000.0 for _II_brk5_3
ERROR: 2019 _II_brk4_3 value 221063.33 > max value 203980.0 for _II_brk5_3
ERROR: 2020 _II_brk4_3 value 225617.23 > max value 208181.99 for _II_brk5_3
ERROR: 2021 _II_brk4_3 value 230422.88 > max value 212616.27 for _II_brk5_3
ERROR: 2022 _II_brk4_3 value 235446.1 > max value 217251.3 for _II_brk5_3
ERROR: 2023 _II_brk4_3 value 240437.56 > max value 221857.03 for _II_brk5_3
ERROR: 2024 _II_brk4_3 value 245606.97 > max value 226626.96 for _II_brk5_3
ERROR: 2025 _II_brk4_3 value 250838.4 > max value 231454.11 for _II_brk5_3

which prompted this question:

Why do you think that I'm getting errors for _II_brk4_3 when it is not changed?

_II_brk4_3 "not changed" is not exactly right. But that's getting ahead of the story.

You are correct that this is a puzzle. It took me some time to come up with the puzzle solution. (Or, at least what I think is a solution. If you don't agree, then we can delve further into this matter.)

The 2017 value of _II_brk4 in current_law_policy.json is the following MARS-indexed array:

[191650.00, 233350.00, 116675.00, 212500.00, 233350.00]

Your JSON reform file is translated by Tax-Calculator into this reform:

[500.00, 233350.00, 116675.00, 212500.00, 233350.00]

which means these five _II_brk4 values prevail in 2017 and those inflation-indexed five values prevail in each subsequent year. So, the one error message for 2017 (with respect to _II_brk4_0) is the one you expected for 2017. Why the error messages for _II_brk4_3 in 2018 and subsequent years? Because current-law policy changes _II_brk_5 beginning in 2018. The 21018 values of _II_brk5 in current_law_policy.json are:

[200000.00, 400000.00, 200000.00, 200000.00, 400000.00]

So Tax-Calculator generates this (unexpected to us) error message for 2018:

ERROR: 2018 _II_brk4_3 value 216750.0 > max value 200000.0 for _II_brk5_3

And the same error is generated in years after 2018 because _II_brk4_3 and _II_brk5_3 are both indexed at the same rate.

So, in the end, the unexpected error messages are appropriate.
This seems like a case when an algorithm is better than our brains at quickly diagnosing a complex situation.

Another way to understand the solution to this puzzle is to confirm that when your reform is changed to start in 2018 instead of 2017, the error messages would be only about _II_brk4_0. Try that to see if my puzzle solution is correct.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 20, 2018

@martinholmer said

You are correct. In fact, as far as I can see, in Tax-Calculator any policy parameter that has cpi_inflated equal to false could in a reform have its cpi_inflated value set to true. In other words, we have never had what you might call a cpi_inflatable attribute for each parameter that indicates whether or not that parameter's cpi_inflated attribute can be changed.

Ah, ok. I understand now. Thanks for clearing this up. TaxBrain has always decided whether a parameter can be inflated or not. For example, the rate does not have a cpi button but the ceiling does:

screen shot 2018-01-20 at 10 55 54 am

It seems like the proper way to solve this is to give each parameter a CPI button except for boolean type parameters. Does this line up more closely with how Tax-Calculator treats parameters?

@martinholmer
Copy link
Contributor

@hdoupe said:

You are correct. In fact, as far as I can see, in Tax-Calculator any policy parameter that has cpi_inflated equal to false could in a reform have its cpi_inflated value set to true. In other words, we have never had what you might call a cpi_inflatable attribute for each parameter that indicates whether or not that parameter's cpi_inflated attribute can be changed.

Ah, ok. I understand now. Thanks for clearing this up. TaxBrain has always decided whether a parameter can be inflated or not. For example, the rate does not have a cpi button but the ceiling does:

...

It seems like the proper way to solve this is to give each parameter a CPI button except for boolean type parameters. Does this line up more closely with how Tax-Calculator treats parameters?

That would be closer to what Tax-Calculator is doing now, but your questions on this matter reveal that the real problem is that Tax-Calculator's current_law_policy.json file lacks the proper information. That is, it does not include a cpi_inflatable field for each parameter. Do you want me to include such a new field? If so, is there a name better than cpi_inflatable for that new boolean field?

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 20, 2018

Ok, adding some type of inflatable attribute makes sense to me. The only other name I can think of is cpi_activable, but I think I like cpi_inflatable more.

@martinholmer
Copy link
Contributor

@hdoupe said:

Ok, adding some type of inflatable attribute makes sense to me. The only other name I can think of is cpi_activable, but I think I like cpi_inflatable more.

OK. I'll start on this enhancement now with hopes for a new release later today.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 20, 2018

@martinholmer Thanks for the quick response, but there's no big rush. Enjoy your weekend. I still have to solve some other problems on this PR, and it won't be ready until end of Monday at the earliest.

@martinholmer
Copy link
Contributor

@hdoupe, the taxcalc packages for Tax-Calculator release 0.15.1 are now available.
Be sure to read the recently-edited comment regarding Tax-Calculator pull request 1838.

Let me know if you have any questions or problems using the 0.15.1 release.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 22, 2018

@martinholmer Thanks for the in-depth diagnosis. That makes perfect sense. The fifth bracket decreased in 2018 due to the TCJA, but the fourth bracket was frozen since I specified the single parameter. Thus, the fifth bracket was set lower than the fourth bracket and we got our errors.

@martinholmer
Copy link
Contributor

@hdoupe said:

The fifth bracket decreased in 2018 due to the TCJA, but the fourth bracket was frozen since I specified the single parameter. Thus, the fifth bracket was set lower than the fourth bracket and we got our errors.

Exactly. That's an excellent way to describe what's going on.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 23, 2018

Using the new cpi_inflatable attribute, I was able to tell that there are about 50 parameters that either have a CPI button and should not or do not have a CPI button but should. Here's the list:

name    TaxBrain cpi button    taxcalc cpi button    default value 2017
ALD_Dependents_Child_c	False	True	0.0
ALD_Dependents_Elder_c	False	True	0.0
ALD_Dependents_thd	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
II_credit_nr	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
II_credit_nr_ps	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
ID_StateLocalTax_crt	True	False	9e+99
ID_RealEstate_crt	True	False	9e+99
ID_BenefitSurtax_em	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
ID_AmountCap_rt	True	False	9e+99
CG_ec	False	True	0.0
PT_excl_wagelim_rt	True	False	9e+99
PT_excl_wagelim_thd	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
AMT_KT_c_Age	True	False	24.0
CDCC_crt	True	False	35.0
CTC_c_under5_bonus	False	True	0.0
DependentCredit_c	False	True	0.0
DependentCredit_Child_c	False	True	0.0
DependentCredit_Nonchild_c	False	True	0.0
FilerCredit_c	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
CTC_new_c	False	True	0.0
CTC_new_c_under5_bonus	False	True	0.0
CTC_new_ps	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
UBI_u18	False	True	0.0
UBI_1820	False	True	0.0
UBI_21	False	True	0.0
ALD_Dependents_Child_c	False	True	0.0
ALD_Dependents_Elder_c	False	True	0.0
ALD_Dependents_thd	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
II_credit_nr	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
II_credit_nr_ps	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
ID_StateLocalTax_crt	True	False	9e+99
ID_RealEstate_crt	True	False	9e+99
ID_BenefitSurtax_em	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
ID_AmountCap_rt	True	False	9e+99
CG_ec	False	True	0.0
PT_excl_wagelim_rt	True	False	9e+99
PT_excl_wagelim_thd	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
AMT_KT_c_Age	True	False	24.0
CDCC_crt	True	False	35.0
CTC_c_under5_bonus	False	True	0.0
DependentCredit_c	False	True	0.0
DependentCredit_Child_c	False	True	0.0
DependentCredit_Nonchild_c	False	True	0.0
FilerCredit_c	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
CTC_new_c	False	True	0.0
CTC_new_c_under5_bonus	False	True	0.0
CTC_new_ps	False	True	[0.0, 0.0, 0.0, 0.0, 0.0]
UBI_u18	False	True	0.0
UBI_1820	False	True	0.0

Fixing this under the current approach of storing each parameter to a column in a database would involve extensive changes to the database. Given the magnitude of these changes, I think this is a good time to make the swap to storing the run parameters primarily as JSON files. This idea was first raised in #435, @GoFroggyRun has mentioned this, and @codekansas has also suggested that we move away from our current approach.

This would have no effect on the file upload runs, and the raw input from the GUI is parsed as a dictionary that is built from the attributes of the model object (i.e. taxbrain_data = dict(taxbrain_model.__dict__)). These attributes are populated by the raw input data. Thus, it would be easy to make the change going forward, but it may be more difficult to preserve backwards compatibility.

One solution is to create a decorator like we did for the new results table names. If the version of PB is less than 1.4.0 (would be a minor bump to 1.4.0 if we do this change), then we form the JSON object from the raw fields and serve that to the parsing functions. This would require saving all of the no longer needed fields though.

@GoFroggyRun @codekansas what do you think?

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 23, 2018

With the most recent commit, all tests are passing for PR #808. To summarize the work done so far, the tests have been refactored to further separate the "data" from the "code" as suggested by @codekansas here, and PolicyBrain has been updated to Tax-Calculator 0.15.1. The environment setup scripts were also altered to get the travis build to pass. A bug involving how PolicyBrain decides whether a parameter is inflatable or not was discovered by upgrading to Tax-Calculator 0.15.1.

Except for the CPI bug, this PR is ready to be reviewed and merged afterwards. I think that we should merge this PR as it is (with modifications if suggested in the review) and begin work on the CPI bug and the larger data storage modifications in a separate PR. Doing this in a separate PR has a two main benefits:

  1. This is a fairly large PR as it is now. Adding another large piece will make it that much harder to review.
  2. Once this is merged and PB is on TC 0.15.+, @GoFroggyRun can begin work on front-end changes required by the UBI project.

Unless there are any objections, this is how we will proceed.

@hdoupe hdoupe added ready and removed WIP labels Jan 23, 2018
@MattHJensen
Copy link
Contributor

This discussion might be relevant to updating presets on TaxBrain to account for the new baseline.

@GoFroggyRun
Copy link
Contributor

@hdoupe said:

I think this is a good time to make the swap to storing the run parameters primarily as JSON files.

Definitely agree.

Regarding backward compatibility, I'm not sure whether we want to store "all of the no longer needed fields." Would it be easier to preserve backward compatibility if we just provide the reform file ran, not necessarily via GUI, and its results (which I think is the point of preserving the compatibility)? We don't have to worry too much when fields were added or removed, while users are able to archive what they ran along with the results and cite whatever they need.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 23, 2018

Thanks for the link @MattHJensen. The presets will definitely need to be addressed. It seems to me that all of the presets are still valid (except for the TCJA preset) but are compared to the new post-TCJA baseline instead of the pre-TCJA baseline. I think that the TCJA preset should also be swapped with the 2017_law.json preset.

cc @martinholmer

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 23, 2018

@GoFroggyRun said

Regarding backward compatibility, I'm not sure whether we want to store "all of the no longer needed fields." Would it be easier to preserve backward compatibility if we just provide the reform file ran, not necessarily via GUI, and its results (which I think is the point of preserving the compatibility)? We don't have to worry too much when fields were added or removed, while users are able to archive what they ran along with the results and cite whatever they need.

Right, they could pull up the results just fine no matter what we do with the parameters. However, we need the fields data to provide the edit parameters page for previous runs. I think there are a couple solutions:

  1. JSON file to fields converter--as long as we have the JSON reform (which we do from PB 1.0.2 onward) we can convert that back to the fields data and display that. I've been interested in having something like this for awhile, but it may take some time to get right. And there may be some edge cases that are difficult to catch.
  2. One time database update that converts all stored run parameters into JSON fields objects and store those. Then delete the old, no-longer needed columns.
  3. Keep the old columns and use some decorator attribute to dynamically create the JSON fields object and parse that.

(3) seems like the easiest but laziest option. (2) is risky since we are doing a lot of work on the database, but I guess we could back up the database. (1) probably consumes the most time.

  1. On backwards compatibility, if someone checks out the edit parameters page, it may not mean the same as before. The fields that were not specified could potentially be populated by the new default values for 2017 and 2018. If the parameters are submitted again, the difference results will be different since the baseline is the TCJA. So, one solution is to drop backwards compatibility on the edit parameters side and keep backwards compatibility on the results.

The major downside of (4) is the user won't have a way to review the parameters used to get an archived result.

@hdoupe
Copy link
Collaborator Author

hdoupe commented Jan 24, 2018

I plan to begin working on solution (3) in a separate PR. This will allow us to get a fix out faster, and it allows us to go back and implement a more permanent solution in the future. If anyone has any objections to this plan then please let me know and we can discuss it further.

I will also kick the preset addition to a separate issue/PR that will be completed before the end of this release cycle.

@hdoupe hdoupe merged commit 2e64827 into ospc-org:master Jan 24, 2018
@GoFroggyRun
Copy link
Contributor

@hdoupe what I have in mind is more like your (4).

By providing the reform files ran, we don't necessarily have to allow users to always be able to edit their previous runs, via GUI, as PB releases go. A reform/assumption file and tax-calculator version (and TaxData version once available) should be sufficient for a user to recreate (theoretically) his/her results and provide citation, which, in my opinion, are the primary reasons we care about backward compatibility. For example, the file upload page doesn't support editing, and thus makes its maintenance easier than GUI page in terms of backward compatibility. Maybe we can do something similar: once those GUI runs got stale, we make GUI re-editing not available, yet store reform/assumption information likewise in the file-upload way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants