-
-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjusted code to allow users to modify mortality assumptions in cstwMPC #214
Conversation
|
Matt,
Thanks for this extensive feedback. I will instruct Jackree to revise
accordingly.
Though I spent some time in class explaining that they should not
put any new code or data inside the HARK filepath unless absolutely
unavoidable, the point seems to have been a difficult one for them to
understand. It sounds like they DID get a related point that I emphasized, w
which is that I did NOT want them just to clone HARK and then make revised
versions of the core HARK code to work however they wanted it to:
"No copy and paste" coding.
But both Jackree and Emma told me that there was no way to tell the
existing modules (cstwMPC or SolvingMicroDSOPs) to look elsewhere for their
data inputs, so I gave them permission to make pull requests for the
minimal changes that would allow a user to (optionally) specify an
alternative file path AS AN OPTION when invoking the modules. (If the
option is not specified, the module should work exactly as it does now).
I'm guessing that this the right way to accomplish this is simply one line
of code somewhere, and it would take you 30 seconds to figure out where to
put that line and what it should be.
The other thing I told both of them is that they should structure their
code efforts as REMARKs, and specify the path to their new data relative to
the root directory of their REMARK. Neither of them seems to have
been able to figure out how to do it.
PS. I have another student whose project is to create a cstwMPC REMARK.
He, too, began by just copying cstwMPC.py and making extensive changes
throughout it. Again I told him, "no, you should invoke tools already
provided in cstwMPC.py; but tell me if there is anything you need to do for
your REMARK that cannot be accomplished using the existing cstwMPC module,
and if there is anything then you should issue a pull request for the
minimal required modifications." He seemed to believe that there were
certain files (a parameter setup file?) whose filepath is hard-wired and he
would need to change that -- I asked him, "can't you just change a
parameter after it gets read in and before it gets used?" and he did not
seem to understand what I was saying, but maybe he will figure it out. Can
you think of any problem along those lines that would require ANY
modification of cstwMPC.py? (He is NOT substituting new data inputs for
data that are hard-wired).
…On Mon, Dec 17, 2018 at 11:07 PM Matthew N. White ***@***.***> wrote:
1.
File names should not have spaces in them.
2.
You deleted the 2 lines of code that specify the default behavior.
3.
You put new mortality data files into the cstwMPC project folder, with
descriptive names, then direct the user to use a file specifically named
alternativedata.txt in their downloads folder. No file will actually
download with that name, and people do not usually do work in their
downloads folder. Moreover, your code requires the user to copy the new
mortality table data you added into their downloads folder and rename it to
use it. That's not a normal workflow.
4.
Basically none of the code you added to SetupParamsCSTW.py is
necessary. The file already has the capability to use a different mortality
table: just change the name of the file in the one line where it appears!
It's fine to add a line at the top of the file that says mortality_file_name
= 'SSAactuarial.txt', and then have the line that opens the file refer
to that variable, but the block of you added is unnecessary. Python will
throw an error on its own if someone tries to open something that is not a
file.
5.
The instructions for how to change the mortality file are not useful.
You can trust that the user knows to put a file in a specific place and
change a name. If you want to make it easier to change mortality
assumptions, the relevant documentary information is the exact file format
the code expects: tab-delimited text, survival probabilities in column X,
starting from age Y, etc.
6.
More generally, I'm not sure we want to edit cstwMPC to add any new
capabilities. That's a discussion for outside this PR thread, however.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#214 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABQlf8Ga4nU9i4djDQgxkwCXbvZmB_wiks5u6BW3gaJpZM4ZXKFQ>
.
--
- Chris Carroll
|
This goes back to our discussion of what is in "HARK itself" vs what is an "application of HARK". Where I thought we landed this summer was that cstwMPC, SolvingMicroDSOPs, and (maybe?) StickyE would be distributed in HARK and initially live in the user's sitepackages directory (where they should not touch it). BUT: There would be a top level function that we advertise, like Didn't Nate or one of the OTS people write this function right around the time I was at SciPy? If so, your students can do things like add additional mortality processes into the project files (and no need to have a weird wonky workaround to the "don't touch code in your sitepackages" rule). |
@@ -0,0 +1,120 @@ | |||
0 0.092840 100000.00 49.820000 0.082350 100000.00 53.600000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's problematic to distribute data without proper license information. It's not at all clear that this data can be shared like this, and it doesn't contain a source either. What website/researcher/other was this obtained from? In general, it can be much easier to supply a function to generate such a table to avoid these types of considerations.
@@ -0,0 +1,120 @@ | |||
0 0.00199 100000 80.98 0.00202 100000 87.13 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above (Matt already mentioned the no-spaces-in-filenames point.)
@@ -112,9 +112,24 @@ | |||
slope_prev = 1.0 # Initial slope of kNextFunc (aggregate shocks model) | |||
intercept_prev = 0.0 # Initial intercept of kNextFunc (aggregate shocks model) | |||
|
|||
from pathlib import Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pathlib
is not part of Python prior to v 3.3, so as long as HARK hasn't dropped support for older versions of Python than v3.3, this has to be handled.
HARK/cstwMPC/SetupParamsCSTW.py
Outdated
# or other countries, they could do so by naming the alternative data file "alternativedata.txt" and save this in the "/home/users/Downloads/" folder | ||
|
||
|
||
README = Path("/home/users/Downloads/alternativedata.txt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Matt touched upon this already, but I'm not sure he mentioned the following: why not just allow for the user to provide a string? No reason to hardcode this. Also, there's no guarantee to even have a /home/users/Downloads
folder on other OS's than the one this was developed on (MacOS? unless the user is called users
and this is linux... even so, Windows wouldn't appreciate this!)
Edit: oh, and README
is probably not a good name for this. A readme is typically a file expected to be read by the user before using the software. This is probably meant as the file saying: "Hey there, read me not the default file", but something to the tune of SurvivalPath
or YourPath
or whatever is probably preferable (or lower case versions depending on your likings).
|
||
README = Path("/home/users/Downloads/alternativedata.txt") | ||
if README.is_file(): | ||
print ("File does exist") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a mistake. The file does exist, that was what that if README.is_file():
just tested
if README.is_file(): | ||
print ("File does exist") | ||
data_location = os.path.dirname(os.path.abspath(__file__)) | ||
f = open(data_location + '/' + 'alternativedata.txt','r') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole purpose of pathlib
is to provide object oriented path-handling functionality. I think something like this is more idiomatic
with YourPath.open() as f:
actuarial_reader = csv.reader(f,delimiter='\t')
@@ -0,0 +1,7 @@ | |||
Instructions on modifying the mortality assumptions used in Section 5 of Carroll et al. (2017) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's preferable to put information like this in a docstring, manual and/or inline comment. It must also describe the format.
Welcome ! Seems like this is your first open source contribution. Do not get discouraged by the many many comments. Few PRs to larger projects get through without corrections and or discussion. Last year I made a PR for another project that managed to collect almost 147 comments in just over a month and I committed 39 changes 💫 JuliaLang/julia#22603 To the extent that this code is meant to be useful in a fashion beyond the application provided currently, it is of course interesting to modularize the data handling. I think there's a much simpler way to do it though: simply allow the user to provide a path when constructing the Now, if you want the users to be able to do this, then you need to document exactly how to do it. What are the columns, etc. An alternative is simply to require the mortality information to be passed in vector by vector. These vectors can of course be produced from a csv file's columns, but also simulated in memory for educational/academic purposes. Based on @mnwhite and @llorracc 's exchange above, I think the most important thing to agree upon is: is cstwMPC meant to be generalized like this, or is it meant to provide a very specific implementation and application for illustrative purposes. |
No description provided.