Reorganizing subset_data.py to follow SE recommendations #1461

negin513 · 2021-08-18T05:07:26Z

Description of changes

This PR is based on a meeting we had with @billsacks, @ekluzek, and @slevisconsulting. They suggested adding a top-level skeleton script called subset_data (instead of subset_data.py) and moving subset_data.py into the CTSM python folder. Next, I created a separate module file for each class.
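As a rough illustration of the layout described above, a top-level wrapper can stay very thin: it only puts the python package directory on sys.path and hands control to an entry point in the library module. The sketch below is an assumption about the general shape, not the code in this PR; the relative directory layout (`levels_up`, `package_dir`) and the `ctsm.subset_data` / `main` names used as defaults are hypothetical.

```python
"""Hypothetical thin wrapper, e.g. a script living under tools/site_and_regional."""
import importlib
import os
import sys


def python_dir_for(script_path, levels_up=2, package_dir="python"):
    """Return the assumed location of the python package directory.

    levels_up and package_dir are illustrative assumptions about the layout.
    """
    root = os.path.dirname(os.path.abspath(script_path))
    for _ in range(levels_up):
        root = os.path.dirname(root)
    return os.path.join(root, package_dir)


def run(script_path, module_name="ctsm.subset_data", entry_point="main"):
    """Put the package directory on sys.path, then delegate to the module.

    module_name and entry_point are assumed names, used only for illustration.
    """
    pkg_dir = python_dir_for(script_path)
    if pkg_dir not in sys.path:
        sys.path.insert(0, pkg_dir)
    module = importlib.import_module(module_name)
    return getattr(module, entry_point)()


# In an actual wrapper script, the last line would be something like:
#     run(__file__)
```

Keeping the wrapper this small means all real logic lives in importable modules, which is what makes it reachable from unit tests.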

Specific notes

Contributors other than yourself, if any: @adrifoster

CTSM Issues Fixed (include github issue #):
Partially addresses #1441
Fixes #1436
Fixes #1437
Fixes #1594
Fixes #1606

Are answers expected to change (and if so in what way)?

Any User Interface Changes (namelist or namelist defaults changes)?

Testing performed, if any: tools testing python testing
I ran subset_data for all NEON sites and the results were identical (bit-for-bit) except for the two crop sites (KONA and STER). The crop sites previously had an issue described in #1606.

billsacks

Thanks for moving forward with these initial steps, @negin513. I have some minor comments below. In addition:

It seems like you haven't deleted the original subset_data.py; please delete that file.

python/ctsm/base_case.py

python/ctsm/regional_case.py

python/ctsm/subset_data.py

negin513 · 2021-09-07T20:51:37Z

Thanks @billsacks for all your comments.

I've answered all of the above comments but did not mark them as resolved. Please mark them as resolved if you think my answers and my fixes were satisfactory.

Besides your comments, I have some notes on my own code, which I am listing here so I don't forget them.
They are based either on our discussions or on my own thoughts:

Make a top-level skeleton code in tools/site_and_regional and relevant modules under python/ctsm-- add instructions...
Classes should be stored separately in different modules under python/ctsm/site_and_regional.
Make sure the code is PEP-8 compliant. Run it through black.
Identify possible areas for testing
Logging update
output_to_file in utils.py
Adding --input-directory as an option...

python/ctsm/subset_data.py

ekluzek · 2021-09-08T19:02:56Z

@negin513 I'd like this PR to fix #1436 if possible. We just talked about it and it sounds reasonably straightforward.

ekluzek · 2021-09-08T19:05:17Z

This works on #1441 for these specific tools.

ekluzek · 2021-09-17T19:59:17Z

@negin513 we talked about the need to have the subset_data and modify_data python tools create a user-mod directory that create_newcase can then be pointed to, which provides a nice workflow. It also makes the workflow for NEON very similar to the workflow for a generic site, which helps. Do you want to do that as part of this PR or make that a future change?

billsacks · 2021-09-17T20:11:02Z

Do you want to do that as part of this PR or make that a future change?

I'd suggest doing that separately, so this PR can get merged and @slevisconsulting can build off of it.

ekluzek · 2021-09-17T20:20:42Z

OK. I think that, in general, having these tools create a user-mod directory that can then be easily used to create a case is a good thing. I think most of the tools should do that.

So the question is: should @slevisconsulting's scripts create user-mod directories or not? If yes, perhaps you should include it here, but if not, then we should do it as a future PR. I think his scripts are just modifying a single surface dataset, so perhaps they shouldn't create a user-mod directory. If they are doing more than that, though, perhaps they should.
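To make the workflow discussed here more concrete, the sketch below shows one way a tool could write a user-mod directory holding a `user_nl_clm` that points at a subsetted surface dataset. It is a minimal illustration under stated assumptions, not the implementation in this PR: the function name, directory name, and file paths are hypothetical, and a real tool would likely write more settings. `fsurdat` is the CLM namelist variable that names the surface dataset.

```python
import os


def write_user_mods(user_mods_dir, fsurdat_path):
    """Create user_mods_dir and write a user_nl_clm that sets fsurdat.

    Hypothetical helper for illustration only; the real tools may write
    additional files and settings.
    """
    os.makedirs(user_mods_dir, exist_ok=True)
    nl_file = os.path.join(user_mods_dir, "user_nl_clm")
    with open(nl_file, "w", encoding="utf-8") as handle:
        handle.write(f"fsurdat = '{fsurdat_path}'\n")
    return nl_file
```

The resulting directory could then be handed to create_newcase through its user-mods directory option, so that a site-specific case picks up the subsetted files without manual namelist edits.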

negin513 · 2022-02-11T06:35:31Z

@negin513 and I had a nice discussion going over this and were able to complete a few of the final things together. And we started a project of pulling out a tricky bit of code to its own method so that it can be easily unit-tested. @negin513 should be able to complete this next week, and it can come in one of the very next CTSM tags.

Thanks to @ekluzek for testing the tricky part of the code (modifying the surface dataset based on the combination of user flags): we separated this section into a single class method called modify_surfdata_atpoint.
I added 24 unit tests for this specific method (in test_unit_singlept_data_surfdata.py).
For these tests, instead of reading/writing files, we create xarray datasets (similar to the surface dataset files) on the fly, with the same data structure and attributes but filled with random numbers...
The tests then check that different variables in this "dummy" dataset are set correctly to the expected values (for example, whether we expect PCT_CROP to be all zeros or all ones). This way we can make sure that the correct part of the code is triggered and the variables are set accordingly.

Here we create two "dummy" xarray datasets: one mimicking the 16 pft dataset and the other the 78 pft dataset, to check the code behavior for both crop and no-crop datasets.
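The testing approach described above can be sketched roughly as follows. This is an illustrative example only, not the tests added in this PR: `make_dummy_surfdata` and `set_all_natural_veg` are hypothetical names, the latter is a simplified stand-in for the method under test, and the variable names and dimension sizes are chosen merely to resemble a surface dataset.

```python
import unittest

import numpy as np
import xarray as xr


def make_dummy_surfdata(n_pft=16, seed=0):
    """Build a tiny in-memory dataset filled with random numbers.

    Variable names, dimension names, and sizes are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    return xr.Dataset(
        {
            "PCT_NAT_PFT": (
                ("natpft", "lsmlat", "lsmlon"),
                rng.random((n_pft, 1, 1)) * 100.0,
            ),
            "PCT_CROP": (("lsmlat", "lsmlon"), rng.random((1, 1)) * 100.0),
            "PCT_NATVEG": (("lsmlat", "lsmlon"), rng.random((1, 1)) * 100.0),
        }
    )


def set_all_natural_veg(ds):
    """Hypothetical stand-in for the method under test: no crop, all natural veg."""
    out = ds.copy()  # shallow copy; variables are replaced, not mutated
    out["PCT_CROP"] = xr.zeros_like(ds["PCT_CROP"])
    out["PCT_NATVEG"] = xr.full_like(ds["PCT_NATVEG"], 100.0)
    return out


class TestSetAllNaturalVeg(unittest.TestCase):
    """Check the outcome on dummy datasets of two different sizes."""

    def test_crop_zero_natveg_full(self):
        for n_pft in (16, 78):
            with self.subTest(n_pft=n_pft):
                out = set_all_natural_veg(make_dummy_surfdata(n_pft))
                np.testing.assert_array_equal(out["PCT_CROP"].values, 0.0)
                np.testing.assert_array_equal(out["PCT_NATVEG"].values, 100.0)
```

Because the input is random, a passing assertion shows the value was actually set by the code under test rather than inherited from the fixture.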

In this single PR we added 41 python unit tests which increased the number of CTSM total python unit tests by 80%...

negin513 · 2022-02-11T07:02:31Z

@ekluzek With the addition of this random xarray dataset testing, I think I have addressed all the comments on this PR. Please let me know if there is anything else remaining.

pylint output is clean on this code:

--------------------------------------------------------------------
Your code has been rated at 10.00/10 (previous run: 10.00/10, +0.00)

The 3 python system tests are running without problems.
I ran clm_pymod tests and the tests show PASS.
All 113 python unit tests passed without any issues.

Thanks again @ekluzek for your help with this PR.

ekluzek · 2022-02-17T21:24:27Z

I've marked off and made sure all of the check boxes have been addressed. And the query tool shows they are all addressed.

gh-pr-query -p 1461 -t -r ESCOMP/CTSM

Already worked with bill on this

negin513 added 3 commits August 12, 2021 17:13

adding the skeleton file here...

abf2968

adding a module file for each class... suggested by Bill Sacks.

efc8f9f

updating subset_data.py remove class definition from here.

89497d3

billsacks requested changes Aug 18, 2021

negin513 added 9 commits September 7, 2021 12:20

adding the top-level skeleton.

60d9494

adding classes for subset_data.py

50f8727

moving this under site_and_regional.

093a0ed

moving the classes under subset_data.

b4bb64e

adding some more instructions...

bf8026a

removing subset_data.py under site_and_regional...why not git mv :/

1f45c14

some small changes...

cb317e9

running this through formatter...

9edf9b0

more comments...

93243c6

running through formatter...

408c604

negin513 force-pushed the python_dev_meeting branch from 77369bb to 408c604 Compare September 7, 2021 21:02

adding the deleted unit test.

7f72af9

billsacks requested changes Sep 7, 2021

ekluzek linked an issue Sep 8, 2021 that may be closed by this pull request

Move critical toolchain scripts out of tools/contrib and their guts into the python subdirectory and unit tests added #1441

Closed

10 tasks

ekluzek removed a link to an issue Sep 8, 2021

Move critical toolchain scripts out of tools/contrib and their guts into the python subdirectory and unit tests added #1441

Closed

10 tasks

negin513 marked this pull request as draft September 9, 2021 15:42

negin513 added 2 commits September 17, 2021 13:55

removing the capability to run ./subset_data.py directly...

40d6b3e

not executable anymore...

7095f7e

Adrianna Foster added 2 commits December 8, 2021 14:03

updates to facilitate user mods and config file

92fbaad

add CLM_USRDAT_DIR xml variable

6ddf11f

negin513 added 2 commits February 10, 2022 21:47

fixing pct_urban dims but it was working previously.

4903f4f

updating unit tests.

2d88dc2

negin513 added 3 commits February 10, 2022 23:45

fewer lines for minor pylint complaints

b4154f1

blacking reformatter

6dbd29e

running this through black

e2218dc

ekluzek and others added 8 commits February 15, 2022 09:16

Add KONA, US-UMB, and regional tests for subset_data

e61fbcc

Merge branch 'python_dev_meeting' of github.com:negin513/CTSM into python_dev_meeting

e859158

Add test for a point from the global f09 grid for subset_data

4ae0dd9

Changes to get modify_data_YELL and subset_data_KONA tests to pass, had to make a change in modify_singlept_site_neon.py

5902cd3

adding test for region1

985b00e

modifying both versions of files from subset_data.

2850dfb

updating the surf_wrapper to point to local directory instead of scratch

ef8a9d9

Merge branch 'master' into python_dev_meeting

9ad7de0

negin513 added 7 commits February 17, 2022 16:53

fixing the test lists

ade3853

new tests added

55cbf35

updating README for neon site

ba08c13

updating README file list names

e532b2b

merge updates needed

6911b5e

updating docs for dev076

4e64c84

adding subset_data tests in here

a4e712f

ekluzek approved these changes Feb 18, 2022

negin513 merged commit d9ae4b4 into ESCOMP:master Feb 18, 2022

negin513 mentioned this pull request Feb 26, 2022

modify_fsurdat: Add dom_cft as alternative user-choice to dom_nat_pft #1615

Merged

negin513 mentioned this pull request Jun 30, 2022

Subset mesh files for regional cases and add user-mods for regional cases. #1735

Closed

negin513 mentioned this pull request Nov 2, 2022

NEON AG sites are running with generic crop rather than prognostic crop #1889

Closed
