Add lat_lon_land set #534

Closed
wants to merge 5 commits into from

Conversation

forsyth2
Collaborator

Add lat_lon_land set. Resolves #518.

@forsyth2 forsyth2 added the semver: new feature New feature (will increment minor version) label Nov 28, 2023
@forsyth2 forsyth2 self-assigned this Nov 28, 2023
@forsyth2 forsyth2 marked this pull request as ready for review November 28, 2023 14:39
@forsyth2
Collaborator Author

@chengzhuzhang I was worried adding this set might involve complicated changes to e3sm_diags.py or e3sm_diags.bash but I think it really can be included just by adding lat_lon_land to the list of sets. (I would update default.ini to include it in the E3SM Diags default set list).

The model-vs-model produces the lat_lon_land plots.

The model-vs-obs does not, however. I think it's just because the observation data isn't present though? I get errors like:

RuntimeError: Neither does QRUNOFF nor the variables in [('QRUNOFF',), ('mrro',)] exist in the file file:///lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201/post/scripts/tmp.434052.DQkf/climo/v2.LR.historical_0201_JJA_185006_185108_climo.nc.

[e3sm_diags]
active = True
grid = '180x360_aave'
ref_final_yr = 2014
ref_start_yr = 1985
# TODO: this directory is missing OMI-MLS
sets = "lat_lon","zonal_mean_xy","zonal_mean_2d","polar","cosp_histogram","meridional_mean_2d","enso_diags","qbo","diurnal_cycle","annual_cycle_zonal_mean","streamflow", "zonal_mean_2d_stratosphere", "tc_analysis",
sets = "lat_lon","lat_lon_land"
Collaborator

I think it might be more straightforward to start a new e3sm_diags task for "lat_lon_land"; for instance, the test_data_path needs to point to separate land files. Also, because there aren't many land obs in e3sm_diags right now, lat_lon_land is used mostly for model-vs-model comparison.

@forsyth2
Collaborator Author

forsyth2 commented Dec 8, 2023

Note for self: lat_lon_land set should be an individual e3sm_diags run. Don’t squeeze into atmosphere set. It needs a different set of test data paths. Cleaner to have an individual run. (Having both atmosphere and land in the same viewer is possible though, but less straightforward).

@forsyth2
Collaborator Author

Debugging updates

Results so far

Model-vs-model continues to work:

Model-vs-obs is partially working. The viewer at least generates now, but only the top plot gets created (and it does not look accurate):

(I moved output as follows)

mv /lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/zppy_test_complete_run_www/issue_518/v2.LR.historical_0201 /lcrc/group/e3sm/public_html/diagnostic_output/ac.forsyth2/zppy_test_complete_run_www/issue_518/v2.LR.historical_0201_20231220
mv /lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201 /lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201_20231220

Debugging process/notes

(1) Needed to add ("lat_lon_land" in sets) to the lines that had ("lat_lon" in sets) in order to find the data directories.

I ran git grep -n "lat_lon" to determine where we were branching based on the lat_lon set, since I imagined something similar would be needed for lat_lon_land.
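
The change in step (1) amounts to widening a membership test. A minimal Python sketch of that idea follows; `CLIMO_SETS` and `needs_climo_dirs` are hypothetical names used only for illustration and are not taken from zppy.

```python
# Hypothetical illustration; these names do not come from zppy.
# Sets assumed to need the climatology data directories.
CLIMO_SETS = {"lat_lon", "lat_lon_land"}


def needs_climo_dirs(sets):
    """Return True if any requested set is one of CLIMO_SETS."""
    return any(s in CLIMO_SETS for s in sets)
```

This is equivalent to writing `("lat_lon" in sets) or ("lat_lon_land" in sets)` at each branch point, but keeps the list of sets in one place.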

(2) Added param.variables = ["FLDS", "FSDS"] since those seem to be the only two land variables actually available (since they show up on the model-vs-model lat_lon_land run).

I was running into the following:

RuntimeError: Neither does QRUNOFF nor the variables in [('QRUNOFF',), ('mrro',)] exist in the file file:///lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201/post/scripts/tmp.446505.amD9/climo/v2.LR.historical_0201_ANN_185001_185112_climo.nc.

I ran git grep -n "QRUNOFF" * in the e3sm_diags repo. I saw it was used in e3sm_diags/driver/default_diags/lat_lon_land_model_vs_model.cfg and e3sm_diags/driver/default_diags/lat_lon_land_model_vs_obs.cfg

Looking at https://e3sm-project.github.io/e3sm_diags/_build/html/master/defining-parameters.html, it looked like I needed to override that variable list. (That said, I'm confused as to why model-vs-model generated FLDS and FSDS when its default cfg was simply QRUNOFF, just like the model-vs-obs one).
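
The error message above lists candidate variable tuples (`[('QRUNOFF',), ('mrro',)]`), which suggests a lookup that accepts the first tuple whose variables are all present in the file. The sketch below only illustrates that reading of the message; `find_variables` is a hypothetical helper and is not e3sm_diags code.

```python
def find_variables(available, candidates):
    """Return the first candidate tuple whose names are all in `available`.

    `available` is the set of variable names found in a file;
    `candidates` is a list of tuples of acceptable variable names.
    """
    for names in candidates:
        if all(name in available for name in names):
            return names
    # Mirrors the failure mode reported above: no candidate is satisfied.
    raise RuntimeError(
        f"None of the variables in {candidates} exist in the file"
    )
```

Under this reading, a file that contains only FLDS and FSDS would fail for the QRUNOFF entry, which is consistent with the error seen in the model-vs-obs run.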

@forsyth2
Collaborator Author

Results of latest run:

$ cd /lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201/post/scripts
$ grep -v "OK" *status
ts_land_monthly_1850-1851-0002.status:ERROR (5)
ts_land_monthly_1852-1853-0002.status:ERROR (5)
$ cd /lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201_20231220/post/scripts
$ grep -v "OK" *status
# Nothing shows up, so these `ts` errors are new. Caused by a change in dataset? In Unified 1.9.2's release?
$ cd -
$ tail -n 19 ts_land_monthly_1850-1851-0002.o460284 
2024-01-23 01:10:53,431_431:INFO:cmorize:lai: creating CMOR variable with CMOR axis objects.
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.2_chrysalis/lib/python3.10/site-packages/e3sm_to_cmip/__main__.py", line 912, in _run_parallel
    out = res.result()
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.2_chrysalis/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/lcrc/soft/climate/e3sm-unified/base/envs/e3sm_unified_1.9.2_chrysalis/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
100%|██████████| 1/1 [00:00<00:00,  1.94it/s]
2024-01-23 01:10:53,546 [INFO]: __main__.py(_run_parallel:930) >> 0 of 1 handlers complete
2024-01-23 01:10:53,546 [INFO]: __main__.py(_run_parallel:930) >> 0 of 1 handlers complete
2024-01-23 01:10:53,546_546:INFO:_run_parallel:0 of 1 handlers complete
2024-01-23 01:10:53,546 [ERROR]: __main__.py(_run_parallel:934) >> lai failed to complete
2024-01-23 01:10:53,546 [ERROR]: __main__.py(_run_parallel:934) >> lai failed to complete
2024-01-23 01:10:53,546_546:ERROR:_run_parallel:lai failed to complete
2024-01-23 01:10:53,546 [ERROR]: __main__.py(_run_parallel:935) >> 0 of 1 handlers complete
2024-01-23 01:10:53,546 [ERROR]: __main__.py(_run_parallel:935) >> 0 of 1 handlers complete
2024-01-23 01:10:53,546_546:ERROR:_run_parallel:0 of 1 handlers complete
'LAISHA'
mv: cannot stat '/lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201/post/lnd/180x360_aave/cmip_ts/monthly/tmp_ts_land_monthly_1850-1851-0002/CMIP6/CMIP/*/*/*/*/*/*/*/*/*.nc': No such file or directory
$ ls /lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/issue_518/v2.LR.historical_0201/post/lnd/180x360_aave/cmip_ts/monthly/tmp_ts_land_monthly_1850-1851-0002
user_metadata.json
# No CMIP6 directory present.
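
The `grep -v "OK" *status` check above can also be scripted, which is handy when comparing two output directories. This is a minimal sketch; `non_ok_lines` is a hypothetical helper and the only assumption is that the status files sit directly in the given directory.

```python
from pathlib import Path


def non_ok_lines(scripts_dir):
    """List (file name, line) pairs for status-file lines lacking "OK".

    Approximates `grep -v "OK" *status` run inside `scripts_dir`.
    """
    results = []
    for path in sorted(Path(scripts_dir).glob("*status")):
        for line in path.read_text().splitlines():
            if "OK" not in line:
                results.append((path.name, line))
    return results
```

An empty list corresponds to the "nothing shows up" case in the transcript.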

The (bad) generated plots match between the two though:

Previous run:
v2.LR.historical_0201_20231220
This run:
v2.LR.historical_0201

@forsyth2
Collaborator Author

If I replace source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh with source /lcrc/soft/climate/e3sm-unified/load_e3sm_unified_1.9.1_chrysalis.sh in ts_land_monthly_1850-1851-0002.bash and then run sbatch ts_land_monthly_1850-1851-0002.bash, it does actually work. So, something happened in Unified 1.9.2.

@tomvothecoder @chengzhuzhang do you know what could cause the above error in e3sm_to_cmip? I'm not seeing an immediately obvious cause on https://github.com/E3SM-Project/e3sm_to_cmip/releases. (Note: I'm not saying there's necessarily a bug in e3sm_to_cmip. Rather, Unified 1.9.2 might have exposed an error in something I'm doing here).

@forsyth2
Collaborator Author

As for the bad plots, I'm seeing a lot of lat_lon_driver.py(run_diag:154) >> Can not process reference data, analyse test data only in the output log. @chengzhuzhang Is there any reason E3SM Diags can't process the reference data? It would be useful to have more information in that error message.

@forsyth2
Collaborator Author

I see a note to myself about properly setting test_data_path or perhaps rather reference_data_path. Maybe that's causing the lat_lon_driver.py(run_diag:154) >> Can not process reference data, analyse test data only error and thus the bad plots.

@forsyth2
Collaborator Author

In the latest run, for [e3sm_diags] > [[ land_monthly_180x360_aave ]], I had the following set:

reference_data_path = "/lcrc/soft/climate/e3sm_diags_data/obs_for_e3sm_diags/climatology"
test_data_path = ""

@chengzhuzhang Should I set these differently to avoid the bad plots above? (only test showing up and even then all in one color)

@chengzhuzhang
Collaborator

> In the latest run, for [e3sm_diags] > [[ land_monthly_180x360_aave ]], I had the following set:
>
> reference_data_path = "/lcrc/soft/climate/e3sm_diags_data/obs_for_e3sm_diags/climatology"
> test_data_path = ""
>
> @chengzhuzhang Should I set these differently to avoid the bad plots above? (only test showing up and even then all in one color)

For testing purposes, for a model vs model run, I think you could point the test and ref paths to the same model data directory.

@forsyth2
Collaborator Author

> For testing purposes, for a model vs model run, I think you could point the test and ref paths to the same model data directory.

@chengzhuzhang Well we already know model-vs-model works. It's model-vs-obs that's failing. Are you saying just use the same data for both, but don't set run_type = "model_vs_model"?

@chengzhuzhang
Collaborator

Oh, I didn't realize it's the model vs obs run that has problems. I think the metrics look right, but the contour level range is too small. Somehow, e3sm_diags is using the contour levels set for a different variable:
https://github.com/E3SM-Project/e3sm_diags/blob/main/e3sm_diags/driver/default_diags/lat_lon_land_model_vs_obs.cfg

QRUNOFF is the only model vs obs variable that is supported in lat_lon_land for now. We should file an issue in e3sm_diags. Also, I think the land group will mostly rely on ILAMB for model vs obs. Let's prioritize having model vs model working in zppy for now.
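
One quick way to confirm that contour levels meant for a different variable are in use is to measure how much of the plotted data falls outside the level range. The following is a minimal sketch under that assumption; `fraction_outside_levels` is a hypothetical helper, not part of e3sm_diags.

```python
import math


def fraction_outside_levels(values, levels):
    """Fraction of non-NaN values outside [min(levels), max(levels)]."""
    finite = [v for v in values if not math.isnan(v)]
    if not finite:
        return 0.0
    lo, hi = min(levels), max(levels)
    outside = sum(1 for v in finite if v < lo or v > hi)
    return outside / len(finite)
```

A result close to 1.0 means nearly every value saturates the colorbar, which would be consistent with plots that appear in a single color.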

@forsyth2
Collaborator Author

> Let's prioritize having model vs model working in zppy for now.

In that case, that is in fact working. Should we mark this done then? I'll file an issue in E3SM Diags for model-vs-obs.

@chengzhuzhang
Collaborator

> > Let's prioritize having model vs model working in zppy for now.
>
> In that case, that is in fact working. Should we mark this done then? I'll file an issue in E3SM Diags for model-vs-obs.

Sounds good. Could you provide the configuration file to set the model vs model run up? I will test during code review. Thank you!

@forsyth2
Collaborator Author

@chengzhuzhang Thanks, I'll clean up this PR and run the full test suite, then tag you for final code review. (If you want an early start, you can run the tests/integration/generated/test_complete_run_chrysalis.cfg in this PR).

@chengzhuzhang
Collaborator

@forsyth2 sounds good, I will review once the PR is ready.

@forsyth2
Collaborator Author

I have all tests passing except the complete_run test. I'm getting a few weird errors there.

Notably:
(1) Taylor diagrams are missing variables (Actual, Expected):
Actual
Expected

(2) FLDS has a wildly different plot (Actual, Expected):
Actual
Expected

It occurs to me I haven't run the "c. test final Unified" test-process from https://e3sm-project.github.io/zppy/_build/html/main/dev_guide/release_testing.html to test zppy in the actually released E3SM Unified 1.9.2. I'm going to run that and check if these issues show up there.

@forsyth2
Collaborator Author

> I'm going to run that and check if these issues show up there.

Strangely, the complete_run test passes using Unified 1.9.2, but the bundles test fails there: see #543. (Note I had diags_environment_commands set to use E3SM Diags 2.10.1).

@@ -116,6 +116,7 @@ years = "1850:1854:2", "1850:1854:4",
ref_years = "1850-1851",
reference_data_path = "/lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/v2.LR.historical_0201/post/atm/180x360_aave/clim"
run_type = "model_vs_model"
sets = "lat_lon","lat_lon_land","zonal_mean_xy","zonal_mean_2d","polar","cosp_histogram","meridional_mean_2d","enso_diags","qbo","diurnal_cycle","annual_cycle_zonal_mean","streamflow", "zonal_mean_2d_stratosphere",
Collaborator

@forsyth2 hi, I'm skimming this PR to try to figure out the problem. It sounds like only one e3sm_diags viewer is created, including both lat_lon and lat_lon_land? I had a comment earlier: we should have a separate task, i.e. [[ lnd_monthly_180x360_aave_mvm ]], for the "lat_lon_land" set because it is more straightforward; for instance, this set depends on the land data paths. (In the implementation here, I don't think the land data are used.)

Collaborator Author

Oh, at the time, I thought that advice was just for model-vs-obs, not model-vs-model since that seemed to be working. So yes, that's a good idea here too. I'll separate them and see if anything changes in the plots. Thanks.

@chengzhuzhang
Collaborator

chengzhuzhang commented Jan 30, 2024

I can't explain the contour level range difference in #534 (comment). But based on my comment here, I don't think the land data are correctly fed into the e3sm_diags lat_lon_land set for model vs model.

@forsyth2
Collaborator Author

forsyth2 commented Feb 2, 2024

> I don't think the land data are correctly fed into the e3sm_diags lat_lon_land set for model vs model.

I separated the lat_lon_land set into its own run (everything is identical except the sets being run):

  [[ land_monthly_180x360_aave_mvm ]]
  # Test model-vs-model using the same files as the reference
  climo_diurnal_frequency = "diurnal_8xdaily"
  climo_diurnal_subsection = "atm_monthly_diurnal_8xdaily_180x360_aave"
  climo_subsection = "atm_monthly_180x360_aave"
  diff_title = "Difference"
  partition = "compute"
  qos = "regular"
  ref_final_yr = 1851
  ref_name = "v2.LR.historical_0201"
  ref_start_yr = 1850
  ref_years = "1850-1851",
  reference_data_path = "/lcrc/group/e3sm/ac.forsyth2/zppy_test_complete_run_output/v2.LR.historical_0201/post/atm/180x360_aave/clim"
  run_type = "model_vs_model"
  sets = "lat_lon_land",
  short_ref_name = "v2.LR.historical_0201"
  swap_test_ref = False
  tag = "model_vs_model"
  ts_num_years_ref = 2
  ts_subsection = "atm_monthly_180x360_aave"
  walltime = "5:00:00"
  years = "1852-1853",

After looking at the generated viewers, I do suspect that the test errors (differences in FLDS plot and the Taylor Diagrams) were due to the test comparing the actual lat_lon_land output with the expected lat_lon output.

The lat_lon_land colormaps do seem off. The FLDS test/ref plots are all white and the FSDS test/ref plots are all red.

I'm going to try running with:

  climo_subsection = "land_monthly_180x360_aave"
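
Since the reviewer's concern is that the land data never reach the lat_lon_land set, it may help to list which settings of such a task still point at atmosphere output. The sketch below is a heuristic under that assumption; `atm_references` is a hypothetical helper, and the "atm_" prefix and "/atm/" path segment are taken from the naming used in this configuration, not from any zppy rule.

```python
def atm_references(task):
    """Return sorted keys of a lat_lon_land task whose values mention atm.

    `task` maps option names to values; `sets` is a sequence of set names.
    Only string values are inspected.
    """
    if "lat_lon_land" not in task.get("sets", ()):
        return []
    return sorted(
        key
        for key, value in task.items()
        if isinstance(value, str)
        and (value.startswith("atm_") or "/atm/" in value)
    )
```

Applied to the [[ land_monthly_180x360_aave_mvm ]] settings above, this heuristic would flag climo_diurnal_subsection, climo_subsection, reference_data_path, and ts_subsection. Whether each of those should point at land output is a question for the reviewers, not something the check can decide.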

@forsyth2
Collaborator Author

forsyth2 commented Feb 2, 2024

It turns out I get the same results using climo_subsection = "land_monthly_180x360_aave".

@chengzhuzhang
Collaborator

chengzhuzhang commented Feb 8, 2024

> It turns out I get the same results using climo_subsection = "land_monthly_180x360_aave".

@forsyth2 I think the first step is to make sure the regridded land data is being read by e3sm_diags. If the land data is read in correctly, the ocean should be masked on the plots.
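
The masking check suggested here can be approximated numerically: a regridded land field should have missing values over the ocean, so a large share of its cells should be missing. The following is a minimal sketch; `missing_fraction` is a hypothetical helper, and it assumes missing cells have already been converted to None or NaN (how fill values appear in practice depends on the reader).

```python
import math


def missing_fraction(grid):
    """Fraction of cells in a 2D list (rows of values) that are None or NaN."""
    cells = [value for row in grid for value in row]
    if not cells:
        return 0.0
    missing = sum(
        1
        for value in cells
        if value is None or (isinstance(value, float) and math.isnan(value))
    )
    return missing / len(cells)
```

A fraction near zero for FLDS or FSDS would suggest that the field being plotted is defined everywhere, that is, that atmosphere data rather than land data is being read.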

@forsyth2
Collaborator Author

forsyth2 commented Mar 1, 2024

Replaced by #548.

@forsyth2 forsyth2 closed this Mar 1, 2024
Development

Successfully merging this pull request may close these issues.

[Feature]: Add new E3SM Diags set -- lat_lon_land
2 participants