atm_in not always correct #1539

Closed
mfdeakin-sandia opened this issue May 15, 2017 · 13 comments

@mfdeakin-sandia
Contributor

From @PeterCaldwell
Paraphrased notes from the critical path meeting:
With certain parameter settings the model isn't running. Settings put in user_nl_cam, which is where they should be picked up from, are not getting into atm_in.
Checked in a couple of places; the error was definitely visible right before the submission.
Ran run_acme on Edison and saw something wrong with the model.
Reran run_acme and saw that atm_in was messed up when old_executable was true.
Reran the whole thing, checking each of the steps; the second time it was built, the correct values were in atm_in.

Hypothesis: Using old_executable=true messed this up

@PeterCaldwell
Contributor

Thanks Michael -

To make this problem more tractable, I'd suggest:

  1. Use run_acme to configure and build the model (you can set submit_run=false because the problem arises before the model runs). Use this particular instance of run_acme (https://github.com/ACME-Climate/SimulationScripts/blob/31a3a49b13b60e4b85a3ff817e59b503889768d5/archive/A_WCYCL1850/ne30_oEC/run_acme.20170512.beta1_05-v0.4atm-60Locn.ne30_oECv3_ICG.edison) because it adds a lot of changes to user_nl_cam, and it's the case in which I found the problem. Then grep the run_acme script for user_nl_cam and check that all the settings in user_nl_cam are also found in atm_in in the run directory.
  2. If atm_in looks fine, set old_executable to true and run run_acme again.
  3. If atm_in still looks fine, repeat steps 1 and 2 twice as a check for a non-reproducible problem. Try whatever hypotheses come to mind.
    If none of these things reproduces inconsistent atm_in values, report to me that you can't reproduce the problem and consider yourself off the hook.
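
The check in step 1 (every setting in user_nl_cam should also appear in atm_in) could be scripted along these lines. This is only a minimal sketch: the function names are hypothetical, it assumes one `name = value` assignment per line with no `!` inside values, and it compares values as text, so a value that gets reformatted when the namelist is built would show up as a false mismatch.

```python
def normalize(value):
    """Remove all whitespace and any trailing comma so values compare as text."""
    return "".join(value.split()).rstrip(",")


def parse_assignments(text):
    """Return {name: value} for single-line 'name = value' namelist entries.

    Assumption: no multi-line values and no '!' characters inside values.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split("!", 1)[0].strip()  # drop Fortran-style comments
        # Skip blanks, group headers (&group), group terminators (/), and
        # anything that is not an assignment.
        if not line or line.startswith("&") or line == "/" or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip().lower()] = normalize(value)
    return settings


def missing_or_different(user_nl_text, atm_in_text):
    """Map each user_nl setting that atm_in lacks or contradicts to
    (requested value, value found in atm_in or None)."""
    wanted = parse_assignments(user_nl_text)
    actual = parse_assignments(atm_in_text)
    return {
        name: (value, actual.get(name))
        for name, value in wanted.items()
        if actual.get(name) != value
    }
```

An empty result means every user_nl_cam setting was found in atm_in with the same text value. The contents of both files are passed in as strings, so where they are read from depends on the case layout.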

@PeterCaldwell
Contributor

@golaz tracked this down. Apparently atm_in is created by case.setup, but the information from user_nl_cam which should populate it isn't available until later in run_acme and hence you'll see bad values if you look at atm_in just after case.setup is run. From a practical standpoint this isn't a real problem because atm_in is overwritten by build-namelist, which is called during the build and the case.run scripts. But it is unsettling and relies on build-namelist being called over and over again, which is redundant. It would be nice to move the user_nl_cam specification earlier in the run_acme script so case.setup picks it up... but perhaps case.setup would just overwrite user_nl_cam at that point? I'm not sure. @cameronsmith1 - have you thought about this issue?
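
Because the atm_in written at case.setup time can predate the user_nl_cam edits, one cheap hint that atm_in should not be trusted yet is a timestamp comparison. The sketch below is an illustration only: the function name is hypothetical, the caller supplies both paths, and it assumes file modification times on the filesystem are reliable.

```python
import os


def atm_in_is_stale(atm_in_path, user_nl_path):
    """Return True if atm_in is missing or older than user_nl_cam.

    A True result only suggests that atm_in was generated before the
    latest user_nl_cam edits; it does not prove the values differ.
    """
    if not os.path.exists(atm_in_path):
        return True
    return os.path.getmtime(user_nl_path) > os.path.getmtime(atm_in_path)
```

A False result does not guarantee consistency either, since it says nothing about other inputs that feed the namelist.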

@cameronsmith1
Contributor

cameronsmith1 commented May 16, 2017

Hi @PeterCaldwell, you are correct. The only way to be sure you are seeing the actual atm_in is to wait until the job has started running. You are also correct that if you set user_nl too soon in run_acme, it will get overwritten. Sometimes you also want to wait until other variables or configurations are set, so that you can do the appropriate thing in the user_nl file.

NOTE: this is not a bug in run_acme, rather it is due to features in ACME/CIME.

However, this was all set in place in run_acme before CIME, so the timing of operations may have changed.

@golaz
Contributor

golaz commented May 16, 2017

It would be interesting to hear from the CIME team whether there would be a better place in 'run_acme' for specifying user_nl that would avoid this unfortunate situation.

@cameronsmith1 : note that you can run 'preview_namelists' to build atm_in before running the model.
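
For a scripted workflow, that regeneration step could be wrapped as follows. This is a sketch under assumptions: it presumes `preview_namelists` is an executable script located in the case directory and takes no required arguments, and the wrapper name `regenerate_namelists` is hypothetical.

```python
import subprocess
from pathlib import Path


def regenerate_namelists(case_dir):
    """Run the case's preview_namelists script from inside the case directory.

    Assumption: the script is an executable file directly under case_dir.
    Raises FileNotFoundError if it is absent, and
    subprocess.CalledProcessError if it exits with a non-zero status.
    """
    script = Path(case_dir) / "preview_namelists"
    if not script.is_file():
        raise FileNotFoundError(f"no preview_namelists script in {case_dir}")
    subprocess.run([str(script.resolve())], cwd=case_dir, check=True)
```

Running it after the last user_nl_cam edit, and only then inspecting atm_in, keeps the order of operations explicit in the script.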

@cameronsmith1
Contributor

Hi @golaz, as I recall, ACME/CIME runs preview_namelists at various times. In fact, that is part of the problem, since a namelist gets generated before all of the changes are implemented.

The automatic generation at multiple steps makes sense if people are manually editing the namelist (via user_nl), but it causes the problem identified in this thread when done as part of a script.

If there is a better way with CIME, then that would be great.

@cameronsmith1 changed the title from "atm in not always correct" to "atm_in not always correct" on May 17, 2017
jgfouca pushed a commit that referenced this issue Jun 2, 2017
Add some maps to config_grids.xml
Added the f19_g17 maps, as well as the ww3 <-> tx1v1 maps; all files are in inputdata as well.

Test suite: none, just some stand-alone checks
Test baseline: N/A
Test namelist changes: N/A
Test status: bit for bit, plus resolutions that didn't run previously now run

Fixes #1539

User interface changes?: N/A

Code review: @dabail10 can you please look at this? (And maybe see about getting added to ESMCI so I can list you as an official reviewer?)
@cameronsmith1
Contributor

Should this issue be closed?

@mfdeakin-sandia
Contributor Author

It's not clear to me how to resolve this issue in a way which supports how everyone is/might be using CIME, so probably? I unfortunately don't know enough about how and when the namelists are processed.

@cameronsmith1
Contributor

I also do not know how to fix this in a way that is clear for all. An argument could be made that CIME generating atm_in at each CIME step now causes more problems than it solves.

In any event, this is a 'confusing CIME feature' and not a 'run_acme bug'. Hence, I propose that we close this issue. I then think it would be good to raise this issue with CIME, although it is very low on my current to-do list.

@PeterCaldwell
Contributor

Yup, I agree and I'm resigned to not looking at input files until the model is actually running. I think we should close this issue. I do still think it is bad behavior though - just today we were almost misled by this "feature" in the course of debugging.

@cameronsmith1
Contributor

This issue has been catching people for over a decade, and I am sure it will continue to catch people. Do you feel inclined to push this issue on the CIME site?

@PeterCaldwell
Contributor

No, it's not the most important battle for me to fight right now. Let's just close this.

@rljacob
Member

rljacob commented Jul 20, 2017

This is somewhat covered in this CIME issue: ESMCI/cime#1278

@cameronsmith1
Contributor

FYI, I commented on the CIME issue (ESMCI/cime#1278), and it seems the situation is more messy and complicated than I realized, because some components use interim versions of the ACME namelists when building themselves.
