Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move diag_table creation to fcst run script #349

Conversation

christinaholtNOAA
Copy link
Contributor

DESCRIPTION OF CHANGES:

The diag_table file is a cycle-dependent file that should be at forecast run-time, similar to the model_configure file corrected in #256. This change moves the creation of that file to the forecast runscript, and removes other instances of this file's creation in both the make_grid run script and the generate_FV3LAM_wflow script.

Preparing any working directories beforehand will cause problems for real-time runs, as well as larger retro runs where disk management becomes a necessary task.

In addition, the modulefiles have changed names (I assume) in UFS_UTILS and I could not run the test without the addition of the "intel" suffix to those.

TESTS CONDUCTED:

I successfully tested on Jet with the community configure file.

Copy link
Collaborator

@JeffBeck-NOAA JeffBeck-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @christinaholtNOAA, I reviewed you PR this evening and things look good. I did have one question:

I understand why you'd not want to create your cycle directory before it's activated in Rocoto for real-time runs, especially without a defined end time; however, I wouldn't expect the diag_table to change per cycle, unless someone wanted to output different fields for different cycles. Is the purpose of this PR just to limit creation of the cycle directory until run time by moving the diag_table creation to the run_fcst task? That would make sense to me.

Ultimately, it may be possible to create a single diag_table (copied to the root run directory), used by all cycles. The only reason we don't do that already is based on the header in the diag_table that is cycle-specific, but we aren't even sure that info is used by FV3.

Regarding the module files, we've recently changed the build system in the authoritative ufs-srweather-app for the release and now manually source a README (module-like) file that is used both at compilation and at run-time for all utilities. Therefore, a recent PR has removed sourcing of all external repository module files, and we instead just source the README file and then "module load" the .local file that typically contains the required conda information. So, with those changes, it looks like there's a minor conflict in the generate script for your PR. Once you update that piece, I think we're ready to merge.

Thanks for your work on this!

@christinaholtNOAA
Copy link
Contributor Author

@JeffBeck-NOAA The header in its current form is cycle-dependent. You're right, nothing else changes per cycle to my knowledge. As a general rule, making cycle-dependent files at run time for the task that needs it is a useful separation of concerns.

I can remove the changes for the modulefile sourcing, but at the time I was doing this work a week ago, it didn't work without the changes.

@christinaholtNOAA
Copy link
Contributor Author

In terms of resolving the conflicts with ush/generate_FV3LAM_wflow.sh, I see that the lines I modified were commented out instead of deleted. Since we use version control, it's quite messy to leave commented code around. You can always retrieve it from git history. I would like to remove it.

@JeffBeck-NOAA
Copy link
Collaborator

JeffBeck-NOAA commented Nov 12, 2020

Yes, the previous code to copy and source the module files was dispersed throughout multiple routines. In order to facilitate testing and ease of reverting to anything previously, they were commented out temporarily while we made sure the new build/sourcing of the README files in ufs-srweather-app worked as expected. I believe we're to a point where the new method has been sufficiently vetted, so we can remove the commented lines. There is an issue specifically to do that here.

@christinaholtNOAA
Copy link
Contributor Author

I am unable to successfully resolve conflicts and test because there is a mismatch between ufs-srweather-app:master and regional_workflow:develop. I opened an Issue #57 in ufs-srweather-app to address that problem.

@christinaholtNOAA
Copy link
Contributor Author

Apologies for the severe delays here. I updated this branch to the top of develop and re-ran a few of the end-to-end tests on Hera.

@@ -621,24 +621,6 @@ done
#
#-----------------------------------------------------------------------
#
# Call a function to generate the array ALL_CDATES containing the cycle
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leave this script as is? I like having an array handy (ALL_CDATES) that contains all the cycles over which the workflow will run. Maybe even print it out for informational purposes. Otherwise everything looks good, so I'll approve.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsketefian It's fine to leave it in for experiments that are only a couple of cycles, but when using the configuration layer for real time purposes where common practice is to set the last cycle to an arbitrarily long time in the future it becomes more cumbersome than necessary, printing out an array of 1800+ dates for a real time run that defines 5 cycles per day for a year (example from current RRFS_dev* configs). Can you not get this information from the Rocoto database, where it would be more accurate, anyway?

If there are no dependencies on needing to know all the cycles, there should be no need for the var_defns.sh file to know this huge array.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA, @gsketefian, since this array may be useful in the future, let's create an if statement here that only outputs the start and end cycles above a given threshold (for now, 30). I've tested this with Christina's fork and it works as expected.

set_cycle_dates
date_start="${DATE_FIRST_CYCL}"
date_end="${DATE_LAST_CYCL}"
cycle_hrs="${CYCL_HRS_str}"
output_varname_all_cdates="ALL_CDATES"

NUM_CYCLES="${#ALL_CDATES[@]}"

if [ $NUM_CYCLES -gt 30 ] ; then
unset ALL_CDATES
print_info_msg "$VERBOSE" "
Too many cycles in ALL_CDATES, defining in abbreviated form."
ALL_CDATES="${DATE_FIRST_CYCL}${CYCL_HRS[0]}...${DATE_LAST_CYCL}${CYCL_HRS[-1]}"
fi

@@ -2662,18 +2644,6 @@ LBC_SPEC_FCST_HRS=(${LBC_SPEC_FCST_HRS[@]})
#
#-----------------------------------------------------------------------
#
# The number of cycles for which to make forecasts and the list of starting
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above (please leave in the script for now). Thx!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code would stay unchanged with the added if statement mentioned above.

@jwolff-ncar
Copy link
Contributor

What is the status of this PR? It looks like it was approved but never merged.

@JeffBeck-NOAA
Copy link
Collaborator

JeffBeck-NOAA commented Apr 13, 2021 via email

@christinaholtNOAA
Copy link
Contributor Author

I thought I was waiting back on the comment "If there are no dependencies on needing to know all the cycles, there should be no need for the var_defns.sh file to know this huge array."

Having a partial set of all the cycles in an array is false information, and having all cycles in the file (especially real time and large retro runs) is too much. This information is not currently needed or used, to my knowledge, and should not be included in this file. If there comes a time that the information is needed in the future a different mechanism to get it directly from rocoto XML should be developed to deliver accurate information.

@JeffBeck-NOAA
Copy link
Collaborator

I thought I was waiting back on the comment "If there are no dependencies on needing to know all the cycles, there should be no need for the var_defns.sh file to know this huge array."

Having a partial set of all the cycles in an array is false information, and having all cycles in the file (especially real time and large retro runs) is too much. This information is not currently needed or used, to my knowledge, and should not be included in this file. If there comes a time that the information is needed in the future a different mechanism to get it directly from rocoto XML should be developed to deliver accurate information.

@christinaholtNOAA, Did you see my reply on Feb 12? There was a comprise option, where only the first and last cycle were defined when an experiment exceeded a specific NUM_CYCLES. This would eliminate large arrays from being stored in var_defns.sh. To avoid a potentially ambiguous "first cycle...end cycle" ALL_CDATES array, we can also implement the following to eliminate it altogether when NUM_CYCLES is too large:

if [ $NUM_CYCLES -gt 30 ] ; then
print_info_msg "$VERBOSE" "
Too many cycles in ALL_CDATES, unsetting variable."
unset ALL_CDATES
fi

This option preserves NUM_CYCLES, which is half of what @gsketefian was asking. We could easily reconstruct ALL_CDATES for other purposes in the future. There's no reason to grab this information from Rocoto when it already exists in the shell scripts.

@christinaholtNOAA
Copy link
Contributor Author

I don't think it makes sense to sometimes have the availability of a variable, and sometimes not, especially when it isn't currently used. If you build in a requirement to use that variable and it doesn't show up for large numbers of cycles, then what is the point. In real time (only one of the uses of this set of scripts, I understand), the contents is still a lie since cycles get activated only when the actual time rolls around.

For example, this variable might have 365 cycles in it if we configure to "run every day in 2021" for real time runs. If we configure and deploy on Feb 1, there really are only going to be 334 active cycles.

I can't think of any relevant reason to hold onto this information. Holding onto variables that could be useful in the future is bad practice and leaves a lot of unused, untested, unneeded code that is misleading at best, and buggy at worst.

@JeffBeck-NOAA
Copy link
Collaborator

JeffBeck-NOAA commented Apr 13, 2021

@christinaholtNOAA I understand your concerns. I don't think it's a lie, since it's defined as the number of cycles and dates for a given experiment that were set by the user at the generation step, which is separate from what has actually run. Also, if the XML is modified after the workflow generation step, then anything defined in the config.sh/var_defns.sh scripts may no longer be valid.

Anyway, @gsketefian, since we have git history of these two variables, are you comfortable with their removal in this PR, knowing that in the future we can access and bring the code back for a specific implementation?

@christinaholtNOAA
Copy link
Contributor Author

Is there any update on this PR?

@gsketefian
Copy link
Collaborator

I find these two variables occasionally useful and would like to keep them, but I understand they are too long for real time runs. I think a good compromise would be to introduce a new workflow variable, say REALTIME, that is TRUE/FALSE. If TRUE, in setup.sh don't record NUM_CYCLES and ALL_CDATES into var_defns.sh; otherwise, record them. We will very likely have more situations like this in which we want the workflow to behave differently depending on if it's being used for research or real-time runs, so it would be handy to introduce this REALTIME variable now.

@christinaholtNOAA
Copy link
Contributor Author

@gsketefian The REALTIME flag really should be introduced when it actually does what you think it does. That's quite a large lift and way outside the scope of this PR.

@christinaholtNOAA
Copy link
Contributor Author

@gsketefian If you'd like this feature in when the full "REALTIME" flag is available and actually gives you that functionality, can you not add it back from history at that point, as @JeffBeck-NOAA suggests?

As I've mentioned, there are more reliable, accurate ways to get this list of cycles from Rocoto than to have it print out in a text file when the information is not being used by the workflow itself. It is not wise to leave a potentially huge, potentially partial (and therefore inaccurate), unused variable in a config file for personal use when the caveats for its use are undocumented.

@gsketefian
Copy link
Collaborator

As I said, I use ALL_CDATES now, especially during debugging of ensemble runs to see what cycles I've specified and to make sure they correspond to the ones that show up in the rocoto database (for the retro cases I use for testing, these two sets of cycles should be the same).

I think it is ok to use the REALTIME variable as long as we specify in the documentation that the feature is a work in progress. If you think it may mislead users, you can use another variable name.

@christinaholtNOAA
Copy link
Contributor Author

I think we're at an impasse and may need input from other reviewers... @jwolff-ncar @JeffBeck-NOAA @JulieSchramm @mkavulich @BenjaminBlake-NOAA

I do not feel comfortable with the solution to add a REALTIME flag to save this variable for personal debugging use. A REALTIME flag that only does this one thing is premature with this PR and is confusing given that it does nothing other than turn off output in the config that isn't used by the workflow.

This is also a problem (potentially very long array of cycles) for longer retro runs, which don't fall under the "realtime" case at all.

Again, ALL_CDATES is not used by the workflow after this modification, and can be easily obtained from the rocoto database.

@gsketefian
Copy link
Collaborator

As I mentioned, if you don't feel comfortable with REALTIME, use another name for the variable, like RECORD_ALL_CDATES or something better you can think of. Just set it to FALSE for all your runs. Easy compromise solution.

@christinaholtNOAA
Copy link
Contributor Author

I am going to try to get a bit more input from others on the #workflow-team Slack channel, since we don't seem to moving anywhere here.

I also wanted to mention, since I didn't previously, a few other specific reasons that I would like to do the cleanup of variables this PR proposed. Removing the now unused workflow configuration variables helps the workflow be more consistent with the following software engineering principles...

-- Duplication of information (Rocoto already tracks this)
-- Separation of concerns (Rocoto is the only place where this should be tracked, if that is the layer that is tasked with tracking it)

@jwolff-ncar
Copy link
Contributor

Thank you for your work on this PR, @christinaholtNOAA. I have not used the ALL_CDATES array so I do not have a strong feeling that it needs to be kept around. It does seem that we are at an impasse here with how to move forward. Getting input from others is fine with me. I think keeping software best practices in mind makes the most sense and is an important consideration here.

@JeffBeck-NOAA
Copy link
Collaborator

@BenjaminBlake-NOAA, @RatkoVasic-NOAA, @chan-hoo, and @llpcarson, would you be able to review this PR and the related conversation to provide your thoughts regarding the proposed solutions? Thank you!

@chan-hoo
Copy link
Collaborator

I understand both sides. @christinaholtNOAA's approach might be ideal to make the script simple and light. However, I think, the point is that it is still quite useful to @gsketefian who is the no.1 contributor of the regional workflow with 554 commits. I don't see any reasons to remove it at this moment if it is useful to anyone. An array of 1800+ dates should not be the case for the majority of users. So I agree with @JeffBeck-NOAA's compromise plan using if-statements.

@BenjaminBlake-NOAA
Copy link
Collaborator

After reading through the conversation and giving it some thought, I also think @JeffBeck-NOAA 's suggestion of using if statements makes sense and is a good compromise. But I understand @christinaholtNOAA 's concerns about a variable sometimes being defined and sometimes not, which could potentially be confusing.

@JeffBeck-NOAA
Copy link
Collaborator

Thank you @jwolff, @BenjaminBlake-NOAA, and @chan-hoo for your review of this PR! Based on these reviews, there is agreement that concerns are warranted related to the use of NUM_CYCLES and ALL_CDATES; however, it looks like a consensus to go with the compromise approach mentioned here was found. Adding this code will avoid undefined/defined variables and provide values of ALL_CDATES that will be abbreviated for experiments with more than 30 cycles defined during workflow generation. @christinaholtNOAA and @gsketefian, I understand this is not what either of you desired, but my hope is that it will resolve concerns about excessively long variable definitions in real-time configurations (Christina's request), yet also retain the variables for debugging (Gerard's request).

@JeffBeck-NOAA
Copy link
Collaborator

Superseded by PR #533.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants