-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rt hires #1471
Rt hires #1471
Conversation
I think we can remove the following directories from the new input-data: FV3_input_data_gsd/FV3_input_data_C96_with_aerosols/ most of the files from AQM/NEXUS - confirmed by @BrianCurtis-NOAA only NEXUS_Expt.nc is needed |
I an remove those directories from my input-data-20221101 directory on Hera and run full tests. |
I removed those directories and ran tests on Hera. Tests PASSED. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/rt.sh file permissions has been changed from 755 to 644. Please revert.
I guess it happens when you solve conflict using github's web based tools. |
@RatkoVasic-NOAA I may start rsync if input-data-20221101 is ready on hera. |
@jkbk2004 I see two new directories in input-data-20220414 : |
@jkbk2004 |
@RatkoVasic-NOAA Yes, those are added as a part of external land component PR. |
With the above change in fv3_slurm.IN_jet the regional_2threads run finished in ~260sec. |
I pushed the change to Ratko's branch. Please rerun the test on Jet. |
@DusanJovic-NOAA Great! do you think --npernode will same thing for cheyenne's mpiexec_mpt? let me try. |
@DusanJovic-NOAA is this something we should be doing on all machines we run that test on? |
We can test this change on other machines that are using slurm scheduler. |
I added |
I am wondering one issue about the setup of TASKS/INPES/JNPES of regional_control/regional_2thread. On cheyenne, regional_control job_card sets |
I don't know who set up cheyenne parameters originally. Most of us here don't have access to that machine. |
I agree to turn off regional_2thread on cheyenne and move on for merging. I am going to create an issue about this. If all agrees, then probably we can start merging tomorrow morning. @DusanJovic-NOAA @BrianCurtis-NOAA what do you think? |
The number of processes (-np) is calculated by multiplying number of nodes with number_of_tasks_per_node. So for regional_control that is 4*36 = 144, for regional_2threads it is 7*18 = 126. wcoss2, is also using PBS. Try to set some of the pbs options specified there, specifically 'place=vscatter' and mpiexec options '-ppn @[TPN] -depth @[THRD]'. I do not know if mpiexec_mpt ob cheyenne is the same as mpiexec on wcoss. I also do not have access to cheyenne. |
Let me give it a go on Cheyenne before we write off the RT on Cheyenne. |
|
@jkbk2004 Please try those changes with a few tests and re-run the regional_2threads as well for the logs. |
sure! |
With @BrianCurtis-NOAA contribution of omplace, regional_2threads passes ok now. @DusanJovic-NOAA @RatkoVasic-NOAA @ChunxiZhang-NOAA this pr is ready. |
PR Checklist
This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.
This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR
An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes in contained in the PR. The Issue and the dependent sub-component PR
are specified below.
Issue Replacing regional regression tests with more realistic ones #1461
Results for one or more of the regression tests change and the reasons for the changes are understood and explained below.
New tests make new results.
Should create new develop-2022XXXX directory
New input data is required and created on Hera machine. Should be resynched to other platforms.
Instructions: All subsequent sections of text should be filled in as appropriate.
The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsibility to keep the PR up-to-date with the develop branch of ufs-weather-model.
Description
Replaced old regional regression tests with new, using latest configuration from LAMDX parallel. It is in 3km resolution over small domain.
@DusanJovic-NOAA removed variable "calendar" from namelists (model_configure*.IN) and replaced calendar variable values in input directory in files coupler.res from 2 --> 3
(Calendar: no_calendar=0, thirty_day_months=1, julian=2, gregorian=3, noleap=4)
Issue(s) addressed
Testing
How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)
Dependencies
No dependences