
UFS-dev PR#16 #76

Merged
merged 30 commits into from
Dec 13, 2022
Conversation

grantfirl
Collaborator

Identical to ufs-community#1475
According to that PR,

All cpld_* RTs will change baselines due to changed variables (_aod550) in sfcf.nc files

ChunxiZhang-NOAA and others added 19 commits October 26, 2022 16:19
…y#1481)

* Improve radiative fluxes and cloud cover in FV3 for HR1

Co-authored-by: JONG KIM <jong.kim@noaa.gov>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
* Replace low resolution regression tests with high resolution.

* Rename ICs and result directory

* Remove "calendar" variable from namelists.

* Remove unused scripts: gfdlmp_run.IN & gsd_run.IN

* Add '--cpus-per-task=@[THRD]' to fv3_conf/fv3_slurm.IN_jet

* add omplace to cheyenne qsub script

Co-authored-by: JONG KIM <jong.kim@noaa.gov>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
Co-authored-by: Dusan Jovic <dusan.jovic@noaa.gov>
on-behalf-of @ufs-community <brian.curtis@noaa.gov>
… 550nm (ufs-community#1475)

* Adjust GFS diagnostic AOD output to the exact 550nm in ccpp/physics

* Modify a few lines of code in FV3/ccpp/physics/physics/radiation_aerosols.f to make them properly indented

Co-authored-by: JONG KIM <jong.kim@noaa.gov>
Co-authored-by: Brian Curtis <brian.curtis@noaa.gov>
@grantfirl
Collaborator Author

Includes changes from ufs-community#1471 too.

@dustinswales
Collaborator

Automated RT Failure Notification
Machine: cheyenne
Compiler: gnu
Job: RT
[RT] Repo location: /glade/scratch/epicufsrt/GMTB/ufs-weather-model/RT/auto_RT/Pull_Requests/1124949537/20221130150632/ufs-weather-model
[RT] Error: Test control 001 failed in run_test failed
[RT] Error: Test control_c48 003 failed in run_test failed
[RT] Error: Test control_p8 006 failed in run_test failed
[RT] Error: Test rap_control 007 failed in run_test failed
[RT] Error: Test rap_decomp 008 failed in run_test failed
[RT] Error: Test rap_2threads 009 failed in run_test failed
[RT] Error: Test rap_sfcdiff 011 failed in run_test failed
[RT] Error: Test rap_sfcdiff_decomp 012 failed in run_test failed
[RT] Error: Test hrrr_control 014 failed in run_test failed
[RT] Error: Test hrrr_control_2threads 015 failed in run_test failed
[RT] Error: Test hrrr_control_decomp 016 failed in run_test failed
[RT] Error: Test rrfs_v1beta 018 failed in run_test failed
[RT] Error: Test rrfs_conus13km_hrrr_warm 019 failed in run_test failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm 020 failed in run_test failed
[RT] Error: Test rrfs_conus13km_radar_tten_warm 021 failed in run_test failed
[RT] Error: Test rrfs_conus13km_radar_tten_warm_2threads 022 failed in run_test failed
[RT] Error: Test control_wam_debug 037 failed in run_test failed
[RT] Error: Test rap_control_dyn32_phy32 038 failed in run_test failed
[RT] Error: Test hrrr_control_dyn32_phy32 039 failed in run_test failed
[RT] Error: Test rap_2threads_dyn32_phy32 040 failed in run_test failed
[RT] Error: Test hrrr_control_2threads_dyn32_phy32 041 failed in run_test failed
[RT] Error: Test hrrr_control_decomp_dyn32_phy32 042 failed in run_test failed
[RT] Error: Test cpld_control_p8 049 failed in run_test failed
[RT] Error: Test cpld_control_nowave_noaero_p8 050 failed in run_test failed
[RT] Error: Test cpld_debug_p8 051 failed in run_test failed
[RT] Error: Test compile_003 failed in run_compile failed
[RT] Error: Test compile_006 failed in run_compile failed
[RT] Error: Test compile_007 failed in run_compile failed
[RT] Error: Test compile_008 failed in run_compile failed
[RT] Error: Test compile_012 failed in run_compile failed
Please make changes and add the following label back: cheyenne-gnu-RT

@grantfirl
Collaborator Author

I pulled down the latest develop branch from NOAA-EMC:fv3atm and merged in the commits from the appropriate PR merges in the histories. It didn't seem to change anything, though.

@dustinswales
Collaborator

Automated RT Failure Notification
Machine: hera
Compiler: gnu
Job: BL
[BL] Repo location: /scratch1/BMC/gmtb/RT/auto_RT/Pull_Requests/1124949537/20221212190015/ufs-weather-model
[BL] ERROR: Baseline location exists before creation:
/scratch1/BMC/gmtb/RT/NCAR/main-20221201/GNU
Please make changes and add the following label back: hera-gnu-BL

on-behalf-of @NCAR <dswales@ucar.edu>
@dustinswales
Collaborator

Automated RT Failure Notification
Machine: hera
Compiler: gnu
Job: BL
[BL] Repo location: /scratch1/BMC/gmtb/RT/auto_RT/Pull_Requests/1124949537/20221213001513/ufs-weather-model
[BL] Error: Test cpld_control_p8 034 failed in run_test failed
Please make changes and add the following label back: hera-gnu-BL

@dustinswales
Collaborator

@grantfirl It seems the Intel tests on Hera passed, but we're still stuck on GNU...

@grantfirl
Collaborator Author

It failed due to exceeding the wall clock limit. How do we increase the wall clock limit for a test? @dustinswales

@grantfirl
Collaborator Author

Or, we can cite ufs-community#1440 and go ahead and merge without it passing? I mean, this passed RT on the ufs-dev branch.

@grantfirl
Collaborator Author

@dustinswales Did you get the autoRT scripts to run on hecflow01 on Hera? If not, we could just run the test manually on that node and paste a log here in the comments.

@dustinswales
Collaborator

This is exactly what was happening last week, and again today. The test cpld_control_p8 isn't passing due to the wallclock limit, but weirdly, when I go into the run directory and run the RT manually (sbatch job_card), it runs in ~28 min, just within the wallclock limit of 30 min (/scratch1/BMC/gmtb/RT/stmp2/Dustin.Swales/FV3_RT/rt_28202/cpld_control_p8).

We can increase the time limit in ufs-weather-model/tests/fv3_conf/compile_slurm.IN_hera.
I'm fine with merging this with the wallclock time adjustment.

@dustinswales
Collaborator

@grantfirl Yes, the auto-RT scripts are running on Hera now.

@grantfirl
Collaborator Author

> This is exactly what was happening last week, and again today. The test cpld_control_p8 isn't passing due to the wallclock limit, but weirdly, when I go into the run directory and run the RT manually (sbatch job_card), it runs in ~28 min, just within the wallclock limit of 30 min (/scratch1/BMC/gmtb/RT/stmp2/Dustin.Swales/FV3_RT/rt_28202/cpld_control_p8).
>
> We can increase the time limit in ufs-weather-model/tests/fv3_conf/compile_slurm.IN_hera. I'm fine with merging this with the wallclock time adjustment.

Isn't that just for compiling? Shouldn't we be concerned with fv3_conf/fv3_slurm.IN_hera for the run wall clock, which is controlled by the WLCLK environment variable? I would think that, in the future, if tests fail due to exceeding the wall clock, we should just temporarily set the WLCLK environment variable in the appropriate test file (e.g. tests/tests/cpld_control_p8) while we run RTs. Otherwise, we would have to maintain a delta between the NCAR fork and ufs-dev, right?

@dustinswales
Collaborator

> > This is exactly what was happening last week, and again today. The test cpld_control_p8 isn't passing due to the wallclock limit, but weirdly, when I go into the run directory and run the RT manually (sbatch job_card), it runs in ~28 min, just within the wallclock limit of 30 min (/scratch1/BMC/gmtb/RT/stmp2/Dustin.Swales/FV3_RT/rt_28202/cpld_control_p8).
> > We can increase the time limit in ufs-weather-model/tests/fv3_conf/compile_slurm.IN_hera. I'm fine with merging this with the wallclock time adjustment.
>
> Isn't that just for compiling? Shouldn't we be concerned with fv3_conf/fv3_slurm.IN_hera for the run wall clock, which is controlled by the WLCLK environment variable? I would think that, in the future, if tests fail due to exceeding the wall clock, we should just temporarily set the WLCLK environment variable in the appropriate test file (e.g. tests/tests/cpld_control_p8) while we run RTs. Otherwise, we would have to maintain a delta between the NCAR fork and ufs-dev, right?

My mistake, I was looking at the compile file. Yes, the fv3_conf/fv3_slurm.IN_#### files.
Looking at other tests (e.g. tests/tests/cpld_bmark_p8_35d), it seems they do what you outlined.

The only issue I see is that, since these auto-RTs run on the code in a PR, this change of WLCLK will need to be included there (and reverted before merging, just like the submodule pointers). Correct?

@grantfirl
Collaborator Author

Yes. Let's go ahead and try that. I'll set WLCLK in the cpld_control_p8 test and push. Then, I'll add back the hera_gnu_BL label. Once successful, we still need to run the RTs against the new baselines before merging, right?
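For reference, the temporary change would look something like this in tests/tests/cpld_control_p8. This is a sketch only: the 40-minute value is a hypothetical choice, and the export-style syntax is assumed to match what other test files such as tests/tests/cpld_bmark_p8_35d do.

```shell
# tests/tests/cpld_control_p8 -- temporary change, to be reverted before merging,
# just like the submodule pointers.
# Raise the per-test run wallclock above the 30-minute default that
# cpld_control_p8 was exceeding; 40 minutes is a hypothetical value.
export WLCLK=40
```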

@dustinswales
Collaborator

> Yes. Let's go ahead and try that. I'll set WLCLK in the cpld_control_p8 test and push. Then, I'll add back the hera_gnu_BL label. Once successful, we still need to run the RTs against the new baselines before merging, right?

Sounds like a plan.
I'm not sure that we need to run the RTs after creating the baselines. I don't think EMC does. I believe they run the RTs, then create the new baselines, if necessary, and merge.

@grantfirl
Collaborator Author

> > Yes. Let's go ahead and try that. I'll set WLCLK in the cpld_control_p8 test and push. Then, I'll add back the hera_gnu_BL label. Once successful, we still need to run the RTs against the new baselines before merging, right?
>
> Sounds like a plan. I'm not sure that we need to run the RTs after creating the baselines. I don't think EMC does. I believe they run the RTs, then create the new baselines, if necessary, and merge.

OK, I think you're right. Let's see if this works!

on-behalf-of @NCAR <dswales@ucar.edu>
@dustinswales
Collaborator

@grantfirl Looks good!
Revert submodules and WLCLK, and merge

@grantfirl
Collaborator Author

> @grantfirl Looks good! Revert submodules and WLCLK, and merge

Yay! Could you approve the associated ccpp-physics and fv3atm PRs?

@dustinswales
Collaborator

Done.

@grantfirl grantfirl merged commit 0bef462 into NCAR:main Dec 13, 2022