Update JEDI hashes (20250203) #1475

Draft
wants to merge 10 commits into
base: develop

Conversation

RussTreadon-NOAA
Contributor

@RussTreadon-NOAA RussTreadon-NOAA commented Feb 2, 2025

Description

Update select JEDI hashes on 20250203

Companion PRs

none

Issues

Resolves #1474

Automated CI tests to run in Global Workflow

  • atm_jjob
  • C96C48_ufs_hybatmDA
  • C96C48_hybatmaerosnowDA
  • C48mx500_3DVarAOWCDA
  • C48mx500_hybAOWCDA
  • C96C48_hybatmDA

@RussTreadon-NOAA RussTreadon-NOAA self-assigned this Feb 2, 2025
@RussTreadon-NOAA
Contributor Author

This PR will be marked Ready for review once all GDASApp ctests have been run and pass on Hera, Hercules, and Orion and g-w CI passes on WCOSS2.

@RussTreadon-NOAA RussTreadon-NOAA added hera-GW-RT Queue for automated testing with global-workflow on Hera orion-GW-RT Queue for automated testing with global-workflow on Orion hercules-GW-RT Queue for automated testing with global-workflow on Hercules labels Feb 2, 2025
@emcbot emcbot added hercules-GW-RT-Running Automated testing with global-workflow running on Hercules orion-GW-RT-Running Automated testing with global-workflow running on Orion hera-GW-RT-Running Automated testing with global-workflow running on Hera and removed hercules-GW-RT Queue for automated testing with global-workflow on Hercules orion-GW-RT Queue for automated testing with global-workflow on Orion hera-GW-RT Queue for automated testing with global-workflow on Hera labels Feb 2, 2025
@emcbot

emcbot commented Feb 2, 2025

Automated GW-GDASApp Testing Results:
Machine: hercules

Start: Sun Feb  2 14:48:35 CST 2025 on hercules-login-1.hpc.msstate.edu
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Sun Feb  2 15:24:43 CST 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp
Tests:                                 *SUCCESS*
Tests: Completed at Sun Feb  2 16:24:09 CST 2025
Tests: 100% tests passed, 0 tests failed out of 135

@emcbot emcbot added hercules-GW-RT-Passed Automated testing with global-workflow successful on Hercules and removed hercules-GW-RT-Running Automated testing with global-workflow running on Hercules labels Feb 2, 2025
@emcbot

emcbot commented Feb 2, 2025

Automated GW-GDASApp Testing Results:
Machine: hera

Start: Sun Feb  2 20:56:24 UTC 2025 on hfe09
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Sun Feb  2 21:42:53 UTC 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp
Tests:                                 *SUCCESS*
Tests: Completed at Sun Feb  2 22:43:09 UTC 2025
Tests: 100% tests passed, 0 tests failed out of 135

@emcbot emcbot added hera-GW-RT-Passed Automated testing with global-workflow successful on Hera and removed hera-GW-RT-Running Automated testing with global-workflow running on Hera labels Feb 2, 2025
@emcbot

emcbot commented Feb 2, 2025

Automated GW-GDASApp Testing Results:
Machine: orion

Start: Sun Feb  2 02:51:15 PM CST 2025 on orion-login-1.hpc.msstate.edu
---------------------------------------------------
Build:                                 *SUCCESS*
Build: Completed at Sun Feb  2 03:55:49 PM CST 2025
---------------------------------------------------
Tests: ctest -j12 -R gdasapp
Tests:                                 *SUCCESS*
Tests: Completed at Sun Feb  2 05:38:19 PM CST 2025
Tests: 100% tests passed, 0 tests failed out of 135

@emcbot emcbot added orion-GW-RT-Passed Automated testing with global-workflow successful on Orion and removed orion-GW-RT-Running Automated testing with global-workflow running on Orion labels Feb 2, 2025
@RussTreadon-NOAA
Contributor Author

WCOSS2 g-w CI

Clone g-w develop at 380946c on Cactus. Update sorc/gdas.cd to feature/stable-nightly at e6eafb0.

All g-w components successfully build except GDASApp. The GDASApp build fails with

[100%] Linking CXX executable ../../bin/gdas_soca_error_covariance_toolbox.x
cd /lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/stable-nightly/sorc/gdas.cd/build/gdas/mains && /apps/ops/test/spack-stack-1.6.0-nco/envs/nco-intel-19.1.3.304/install/intel/19.1.3.304/cmake-3.23.1-chpcsen/bin/cmake -E remove /lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/stable-nightly/sorc/gdas.cd/build/bin/gdas_soca_error_covariance_toolbox.x
cd /lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/stable-nightly/sorc/gdas.cd/build/gdas/mains && /apps/ops/test/\
spack-stack-1.6.0-nco/envs/nco-intel-19.1.3.304/install/intel/19.1.3.304/cmake-3.23.1-chpcsen/bin/cmake -E cmake_link_script CMakeFiles/gdas_soca_error_covariance_toolbox.x.dir/link.txt --verbose=NO
/usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: ../../lib/libsaber.so: undefined reference to `atlas::grid::detail::partitioner::TransPartitioner::TransPartitioner()'
make[2]: *** [gdas/mains/CMakeFiles/gdas_soca_error_covariance_toolbox.x.dir/build.make:174: bin/gdas_soca_error_covariance_toolbox.x] Error 1
make[2]: Leaving directory '/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/stable-nightly/sorc/gdas.cd/build'
make[1]: *** [CMakeFiles/Makefile2:28951: gdas/mains/CMakeFiles/gdas_soca_error_covariance_toolbox.x.dir/all] Error 2
make[1]: Leaving directory '/lfs/h2/emc/da/noscrub/russ.treadon/git/global-workflow/stable-nightly/sorc/gdas.cd/build'
make: *** [Makefile:166: all] Error 2
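
The one line that matters in the log above is the `ld` "undefined reference" message; the rest is make unwinding. As a minimal sketch, assuming the build output has been saved to a text file, the snippet below pulls out which object and which symbol the linker complained about. The function name `find_undefined_references` and the regular expression are illustrative only and are not part of GDASApp or g-w.

```python
import re

# GNU ld reports unresolved symbols in lines shaped like:
#   .../bin/ld: ../../lib/libsaber.so: undefined reference to `ns::Class::Class()'
UNDEF_RE = re.compile(
    r"ld: (?P<obj>\S+): undefined reference to [`'](?P<symbol>.+?)'\s*$"
)


def find_undefined_references(log_text):
    """Return (object, symbol) pairs for each undefined-reference line."""
    hits = []
    for line in log_text.splitlines():
        match = UNDEF_RE.search(line)
        if match:
            hits.append((match.group("obj"), match.group("symbol")))
    return hits
```

Run against the log above, this kind of scan would be expected to single out `../../lib/libsaber.so` and the `TransPartitioner` constructor, which is what points the investigation at atlas.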

Notice that Hera, Hercules, and Orion use more recent versions of atlas than WCOSS2:

./hera.intel.lua:load("atlas/0.35.1")
./orion.intel.lua:load("atlas/0.35.1")
./wcoss2.intel.lua:load("atlas/0.35.0")
./hercules.gnu.lua:load("atlas/0.36.0")
./hercules.intel.lua:load("atlas/0.36.0")

Might a more recent version of atlas resolve the "undefined reference to `atlas::grid::detail::partitioner::TransPartitioner::TransPartitioner()'" message in libsaber.so?

Manually load GDASApp wcoss2.intel modulefile on Cactus. module spider atlas returns

---------------------------------------------------------------------------------------------------------------------------------
  atlas:
---------------------------------------------------------------------------------------------------------------------------------
     Versions:
        atlas/0.33.0
        atlas/0.35.0 

Unfortunately, atlas/0.35.0 is the most recent version of atlas available in the current installation of spack-stack/1.6.0 on WCOSS2.

Note that saber PR #1001 contains references to TransPartitioner.h in WriteFields.cc.

As a test, roll the working copy of sorc/saber back to hash d51284c6, the commit prior to saber PR #1001. feature/stable-nightly successfully builds on Cactus using this saber hash.

We need to reach out to the library team to request installation of a newer version of atlas on WCOSS2. This PR cannot move forward until we can build and run feature/stable-nightly on WCOSS2.

Attention: @DavidNew-NOAA , @danholdaway , @CoryMartin-NOAA , @guillaumevernieres

@RussTreadon-NOAA
Contributor Author

WCOSS2 g-w CI
As a test, revert sorc/saber back to d51284c6, build GDASApp, and run g-w CI on Cactus. All configurations successfully run to completion.

/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C48_ATM_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103231200        Done    Feb 03 2025 10:05:18    Feb 03 2025 11:15:25
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C48mx500_3DVarAOWCDA_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103241800        Done    Feb 03 2025 10:05:22    Feb 03 2025 10:20:23
202103250000      Active    Feb 03 2025 10:05:22             -          
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C48mx500_hybAOWCDA_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103241800        Done    Feb 03 2025 10:05:24    Feb 03 2025 10:20:30
202103250000        Done    Feb 03 2025 10:05:24    Feb 03 2025 11:25:21
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C48_S2SWA_gefs_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103231200        Done    Feb 03 2025 10:05:39    Feb 03 2025 12:01:05
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C48_S2SW_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202103231200        Done    Feb 03 2025 10:05:26    Feb 03 2025 11:30:46
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96_atm3DVar_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201800        Done    Feb 03 2025 10:05:29    Feb 03 2025 10:20:40
202112210000        Done    Feb 03 2025 10:05:29    Feb 03 2025 12:30:37
202112210600        Done    Feb 03 2025 10:05:29    Feb 03 2025 12:15:35
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96C48_hybatmaerosnowDA_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201200        Done    Feb 03 2025 10:05:31    Feb 03 2025 10:25:41
202112201800        Done    Feb 03 2025 10:05:31    Feb 03 2025 12:30:41
202112210000        Done    Feb 03 2025 10:05:31    Feb 03 2025 12:20:45
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96C48_hybatmDA_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202112201800        Done    Feb 03 2025 10:05:33    Feb 03 2025 10:20:49
202112210000        Done    Feb 03 2025 10:05:33    Feb 03 2025 12:05:39
202112210600        Done    Feb 03 2025 10:05:33    Feb 03 2025 12:10:44
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96C48_ufs_hybatmDA_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202402231800        Done    Feb 03 2025 10:05:36    Feb 03 2025 10:20:53
202402240000        Done    Feb 03 2025 10:05:36    Feb 03 2025 12:53:23
202402240600        Done    Feb 03 2025 10:05:36    Feb 03 2025 12:52:37
 
/lfs/h2/emc/ptmp/russ.treadon/EXPDIR/C96_S2SWA_gefs_replay_ics_stable-nightly
   CYCLE         STATE           ACTIVATED              DEACTIVATED     
202011010000        Done    Feb 03 2025 10:05:43    Feb 03 2025 10:56:14
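
With this many experiments, checking the STATE column by eye is easy to get wrong. As a minimal sketch, assuming the summary above has been saved as plain text in the same layout (an EXPDIR path followed by CYCLE/STATE rows), the snippet below lists every cycle whose state is not `Done`. The function name `unfinished_cycles` is hypothetical and is not a g-w or Rocoto utility.

```python
import re

# A data row starts with a 12-digit cycle (YYYYMMDDHHMM) followed by its state.
ROW_RE = re.compile(r"^\s*(?P<cycle>\d{12})\s+(?P<state>\w+)\b")


def unfinished_cycles(summary_text):
    """Map each experiment path to the (cycle, state) rows that are not 'Done'."""
    pending = {}
    experiment = None
    for line in summary_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("/"):
            # Assumption: each table is introduced by its EXPDIR path.
            experiment = stripped
            continue
        row = ROW_RE.match(line)
        if row and experiment is not None and row.group("state") != "Done":
            pending.setdefault(experiment, []).append(
                (row.group("cycle"), row.group("state"))
            )
    return pending
```

An empty dictionary means every listed cycle is `Done`. Any entry returned, such as a cycle still shown as `Active` when the snapshot was taken, is worth rechecking before declaring the run complete.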

@RussTreadon-NOAA RussTreadon-NOAA added the DO NOT MERGE PR is not ready to be merged yet label Feb 3, 2025
@RussTreadon-NOAA
Contributor Author

While the g-w based GDASApp ctests pass on Hera, Hercules, and Orion, we cannot build feature/stable-nightly at e6eafb0 on WCOSS2. Starting with saber @ b85ece5 we need at least atlas/0.35.1. Currently the WCOSS2 spack-stack installation only has atlas versions up to 0.35.0.

Mark this PR DO NOT MERGE until the WCOSS2 spack-stack is updated.
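
The atlas requirement could be checked before attempting a build. As a minimal sketch, assuming a grep-style listing of the `load("atlas/...")` lines from the GDASApp modulefiles (the format shown earlier in this thread) and taking 0.35.1 as the minimum stated above, the snippet below reports which modulefiles fall short. The function names `parse_version` and `modulefiles_below` are hypothetical.

```python
import re

# Matches grep-style modulefile lines such as:
#   ./hera.intel.lua:load("atlas/0.35.1")
LOAD_RE = re.compile(
    r'^(?:\./)?(?P<modulefile>[^:]+):load\("atlas/(?P<version>\d+(?:\.\d+)*)"\)'
)


def parse_version(text):
    """Turn '0.35.1' into (0, 35, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def modulefiles_below(grep_output, minimum):
    """Return {modulefile: version} for atlas entries older than `minimum`."""
    floor = parse_version(minimum)
    too_old = {}
    for line in grep_output.splitlines():
        match = LOAD_RE.match(line.strip())
        if match and parse_version(match.group("version")) < floor:
            too_old[match.group("modulefile")] = match.group("version")
    return too_old
```

Comparing integer tuples rather than raw strings avoids mis-ordering versions such as 0.9.0 and 0.35.0. With the five modulefile lines quoted earlier and a minimum of 0.35.1, only `wcoss2.intel.lua` (atlas/0.35.0) would be flagged.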

Labels
DO NOT MERGE PR is not ready to be merged yet hera-GW-RT-Passed Automated testing with global-workflow successful on Hera hercules-GW-RT-Passed Automated testing with global-workflow successful on Hercules orion-GW-RT-Passed Automated testing with global-workflow successful on Orion
Development

Successfully merging this pull request may close these issues.

Update JEDI hashes (20250203)
2 participants