Fix indexing issues with correlated error in V16.x #233

KristenBathmann · 2021-10-15T15:58:17Z

A recent update to release/gfsda.v16.x allowed for flexibility in channel selection in the satinfo while preserving correlated error. Testing with GDAS was successful, but testing with GFS failed. This is because the gfsanal and gdasanal use different channel indices. The gfsanal completely ignores monitored channels and generates a satinfo file that does not contain them.

Each covariance file is generated as
open(26,file=trim(cov_file),form='unformatted')
write(26) nch_active, nctot, reclen
write(26) indR
write(26) Rcov
close(26)
nch_active is the number of active channels, for example 174 for IASI
nctot is the total number of channels, for example 616 for IASI
indR are satinfo relative indices, for example 1, 5, 9, 11... for IASI. These would correspond to channels 16, 38, 49 and 51.
Rcov is the covariance matrix.

The correlated obsmod in the master has logic in various routines to handle the inconsistency in channel indexing between the gfsanal and the gdasanal. The update to the set_routine in correlated_obsmod.F90 build a covariance matrix from the file based on the channels in the satinfo. Because of the way the channel indices are specified in the covariance file (1,5,9,11 rather than 16,38,49,51) there is no way for set_ to know which channel has been removed or added in the satinfo, and won't know which entries in Rcov to use. Fixing this issue will therefore require changing the indices in the covariance file (to 16,38,49,51,...), changing correlated_obsmod.F90 and updating the covariance utility util/Correlated_Obs to write the files correctly.

To make matters more complicated, GMAO does not have this indexing issue in their code, and the logical GMAO_ObsErrorCov will need to be used in set_.

KristenBathmann · 2021-10-15T17:16:21Z

This work is being done in my fork KristenBathmann/GSI, in release/gfsda.v16.x
The util/Correlated_Obs directory in here is very out of date with the master. Care will need to be taken when merging this code into the master.

RussTreadon-NOAA · 2021-10-15T17:33:22Z

Given this caution should this work be done in a branch off the current master? gfsda.v16.x will be sync'd with the master. This sync would bring in the necessary changes. This bypasses potential problems when we merge into the master.

…

On Fri, Oct 15, 2021 at 1:16 PM Kristen Bathmann ***@***.***> wrote: This work is being done in my fork KristenBathmann/GSI, in release/gfsda.v16.x The util/Correlated_Obs directory in here is very out of date with the master. Care will need to be taken when merging this code into the master. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#233 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AGNN635XBSMCYO5RYONXGNTUHBOXBANCNFSM5GCIY3LQ> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.

ADCollard · 2021-10-15T18:31:41Z

So just so I am clear what you are suggesting, @RussTreadon-NOAA .

Should we

Merge the master into v16.x
Merge 16.x back into the master
Introduce the latest changes that Kristen has introduced today into the master at this point?

I will work on Step number 3 once the other two are complete.

KristenBathmann · 2021-10-15T18:43:08Z

I have put the potential fix into my branch. It works for gdasanal, but I don't have a script to test gfsanal. This will also need to be tested after adding and removing channels from the satinfo, for both gdasanal and gfsanal. You can test by grepping "cond=" in the stdout. It should get smaller when removing channels, and larger when adding, and obviously there should be no warnings about correlated error getting turned off.
This is what you should get when the channel selection is unchanged:
Rcov(Evals) for Instrument: iasi_metop-b:sea cond= 6.8648778493E+01
Rcov(Evals) for Instrument: iasi_metop-b:land cond= 1.5388534575E+02
Rcov(Evals) for Instrument: cris-fsr_n20:sea cond= 3.6394468822E+01

For the updates to the util, I changed how schind is defined in readsatobs.f90, and created an array in here called indRf, which is written to the covariance file in cov_calc.f90. It shouldn't be too difficult to merge this into the master.

RussTreadon-NOAA · 2021-10-15T18:44:31Z

My comments are limited only to the changes Kristen needs to make. I read her caution "Care will need to be taken when merging this code into the master" and wondered if Kristen directly commit her changes to the master. She knows util/Correlated_Obs so she should be able to safely navigate the caution she raised. This way we don't need to cross the master bridge later down the road. The approach you outline is equally valid. I am not familiar with the correlated error code. Hence my desire to have Kristen do the heavy lifting and get her changes into the master.

…

On Fri, Oct 15, 2021 at 2:32 PM Andrew Collard ***@***.***> wrote: So just so I am clear what you are suggesting, Russ. Should we 1. Merge the master into v16.x 2. Merge 16.x back into the master 3. Introduce the latest changes that Kristen has introduced today into the master at this point? I will work on Step #3 <#3> once the other two are complete. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#233 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AGNN632GGOUJYDXNIIIAJQ3UHBXRRANCNFSM5GCIY3LQ> . Triage notifications on the go with GitHub Mobile for iOS <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675> or Android <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.

RussTreadon-NOAA · 2021-10-15T19:18:45Z

My comments are obsolete. Kristen's changes are already in her forked copy of release/gfsda.v16.x as of 33e46e6.

This branch has been cloned, built, and run on Venus. The global_gsi.x runs past the point of previous failure for the 2021101500 gfs case. stdout does not contain the Rcov(Evals) values Kristen mentions in her comment. Instead, stdout contains

Rcov(Evals) for Instrument: iasi_metop-b:sea  cond=      9.1821883788E+01
Rcov(Evals) for Instrument: iasi_metop-b:land  cond=      5.3391529276E+01
Rcov(Evals) for Instrument: cris-fsr_n20:sea  cond=      1.5226357845E+02

This run is using Rcov files taken from the master. Is this the problem?

KristenBathmann · 2021-10-15T19:23:41Z

Please point to these covariance files:
/gpfs/dell2/emc/modeling/noscrub/Kristen.Bathmann/GSI/fix/Rcov*

These are updated with different indices.
I can create a branch from the current master and just merge in the changes to the util/Correlated_Obs directory if that makes things easier. When you merge release/gfsda.v16.x into the master, it would then look identical. But I don't want to actually do a PR for the master right now.

RussTreadon-NOAA · 2021-10-15T19:31:34Z

Test rerun with Rcov files from /gpfs/dell2/emc/modeling/noscrub/Kristen.Bathmann/GSI/fix/. Now see the following in stdout

Rcov(Evals) for Instrument: iasi_metop-b:sea  cond=      6.8648778493E+01
Rcov(Evals) for Instrument: iasi_metop-b:land  cond=      1.5388534575E+02
Rcov(Evals) for Instrument: cris-fsr_n20:sea  cond=      3.6394468822E+01

The updated Rcov files need to be committed.

KristenBathmann · 2021-10-15T19:58:44Z

I created a branch to reflect what util/Correlated_Obs should look like when release/gfsda.v16.x is merged into the master. It is called v16.x_corrutil

I updated this fix directory with covariance files for IASI-A (I only updated the IASI-B and CrIS previously). These Rcov files do need to committed.

RussTreadon-NOAA · 2021-10-15T20:41:56Z

Reran the 2021101500 gdas case using Kristen's forked copy of release/gfsda.v16.x as of 33e46e6. Use the updated Rcov files Kristen provided. Also run 2021101500 gdas using authoritative release/gfsda.v16.x at 23abb36 with updated Rcov files from Kristen. All other fix files came from Kristen's release/gfsda.v16.x.

I thought indexing was only a problem in gfs runs. I expected initial J terms to be identical between the two global_gsi.x executables for the 2021101500 gdas case. This is not true. The initial J for radiances differ.

The authoritative gfsda.v16.x has 6.2847632847397670E+05.
Kristen's fork has 8.3162092822719424E+05

KristenBathmann · 2021-10-15T20:46:37Z

Using my updated fork copy requires the updated Rcov files, and this should work for both gdas and gfs. Using the current authoritative release/gfsda.v16.x requires the old Rcov files, but will only work for gdas.

RussTreadon-NOAA · 2021-10-15T20:49:37Z

So we should expect differences in both gfs and gdas runs when using your updated fork with update Rcov files. This is what I see.

KristenBathmann · 2021-10-15T21:21:14Z

Yes, you should expect the differences that you observed. If you rerun the gdasanal with the authoritative release/gfsda.v16.x and the old Rcov files, the result should be identical to using my fork with the new covariance files.

RussTreadon-NOAA · 2021-10-16T14:55:13Z

I ran a clean test for 2021101600 gfs and gdas. Two GSI versions were checked out and installed on Mars:

For all runs fixgsi was set to the fix checked out with NOAA-EMC gfsda.v16.x. The only exception is that runs using Kristen's gfsda.v16.x used the Rcov files Kristen pointed us at (/gpfs/dell2/emc/modeling/noscrub/Kristen.Bathmann/GSI/fix/Rcov*).

For 2021101600 gfs, the NOAA-EMC global_gsi.x seg faulted. Kristen's global_gsi.x ran to completion. This is expected.

For 2021101600 gdas, the NOAA-EMC and Kristen's global_gsi.x both ran to completion. The numbers printed for the first outer loop Initial cost function and Initial gradient norm are identical for all 19 printed digits.

NOAA-EMC

Initial cost function =  2.125944993387707043E+06
Initial gradient norm =  5.296622843764786921E+03

Kristen's fork

Initial cost function =  2.125944993387707043E+06
Initial gradient norm =  5.296622843764786921E+03

However, the initial step sizes differs in the last few digits

NOAA-EMC:           2.571934798076587203E+00
Kristen's fork:     2.571934798076594753E+00

Could this reflect different ordering of information in the vectors processed by pcgsoi and routines called from pcgsoi?

At the start of the second outer loop the cost and gradient norm differ, though the leading digits are identical

NOAA-EMC

Initial cost function =  1.529773474892185070E+06
Initial gradient norm =  1.343542338745532106E+03

Kristen's fork

Initial cost function =  1.529773474891685881E+06
Initial gradient norm =  1.343562729594621942E+03

The final cost and grad numbers are comparable
NOAA-EMC
cost,grad,step,b,step? = 2 150 1.468026390140016563E+06 7.947329347877215788E+00 1.155097915181486412E+00 9.449180669965929757E-01 good

Kristen's fork
cost,grad,step,b,step? = 2 150 1.468026362812372390E+06 7.949759601992846925E+00 1.155496369801310763E+00 9.476025479407522711E-01 good

compare_ncfile.py of the two siginc.nc files shows differences to be relatively small. compare_ncfile.py returns similar differences for sigi03.nc and sigi09.nc. siginc.nc differences are shown below

/gpfs/dell2/ptmp/emc.glopara/tmp766> compare_ncfile.py gfsda.v16.x_gdas.2021101600/siginc.nc kristen_gdas.2021101600/siginc.nc
lon min/max 1=0.0,359.766 min/max 2=0.0,359.766 max abs diff=0.0000000000
lat min/max 1=-89.8207,89.8207 min/max 2=-89.8207,89.8207 max abs diff=0.0000000000
lev min/max 1=1.0,127.0 min/max 2=1.0,127.0 max abs diff=0.0000000000
pfull min/max 1=1.0,127.0 min/max 2=1.0,127.0 max abs diff=0.0000000000
ilev min/max 1=1.0,128.0 min/max 2=1.0,128.0 max abs diff=0.0000000000
hyai min/max 1=1.0,128.0 min/max 2=1.0,128.0 max abs diff=0.0000000000
hybi min/max 1=1.0,128.0 min/max 2=1.0,128.0 max abs diff=0.0000000000
u_inc min/max 1=-39.3891,22.8589 min/max 2=-39.3889,22.8589 max abs diff=0.1614894867
v_inc min/max 1=-33.1067,20.7986 min/max 2=-33.1075,20.7986 max abs diff=0.1975569576
delp_inc min/max 1=-13.2386,15.8491 min/max 2=-13.2379,15.8491 max abs diff=0.0029287338
delz_inc min/max 1=-158.188,147.854 min/max 2=-158.216,147.854 max abs diff=0.6293458939
T_inc min/max 1=-14.6978,13.852 min/max 2=-14.6975,13.852 max abs diff=0.0871682167
sphum_inc min/max 1=-0.0090915,0.00732399 min/max 2=-0.00909151,0.00732399 max abs diff=0.0001566859
liq_wat_inc min/max 1=0.0,0.0 min/max 2=0.0,0.0 max abs diff=0.0000000000
o3mr_inc min/max 1=-2.29021e-06,2.19063e-06 min/max 2=-2.2902e-06,2.19063e-06 max abs diff=0.0000000016
icmr_inc min/max 1=0.0,0.0 min/max 2=0.0,0.0 max abs diff=0.0000000000

Taken as a whole is the behavior we observe expected and acceptable?

ADCollard · 2021-10-18T13:44:58Z

@RussTreadon-NOAA, it is not immediately apparent to me why we would see a difference in the step size from this change (possibly related to different treatment of passive obs?).

I'd be OK with proceeding, but I'm going to take a couple of hours digging into the code if that is OK?

RussTreadon-NOAA · 2021-10-18T14:08:18Z

Thank you, @ADCollard for looking at the code. I agree. I did not expect a difference for the gdas case. My guess is that the treatment of passive obs alters the order of information in the vectors which pcgsoi and the routines it calls process. This change in order leads to small roundoff differences in minimization calculations. Everything may be OK but it would be nice to have confirmation that everything is OK.

ADCollard · 2021-10-18T15:11:45Z

@RussTreadon-NOAA There's one difference not from Kristen. John added an extra line to genqsat.f90 that is not in Kristen's branch. And I'm guessing that might be more likely to produce the change we've seen.

Should I just go ahead and add it and then you can rerun?

RussTreadon-NOAA · 2021-10-18T16:05:14Z

Excellent find, @ADCollard. You solved the mystery! Details matter. I missed this critical detail.

I added John's genqsat.f90 change to my working copy of Kristen's branch. global_gsi.x was rebuilt and the 2021101600 gdas case is being rerun. The job has not finished but it's into the minimization on the second outer loop. The minimization stats from the rerun are identical to the authoritative gfsda.v16.x run for all printed digits.

We are merging Kristen's branch into the authoritative branch. John's change to genqsat.f90 is already in the authoritative branch. I think it is sufficient that you explained the reason for the observed differences. We don't need to commit the John's genqsat.f90 to Kristen's branch.

We need to get Kristen's updated Rcov files into fix (DA_GFSv16.x_global_only). The local copy of Kristen's updated Rcov files are on Venus in /gpfs/dell2/emc/modeling/noscrub/Kristen.Bathmann/GSI/fix/. After we update DA_GFSv16.x_global_only with the new Rcov files, we have two choices. We can associate the updated DA_GFSv16.x_global_only with Kristen's branch and merge to the authoritative repo. We could also directly associate the updated DA_GFSv16.x_global_only with the authoritative gfsda.v16.x.

RussTreadon-NOAA · 2021-10-18T16:44:21Z

Just to confirm that with the genqsat.f90 change @ADCollard noted, Kristen's release/gfsda.v16.x at 9a41e68 reproduces analysis increments from the authoritative release/gfsda.v16.x at 23abb36 for 2021101600 gdas. For example, here is a compare_ncfile.py difference for siginc.nc:

/gpfs/dell2/ptmp/emc.glopara/tmp766> compare_ncfile.py kristen_gdas.2021101600.update_genqsat/siginc.nc gfsda.v16.x_gdas.2021101600/siginc.nc
lon min/max 1=0.0,359.766 min/max 2=0.0,359.766 max abs diff=0.0000000000
lat min/max 1=-89.8207,89.8207 min/max 2=-89.8207,89.8207 max abs diff=0.0000000000
lev min/max 1=1.0,127.0 min/max 2=1.0,127.0 max abs diff=0.0000000000
pfull min/max 1=1.0,127.0 min/max 2=1.0,127.0 max abs diff=0.0000000000
ilev min/max 1=1.0,128.0 min/max 2=1.0,128.0 max abs diff=0.0000000000
hyai min/max 1=1.0,128.0 min/max 2=1.0,128.0 max abs diff=0.0000000000
hybi min/max 1=1.0,128.0 min/max 2=1.0,128.0 max abs diff=0.0000000000
u_inc min/max 1=-39.3891,22.8589 min/max 2=-39.3891,22.8589 max abs diff=0.0000000000
v_inc min/max 1=-33.1067,20.7991 min/max 2=-33.1067,20.7991 max abs diff=0.0000000000
delp_inc min/max 1=-13.2386,15.8484 min/max 2=-13.2386,15.8484 max abs diff=0.0000000000
delz_inc min/max 1=-158.188,147.85 min/max 2=-158.188,147.85 max abs diff=0.0000000000
T_inc min/max 1=-14.6978,13.852 min/max 2=-14.6978,13.852 max abs diff=0.0000000000
sphum_inc min/max 1=-0.0090915,0.00732401 min/max 2=-0.0090915,0.00732401 max abs diff=0.0000000000
liq_wat_inc min/max 1=0.0,0.0 min/max 2=0.0,0.0 max abs diff=0.0000000000
o3mr_inc min/max 1=-2.29021e-06,2.19059e-06 min/max 2=-2.29021e-06,2.19059e-06 max abs diff=0.0000000000
icmr_inc min/max 1=0.0,0.0 min/max 2=0.0,0.0 max abs diff=0.0000000000

ADCollard · 2021-10-18T19:37:09Z

So I have added the 8 Rcov files to DA_GFSv16.x_global_only (land and sea files for the three IASIs and one each for the two CrISs). Thanks to @RussTreadon-NOAA and @CatherineThomas-NOAA for their patience as I remembered how to do it again!

….x_global_only.

RussTreadon-NOAA · 2021-10-18T20:24:47Z

@ADCollard committed Rcov files with revised channel number information to DA_GFSv16.x_global_only at hash 709a1139.

Update NOAA-EMC/GSI release/gfsda.v16.x fix at 2c8b219 to point at 709a1139 DA_GFSv16.x_global_only.

GSI Issue #233: Fix indexing issues in correlated error

RussTreadon-NOAA added a commit that referenced this issue Oct 18, 2021

GSI Issue #233: update gfsda.v16.x fix to point at 709a1139 DA_GFSv16…

2c8b219

….x_global_only.

CatherineThomas-NOAA mentioned this issue Oct 18, 2021

GSI Issue #233: Fix indexing issues in correlated error #236

Merged

CatherineThomas-NOAA added a commit that referenced this issue Oct 18, 2021

Merge pull request #236 from KristenBathmann/release/gfsda.v16.x

3898ab4

GSI Issue #233: Fix indexing issues in correlated error

CatherineThomas-NOAA closed this as completed Oct 18, 2021

RussTreadon-NOAA mentioned this issue Oct 19, 2021

Merge master into release/gfsda.v16.x #237

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix indexing issues with correlated error in V16.x #233

Fix indexing issues with correlated error in V16.x #233

KristenBathmann commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021 via email

ADCollard commented Oct 15, 2021 •

edited

Loading

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021 via email

RussTreadon-NOAA commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021 •

edited

Loading

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 16, 2021

ADCollard commented Oct 18, 2021

RussTreadon-NOAA commented Oct 18, 2021

ADCollard commented Oct 18, 2021 •

edited

Loading

RussTreadon-NOAA commented Oct 18, 2021

RussTreadon-NOAA commented Oct 18, 2021

ADCollard commented Oct 18, 2021

RussTreadon-NOAA commented Oct 18, 2021

Fix indexing issues with correlated error in V16.x #233

Fix indexing issues with correlated error in V16.x #233

Comments

KristenBathmann commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021 via email

ADCollard commented Oct 15, 2021 • edited Loading

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021 via email

RussTreadon-NOAA commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 15, 2021 • edited Loading

KristenBathmann commented Oct 15, 2021

RussTreadon-NOAA commented Oct 16, 2021

ADCollard commented Oct 18, 2021

RussTreadon-NOAA commented Oct 18, 2021

ADCollard commented Oct 18, 2021 • edited Loading

RussTreadon-NOAA commented Oct 18, 2021

RussTreadon-NOAA commented Oct 18, 2021

ADCollard commented Oct 18, 2021

RussTreadon-NOAA commented Oct 18, 2021

ADCollard commented Oct 15, 2021 •

edited

Loading

RussTreadon-NOAA commented Oct 15, 2021 •

edited

Loading

ADCollard commented Oct 18, 2021 •

edited

Loading