Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge of megan namelist item from CAM and CTSM is failing even though there are only whitespace differences #509

Open
ekluzek opened this issue Oct 1, 2024 · 7 comments
Labels
bug Something isn't working CESM only

Comments

@ekluzek
Copy link
Collaborator

ekluzek commented Oct 1, 2024

I have a case that's failing in the merge of the CAM and CTSM namelist for drv_flds_in in preview_namaelists even though the difference is only in whitespace. It's comparing a character array of strings. It's not recognizing that whitespace differences in the list of strings is valid in FORTRAN. This is in cmeps1_0_20 the latest version right now...

Here is the error I'm getting:

Create namelist for component sesp
   Calling /glade/derecho/scratch/erik/cam6_4_032/cime/CIME/non_py/src/components/stub_comps_nuopc/sesp/cime_config/buildnml
  2024-10-01 16:01:49 cpl 
Create namelist for component drv
   Calling /glade/derecho/scratch/erik/cam6_4_032/components/cmeps/cime_config/buildnml
Writing nuopc_runconfig for components ['CPL', 'ATM', 'LND', 'ICE', 'OCN', 'ROF']
Key: megan_specifier, 
 Value 1: 'CH2O = formaldehyde', 'CO = carbon_monoxide', 
 Value 2: 'CH2O = formaldehyde','CO = carbon_monoxide'
ERROR: incompatible settings in drv_flds_in from 
 /glade/derecho/scratch/erik/cam6_4_032/cime/scripts/SMS_D_Ln9.f19_f19_mg17.FWma2000climo.derecho_intel.cam-outfrq9s_waccm_ma_mam4.20241001_150352_n6ytq4/Buildconf/camconf/drv_flds_in 
 and 
 /glade/derecho/scratch/erik/cam6_4_032/cime/scripts/SMS_D_Ln9.f19_f19_mg17.FWma2000climo.derecho_intel.cam-outfrq9s_waccm_ma_mam4.20241001_150352_n6ytq4/Buildconf/clmconf/drv_flds_in

How to replicate this case:

  • Check out cesm3_0_alpha03d
  • cd ./cime/scripts
  • ./create_test SMS_D_Ln9.f19_f19_mg17.FWma2000climo.derecho_intel.cam-outfrq9s_waccm_ma_mam4 -r . --walltime 00:20:00 --no-build
  • cd
  • ./xmlchange LND_SETS_DUST_EMIS_DRV_FLDS=FALSE
  • ./preview_namelists
@ekluzek ekluzek added bug Something isn't working CESM only labels Oct 1, 2024
@ekluzek
Copy link
Collaborator Author

ekluzek commented Oct 1, 2024

@jedwards4b @billsacks this is a potential fix that does get it to work. But, I'm not sure it's the right thing to do. It simply removes the space character from the comparison.

diff --git a/cime_config/buildnml b/cime_config/buildnml
index 44116d98..91cb4f41 100755
--- a/cime_config/buildnml
+++ b/cime_config/buildnml
@@ -605,7 +605,7 @@ def compare_drv_flds_in(first, second, infile1, infile2):
     ###############################################################################
     sharedKeys = set(first.keys()).intersection(second.keys())
     for key in sharedKeys:
-        if first[key] != second[key]:
+        if first[key].replace(" ", "") != second[key].replace(" ", ""):
             print(
                 "Key: {}, \n Value 1: {}, \n Value 2: {}".format(
                     key, first[key], second[key]

The big caveat to this is that it also modifies the internal content of the strings.

@billsacks
Copy link
Member

I'm not very familiar with the details of this, but the diff you give looks reasonable to me.

@jedwards4b
Copy link
Collaborator

@ekluzek I'm not sure I like this - whitespace should already be removed in buildnml lines 534 and 535:

     name = name.strip()                                                                                                                                
     var = var.strip()      

Why is it needed again?

@jedwards4b
Copy link
Collaborator

I see it's needed because in this case the extra space is in the middle of the string. I think that your fix is fine.

@jedwards4b jedwards4b assigned jedwards4b and unassigned jedwards4b Oct 2, 2024
@jedwards4b
Copy link
Collaborator

Please submit a PR with the change.

@billsacks
Copy link
Member

billsacks commented Oct 22, 2024

As discussed with @ekluzek - One step more robust might be to limit the replacement to spaces after a comma; something like this:

first_key = re.sub(r', +', ',', first[key])
second_key = re.sub(r', +', ',', second[key])
if first_key != second_key:

But as @ekluzek and I discussed, it's going to be very hard to get things totally robust in this check, and it probably isn't worth the time to do that (at least right now), and so we should get something decent and add a comment about its deficiencies.

@ekluzek
Copy link
Collaborator Author

ekluzek commented Oct 22, 2024

To comment on why my proposed fix is an incomplete hack is that it will fail for different scenarios.

  1. First, when the array of strings is broken up on separate lines in a different way from one to the other
  2. Second, if the whitespace content within the array of strings is significant.

In "1" it could fail to recognize that the merge is identical, and will abort needlessly. In "2" it could allow an important difference between the two. However, right now the only character arrays for drv_flds_in are ones where that isn't a problem. It doesn't show up for non-arrays, nor for non-string arrays.

Riffing off of Jim's example the strip statement on 535 could be changed to the substitute I have above. Although I think it might need something outside of the if statement as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working CESM only
Projects
None yet
Development

No branches or pull requests

3 participants