Index in namelist #1259

jedwards4b · 2017-03-17T17:08:10Z

Adds support for namelist entries of the form foo(3) = 'a'

Test suite: scripts_regression_tests.py, --namelist-only tests for cesm prealpha and aux_clm45
Note there are some expected failures here due to bugs found in the process. One bug was that
for multi-instance cases all namelists were using the 0001 stream file and they each should have
a separate file. Another bug was that only changes in user_nl_d***_0001 were being used in all
instances (user_nl_d***_000n where n>1 was being ignored)
Test baseline: cesm2_0_alpha06g
Test namelist changes:
Test status: bit for bit

Fixes #1248
Fixes #1274

User interface changes?:

Code review: @jgfouca @mvertens

jedwards4b · 2017-03-17T21:03:36Z

scripts/lib/CIME/namelist.py

@@ -112,7 +112,7 @@

 # Fortran syntax regular expressions.
 # Variable names.
-FORTRAN_NAME_REGEX = re.compile(r"^[a-z][a-z0-9_]{0,62}$", re.IGNORECASE)
+FORTRAN_NAME_REGEX = re.compile(r"^[a-z][a-z0-9_]{0,62}(\((\d+)\))?$", re.IGNORECASE)


This allows for the possibility of an index in the variable name such as foo(3)

What about foo(-3) or foo(+3), both of which are also valid?
I have a regex for that which also includes (integer constant) array sections if you are interested.
As a side note, since you use re.IGNORECASE, can we get rid of all the scary language in namelist_definition_drv.xml about the id having to be lower case?

jedwards4b · 2017-03-17T21:04:43Z

scripts/lib/CIME/namelist.py

@@ -171,6 +171,20 @@ def is_valid_fortran_name(string):
    """
    return FORTRAN_NAME_REGEX.search(string) is not None

+def fortran_name_index(string):


If a valid index exists return that value. Note that we are not yet handling array sections such as (:2) or (3:5)

jedwards4b · 2017-03-17T21:07:06Z

scripts/lib/CIME/namelist.py

@@ -1414,6 +1453,18 @@ def _parse_variable_name(self, allow_equals=True):
        if not is_valid_fortran_name(text_check):
            raise _NamelistParseError("%r is not a valid variable name at %s" %
                                      (str(text), self._line_col_string()))
+
+        # a particular value of a namelist array may be assigned using syntax


We create a list of indices here and pop them off later where they are used.

jedwards4b · 2017-03-17T21:08:25Z

scripts/lib/CIME/namelist.py

@@ -1905,17 +1968,23 @@ def parse_namelist(self):
        >>> _NamelistParser("! Comment \n &group /! Comment ").parse_namelist()
        {u'group': {}}
        >>> _NamelistParser("&group1\n foo='bar','bazz'\n,, foo2=2*5\n / &group2 /").parse_namelist()
-        {u'group1': {u'foo': [u"'bar'", u"'bazz'", u''], u'foo2': [u'2*5']}, u'group2': {}}
+        {u'group1': {u'foo': [u"'bar'", u"'bazz'", u''], u'foo2': [u'5', u'5']}, u'group2': {}}


As a side effect these values are expanded a little earlier than they were before

jedwards4b · 2017-03-17T21:10:46Z

scripts/lib/CIME/nmlgen.py

+            self._streams_namelists[variable] = []
+        self._streams_namelists["streams"] = []
+
+    def new_instance(self):


Clean data stored from previous instances so that it isn't copied into new instance by instance here I am refering to the cesm ensemble mode functionality.

jgfouca

Looks good. Thanks for improving tests.

jgfouca · 2017-03-17T21:36:18Z

I will wait for @mvertens before merging.

mvertens · 2017-03-17T21:51:27Z

@jgfouca I think we need to do some testing on the cesm side before we bring this in.

…

On Fri, Mar 17, 2017 at 3:36 PM, James Foucar ***@***.***> wrote: I will wait for @mvertens <https://github.com/mvertens> before merging. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1259 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AHlxE8wk3APAca4hPlpszjCwfM9YeXuHks5rmvzTgaJpZM4Mg27Y> .

jedwards4b · 2017-03-17T22:03:47Z

Having the doctests there really helped in the development process.

jgfouca · 2017-03-18T21:30:32Z

@mvertens OK, I will hold off until you guys tell me it's ready.

gold2718

It seems that your current implementation, in addition to only allowing setting of single values, assumes that all arrays read in a namelist are indexed to begin at one. This is not a valid assumption in Fortran.
I recommend less processing, just spit valid array sections out in the order in which they appear. An alternative would be to optionally specify bounds in the namelist definition file.

gold2718 · 2017-03-17T22:36:31Z

scripts/lib/CIME/namelist.py

+            if vname in self._nameindex:
+                self._nameindex[vname].append(nindex)
+            else:
+                self._nameindex[vname] = [nindex]
        return text.lower()


Empty line before return (end of previous if block)?

gold2718 · 2017-03-19T22:01:27Z

scripts/lib/CIME/namelist.py

@@ -112,7 +112,7 @@

 # Fortran syntax regular expressions.
 # Variable names.
-FORTRAN_NAME_REGEX = re.compile(r"^[a-z][a-z0-9_]{0,62}$", re.IGNORECASE)
+FORTRAN_NAME_REGEX = re.compile(r"^[a-z][a-z0-9_]{0,62}(\((\d+)\))?$", re.IGNORECASE)


What about foo(-3) or foo(+3), both of which are also valid?
I have a regex for that which also includes (integer constant) array sections if you are interested.
As a side note, since you use re.IGNORECASE, can we get rid of all the scary language in namelist_definition_drv.xml about the id having to be lower case?

gold2718 · 2017-03-19T22:04:23Z

scripts/lib/CIME/namelist.py

+    m = FORTRAN_NAME_REGEX.search(string)
+    if m is not None and m.group(2) is not None:
+        return int(m.group(2))
+


Returning an int here is restrictive in that you will not be able to handle a list of ints, such as created by an array section. Of course, even a list will not always help (see below).

gold2718 · 2017-03-19T22:17:44Z

scripts/lib/CIME/namelist.py

@@ -618,6 +634,9 @@ def expand_literal_list(literals):
        if FORTRAN_REPEAT_PREFIX_REGEX.search(literal) is not None:
            num, _, value = literal.partition('*')
            expanded += int(num) * [value]
+        elif nindex is not None:
+            expanded = ['']*(nindex-1)
+            expanded.append(literal)


This logic will not help much with an input like:
foo(5:6) = 2*5

gold2718 · 2017-03-19T22:20:31Z

scripts/lib/CIME/namelist.py

+            if vname in self._nameindex:
+                self._nameindex[vname].append(nindex)
+            else:
+                self._nameindex[vname] = [nindex]


If fortran_name_index returned a list of integers instead of a single int, you could always use extend.
Before you work on that, be sure you have figured out how to handle sections correctly.
For instance, foo(2::2) is valid as foo(::) or foo(8,4,-1.
Another issue with trying to insert values in lists (as opposed to just listing each section in order) is that you do now know the bounds of the actual array. It is perfectly valid for an array in a namelist to have negative bounds (e.g., integer :: foo(-3:3)). Not knowing those bounds, how will you process:

foo(:) = 16*'a' foo(-3) = 'b'

jedwards4b · 2017-03-19T23:58:28Z

@gold2718: I believe that implementing a complete fortran namelist parser may be beyond the scope or intent of this issue. Do we want something that will be ready for cesm 2 or do we want a complete solution?

gold2718 · 2017-03-20T02:28:08Z

@jedwards4b, That questions is above my pay grade of course. I suggest this be brought up at the next CIME meeting.

Whatever solution is decided on (for any particular CIME tag, not just the CESM release) should be properly documented -- especially where it deviates from the Fortran standard.

If it is decided to keep the current version, documentation should at least have:

"The namelist code assumes that all arrays have a lower bound of one unless the user enters an index of zero in an array section in which case, the assumption is that the array has a lower bound of zero."

because that is how the current code sort of behaves. However, even that is not correct.

For fun, stick this in your user_nl_datm and try ./preview\_namelists:

dtlimit(6) = 6.0
DTLIMIT(3) = 3.0
dtlimit(0) = 0.0
dtlimit(1) = 1.0

Funny how dtlimit(1) overwrote dtlimit(0). Any why is there one 'default' entry of 1.5 but not others? I don't think this is ready as it stands. I think rejecting zero as a valid index (and documenting it) would allow the assumption of a lower bound of one to work.

jedwards4b · 2017-03-20T13:05:07Z

@gold2718: Out of curiosity I tried a few of these in a short fortran program. None of intel, pgi and gnu compilers recognize array sections in input at all. I think that it would be okay to document that we assume all arrays begin at 1 and I think that that is the case in all current namelists. I will look into the dtlimit example you raised.

jedwards4b · 2017-03-20T17:04:30Z

@gold2718 The number of default values set in an array is something that was not changed with this PR - I have determined that this is done in nmlgen.py subroutine _to_namelist_literals and the number of values set is determined by enumerate(value). I haven't figured out why the result of this appears generally to be 3. I will open this as a separate issue.

gold2718 · 2017-03-20T18:56:29Z

@jedwards4b ???
Before I began, I also wrote a short program and had no issues reading in various array sections on Yellowstone (Intel) or Hobart (Nag). Program and one trial namelist below (I tried lots of variants):

program foo
  implicit none
  integer :: ierr
  integer :: unitn = 63
  integer :: item(-8:8)
  namelist /foo_nl/ item
  item(:) = -1
  open(unitn, file="foo.nl", status='old', delim='APOSTROPHE')
  read(unitn, foo_nl, iostat = ierr)
  if (ierr /= 0) then
    write(6, *) 'ERROR reading foo_nl, iostat = ',ierr
  else
    write(6, '(8i5)') item
  end if
  close(unitn)
end program foo

foo.nl

&foo_nl
  item(::2)    = -8, -6
  item(8:4:-1) = 8,,,,4
  ITEM(1:2) = ,2
  item(-3) = -3
  item(5:6) = 2*5
/

The result

   -8   -1   -6   -1   -1   -3   -1   -1
   -1   -1    2   -1    4    5    5   -1
    8

jedwards4b · 2017-03-20T20:48:29Z

@gold2718 I stand corrected on array section syntax. I did not appreciate the importance of the right hand side of the above assignments.

gold2718 · 2017-03-20T22:56:11Z

Before this PR is closed, I would like to hear from @mvertens, @rljacob, and @jgfouca on the issue of how to handle the issue of array sections in namelist input. Three clear options I see are:

Only handle whole arrays or single elements to the left of the equals sign along with the assumption that the namelist array has a lower bound of one. This option would include documentation of the limitation in the CIME manual (sections 6.3, 6.4, 6.5 & 6.6) and possibly an additional message when an unhandled case is detected.
Scrap this PR and implement the full standard.
Merge this PR with a plan to implement the full standard prior to the release (thus avoiding the need for the additional documentation mentioned in option 1).

Thoughts? Additional options? Please note a possible issue with options 2 & 3 in my next comment.

gold2718 · 2017-03-20T23:28:46Z

Why is it so hard to implement the full Fortran standard for namelist input? I think it is a combination of the fact that we do not know the bounds of the target array along with a decision to try and combine all mentions of the array into a single namelist entry. For instance, the sequence:

foo = 1, 2, 3, 4, 5
foo(-3) = 9

requires no knowledge of the array bounds, just an assumption that foo has at least five elements and bounds that include index minus three. However, if we want to combine these into a single entry, we need to know where to put the nine. Our current namelist definition file does not contain this information so I see the need for a decision before attempting to handle the full Fortran namelist input standard for arrays:

Extend the syntax of the <type> tag for namelist entries to allow specification of array bounds (as in Fortran programs). This allows for proper interpretation of non-positive integer as well as array-section element selections. It also allows for rejection of out-of-bounds elements.
Do not try to fold multiple array element setting into a single namelist entry (i.e., keep multiple entries as in the example above).

Thoughts on these options? Am I missing a third option?

rljacob · 2017-03-21T00:20:32Z

So did this work in CIME2 because CIME2 didn't check what was in user_nl?

mvertens · 2017-03-21T02:48:19Z

@rjacob - no CIME2 did check what was in user_nl - but it used the perl namelist generator which handled single array indices.
@gold2718 - I would recommend having at least the behavior we had with the perl namelist parsing of user_nl_xxx. At this point. I would like to understand if the perl namelist parsing handled array syntax behavior. If not - I would recommend going with (1) for now and dealing handling more sophisticated namelist parsing after the cesm2.0 release in an update.

rljacob · 2017-03-21T03:12:31Z

Agreed. The original issue was about the missing CIME2 functionality so restoring that should be enough for now.

jedwards4b · 2017-03-23T14:37:06Z

I've rerun scripts_regression_tests.py after these changes - all pass.

…o needed by external components

jedwards4b · 2017-03-23T22:00:45Z

I have tested the cesm yellowstone prealphas in namelist-only mode with this tag, all tests pass except ERS_Ly20_N2_P2.f09_g16_gl10.TG1.yellowstone_pgi which failed due to an existing bug in multiinstance cases which this PR fixes. I am on vacation next week so I would really appreciate a thorough review of this PR asap if you can. Thanks

gold2718

I'm still unhappy with the line number logged in the _NamelistParseError on line 1814 (namelist.py). It does not correspond to any sort of actual line. Please either fix it or remove it.

Also, if a rank>1 array is found, use a better error message than:
"Error in parsing namelist: expected literal value, but got 'dtlimit(1'"
Looking for the comma may help distinguish which unsupported situation you have found.

If you try to set an array element past the top of the array, you get:
"ERROR: Variable 'dtlimit' has invalid value"
which may be hard to interpret (you have to count the number of elements and see that it is too many.

The user_nl_<model> is still processed out of order.
For instance, this combo:

dtlimit(3) = 22.0
DTLIMIT(3) = 3.0

gives the correct result (dtlimit(3) = 3.0), however, this one:

dtlimit(6) = 19.0
dtlimit(6:99:3) = 6.0,9.0,12.0,15.0,18.0

gives dtlimit(6) = 19.0. Not sure how that happens (but it is the ordering because if I comment out the first line, the elements are correct).
Here is my test case and the wrong (and what I think is correct) answer:

dtlimit(6) = 19.0
dtlimit(6:99:3) = 6.0,9.0,12.0,15.0,18.0
dtlimit(1:2:3) = 4.0
dtlimit(3) = 22.0
DTLIMIT(3) = 3.0
dtlimit(8:10:2) = 8.0, 10.0
dtlimit(1) = 1.0
dtlimit(::2) = 1.0, 3.0, 5.0

results in:

  dtlimit = 1.0, 1.5, 3.0, , , 19.0, , , 9.0, 10.0, , 12.0, , 8.0, 15.0, , ,

I think it should be:

  dtlimit = 1.0, 1.5, 3.0, ,5.0 , 6.0, ,8.0 , 9.0, 10.0, , 12.0, , , 15.0, , , 18.0

gold2718 · 2017-03-24T04:48:14Z

scripts/lib/CIME/namelist.py

+        minindex, maxindex, step = get_fortran_variable_indices(variable_name, var_size)
+        variable_name = get_fortran_name_only(variable_name.lower())
+
+        expect(minindex > 0, "CIME indices < 1 not supported in CIME interface to fortran namelists minindex=%s"%minindex)


I would change this to
"Array lower bounds /= 1 are not supported in CIME interface to fortran namelists ..."
I added the ... because a minimum index without the array name is not very useful (unless you get the file and line number to display correctly). If you do print it out, use 'lower bound' instead of 'minindex'.
Also, change the expect logic to minindex == 1 because that is the correct current requirement.

But the expect logic is correct here - if in user_nl you have dtlimit(6) then minindex=maxindex=6,step=1 .

I have no way of knowing or trapping for fortran lower array bounds > 1

jedwards4b · 2017-03-24T14:56:10Z

@gold2718 I believe that I've addressed all of the issues you raised - please try again.

Thanks

jedwards4b · 2017-03-24T16:11:18Z

Reran scripts_regression_tests.py and cesm yellowstone prealphas in namelist-only mode

jedwards4b · 2017-03-24T17:42:29Z

In addition to the above I also ran the aux_clm45 tests:

:) git checkout master
Switched to branch 'master'
Your branch is up-to-date with 'origin/master'.
yslogin6: ~/sandboxes/cesm2_0_alpha/cime/scripts
:) execca ./create_test --namelists-only --xml-machine yellowstone --xml-category aux_clm45  --generate jpe_aux_clm45_master  1> cts.out 2>&1
yslogin6: ~/sandboxes/cesm2_0_alpha/cime/scripts
:) git checkout index_in_namelist
Switched to branch 'index_in_namelist'
yslogin6: ~/sandboxes/cesm2_0_alpha/cime/scripts
:) execca ./create_test --namelists-only --xml-machine yellowstone --xml-category aux_clm45  --compare jpe_aux_clm45_master  1> cts.out 2>&1

All pass with the exception of a multiinstance case that was broken in master and fixed in this PR.

gold2718

Results are correct!
Just fix the comment and merge this baby (as far as I'm concerned).

gold2718 · 2017-03-24T17:16:57Z

scripts/lib/CIME/XML/namelist_definition.py


-                # Check if can actually change this variable via filename change
+                # Check if can actuax1lly change this variable via filename change


I liked the original spelling better :)

gold2718 · 2017-03-24T17:49:37Z

scripts/lib/CIME/namelist.py

+                                  (([+-]?\d+)?:([+-]?\d+)?:?([+-]?\d+)?))               # colon seperated triplet
+                                  \))?\s*$"""                                           # end optional index expression
+                                , re.IGNORECASE | re.VERBOSE)
+


Why all these blank lines?

gold2718 · 2017-03-24T17:59:35Z

scripts/lib/CIME/namelist.py

+            self._groups[group_name][variable_name].extend(['']*(minindex-tlen-1))
+
+        for i in range(minindex-1, maxindex+step, step):
+            while len(self._groups[group_name][variable_name]) < i+1:


I think this logic is incorrect. For instance:
dtlimit(20:22:2) = 20.0, 22.0, 24.0
should generate an error since the slice only indicates two elements but you accept (and set) three. This is an ERROR in Fortran and should be trapped. If you trap the error earlier, then you can maybe keep the logic.
I believe the correct number of steps is:
1 + ((maxindex - minindex) / step)

gold2718 · 2017-03-24T19:06:35Z

Just tried:
dtlimit(::2) = 1.0, 3.0, 5.0
and got
ERROR: Too many values for array dtlimit(::2)

gold2718 · 2017-03-24T20:06:09Z

scripts/lib/CIME/namelist.py

+        if minindex > tlen:
+            self._groups[group_name][variable_name].extend(['']*(minindex-tlen-1))
+
+        for i in range(minindex-1, maxindex+step, step):


This range does not seem to work when step < 0.
dtlimit(25:24:-1) = 25.0, 24.0
only sets spot 25.

gold2718

I think the horse is finally dead, we can stop beating it (i.e., could not find any issues this time)!

jedwards4b added 3 commits March 17, 2017 08:33

add _nameindex to NamelistParser

97244cb

fix doctest issues

ae96067

better assignment

7105184

jedwards4b added the in progress label Mar 17, 2017

Mariana Vertenstein and others added 4 commits March 17, 2017 13:18

fixed multiple instance problem

388f7d2

need to clean streams between instances

9be48ca

add new_instance function to all data models

91b899a

remove commented code

906c66a

jedwards4b commented Mar 17, 2017

View reviewed changes

jedwards4b assigned jgfouca Mar 17, 2017

jedwards4b requested a review from mvertens March 17, 2017 21:13

jgfouca approved these changes Mar 17, 2017

View reviewed changes

gold2718 requested changes Mar 19, 2017

View reviewed changes

add space before return

c106652

added error for values out of range

31c5f95

jedwards4b added 2 commits March 23, 2017 06:57

fix step > 1

d085370

character not char

a46759a

jedwards4b added 2 commits March 23, 2017 14:02

need a special case for cice

757bc28

move new_instance code from buildnml to init_defaults since it is als…

ed578f0

…o needed by external components

gold2718 requested changes Mar 24, 2017

View reviewed changes

jedwards4b added 3 commits March 24, 2017 07:16

remove bad line info, better error for multiple dimensions

e255277

fix ordering issues

6584df0

fix doctests

4bc8257

fix pylint issue

4ba1711

gold2718 requested changes Mar 24, 2017

View reviewed changes

fix spelling in comment

00285b7

gold2718 reviewed Mar 24, 2017

View reviewed changes

remove blank lines

454f484

gold2718 reviewed Mar 24, 2017

View reviewed changes

error for values exceed array size

6c9e249

jedwards4b added 2 commits March 24, 2017 13:34

fix issues with array limits

357d31d

remove unneeded restriction on step

b284bde

gold2718 reviewed Mar 24, 2017

View reviewed changes

one more fix for negative index

75a815a

gold2718 approved these changes Mar 24, 2017

View reviewed changes

jedwards4b merged commit 9d8396a into ESMCI:master Mar 24, 2017

jedwards4b removed the in progress label Mar 24, 2017

gold2718 mentioned this pull request Jul 18, 2017

Namelist parser needs to support null values #1133

Open

billsacks mentioned this pull request Jun 4, 2021

Missing default namelist values should raise an error but do not #3984

Closed


		# Check if can actually change this variable via filename change
		# Check if can actuax1lly change this variable via filename change

Index in namelist #1259

Index in namelist #1259

Conversation

jedwards4b commented Mar 17, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jgfouca left a comment

Choose a reason for hiding this comment

jgfouca commented Mar 17, 2017

mvertens commented Mar 17, 2017 via email

jedwards4b commented Mar 17, 2017

jgfouca commented Mar 18, 2017

gold2718 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jedwards4b commented Mar 19, 2017

gold2718 commented Mar 20, 2017

jedwards4b commented Mar 20, 2017

jedwards4b commented Mar 20, 2017

gold2718 commented Mar 20, 2017

jedwards4b commented Mar 20, 2017

gold2718 commented Mar 20, 2017

gold2718 commented Mar 20, 2017

rljacob commented Mar 21, 2017

mvertens commented Mar 21, 2017

rljacob commented Mar 21, 2017

jedwards4b commented Mar 23, 2017

jedwards4b commented Mar 23, 2017

gold2718 left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jedwards4b commented Mar 24, 2017

jedwards4b commented Mar 24, 2017

jedwards4b commented Mar 24, 2017

gold2718 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gold2718 commented Mar 24, 2017

gold2718 Mar 24, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gold2718 left a comment

Choose a reason for hiding this comment

jedwards4b commented Mar 17, 2017 •

edited

Loading

gold2718 left a comment •

edited

Loading

gold2718 Mar 24, 2017 •

edited

Loading