Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency between yaml libraries #441

Closed
aidanheerdegen opened this issue May 10, 2024 · 3 comments · Fixed by #480
Closed

Inconsistency between yaml libraries #441

aidanheerdegen opened this issue May 10, 2024 · 3 comments · Fixed by #480
Assignees
Labels

Comments

@aidanheerdegen
Copy link
Collaborator

aidanheerdegen commented May 10, 2024

It seems ruamel yaml is pickier than PyYAML, because I go this error

$ payu checkout -r archive/restart100 -b 1xCO2 e12bda105f3
Created and checked out new branch: 1xCO2                                                                                                    
laboratory path:  /scratch/lg87/aph502/access-esm
binary path:  /scratch/lg87/aph502/access-esm/bin
input path:  /scratch/lg87/aph502/access-esm/input
work path:  /scratch/lg87/aph502/access-esm/work
archive path:  /scratch/lg87/aph502/access-esm/archive
Updated metadata. Experiment UUID: cc4197a5-9df7-410c-9aa9-a60c5593c5d2
Traceback (most recent call last):
  File "/g/data/vk83/apps/payu/1.1.3/bin/payu", line 10, in <module>
    sys.exit(parse())
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/payu/cli.py", line 41, in parse
    run_cmd(**args)
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/payu/subcommands/checkout_cmd.py", line 34, in runcmd
    checkout_branch(is_new_branch=new_branch,
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/payu/branch.py", line 175, in checkout_branch
    add_restart_to_config(prior_restart_path, config_path=config_path)
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/payu/branch.py", line 71, in add_restart_to_config
    config = yaml.load(config_path)
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/main.py", line 446, in load
    return self.load(fp)
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/main.py", line 451, in load
    return constructor.get_single_data()
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/constructor.py", line 116, in get_single_data
    return self.construct_document(node)
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/constructor.py", line 125, in construct_document
    for _dummy in generator:
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/constructor.py", line 1476, in construct_yaml_map
    self.construct_mapping(node, data, deep=True)
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/constructor.py", line 1366, in construct_mapping
    if self.check_mapping_key(node, key_node, maptyp, key, value):
  File "/g/data/vk83/apps/payu/1.1.3/lib/python3.9/site-packages/ruamel/yaml/constructor.py", line 278, in check_mapping_key
    raise DuplicateKeyError(*args)
ruamel.yaml.constructor.DuplicateKeyError: while constructing a mapping
  in "config.yaml", line 1, column 1
found duplicate key "qsub_flags" with value "-W umask=027" (original value: "-j oe")
  in "config.yaml", line 62, column 1

To suppress this check see:
    http://yaml.readthedocs.io/en/latest/api.html#duplicate-keys

when trying to checkout a new branch from an existing experiment that has duplicate qsub_flags options in the config.yaml:

https://github.com/aidanheerdegen/access_esm_mio_run/blob/mio_v4_master/config.yaml

The experiment ran without error, so my supposition is that PyYAML silently ignores the duplicate keys. We should suppress the exception, but warn if possible.

@aidanheerdegen aidanheerdegen changed the title Incompatibility between yaml libraries Inconsistency between yaml libraries May 10, 2024
@aidanheerdegen
Copy link
Collaborator Author

This is how to allow duplicate keys

https://yaml.readthedocs.io/en/latest/api/#duplicate-keys

yaml = ruamel.yaml.YAML()
yaml.allow_duplicate_keys = True
yaml.load(stream)

Not sure if we need to make sure the two libraries are consistent with what key values they keep?

There is an existing issue here where others are annoyed pyyaml isn't following the standard

yaml/pyyaml#165

some work-arounds to make pyyaml disallow duplicate keys

https://gist.github.com/pypt/94d747fe5180851196eb

@aidanheerdegen
Copy link
Collaborator Author

Not sure if we need to make sure the two libraries are consistent with what key values they keep?

To be specific ruamel ignores subsequent duplicate values, whereas pyyaml overwrites them.

So given dupe.yaml:

a: 1
b: 2
a: 3

Running this:

from ruamel.yaml import YAML
import yaml as pyyaml

yaml = YAML()
yaml.allow_duplicate_keys = True

with open('dupe.yaml') as stream:
   data = yaml.load(stream)
print(data)

with open('dupe.yaml') as stream:
   data = pyyaml.safe_load(stream)
print(data)

produces this:

$ python test.py 
{'a': 1, 'b': 2}
{'a': 3, 'b': 2}

@aidanheerdegen
Copy link
Collaborator Author

This is blocking answering this query on the forum because there is a duplicate key in a config, and ruamel can't deal with it and throws an exception.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants