Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate the python workflows yaml module to use ruamel.yaml #2614

Open
prudhvigodithi opened this issue Sep 13, 2022 · 4 comments
Open

Migrate the python workflows yaml module to use ruamel.yaml #2614

prudhvigodithi opened this issue Sep 13, 2022 · 4 comments
Labels
enhancement New Enhancement good first issue Good for newcomers

Comments

@prudhvigodithi
Copy link
Member

prudhvigodithi commented Sep 13, 2022

Is your feature request related to a problem? Please describe

Coming from the PR, which requires the modification version increment workflow to auto add the branch entry, as part of developing this automation, the package ruamel.yaml, seems to be a better fit when compared to PyYaml. My proposal is to have a single library for yaml parsing and use ruamel.yaml as it makes it simple to work on complex yaml parsing.

  • Using ruamel.yaml it's easy to override the methods like explicit_start, preserve_quotes and right indent which honors yamllint and gh workflow .yml rules.

  • ruamel.yaml also does not change the on key to boolean (default the library considers on as boolean) which is not easy with PyYAML, need some extra logic to create constructer and override methods.
    https://stackoverflow.com/questions/36463531/pyyaml-automatically-converting-certain-keys-to-boolean-values

  • ---, this is also preserved with ruamel.yaml easily compared to PyYAML, without this the GH and yamllint flags it as error.

Describe the solution you'd like

Use ruamel.yaml
Some more useful details: https://snyk.io/advisor/python/ruamel-yaml
ruamel.yaml difference with PyYAML: https://yaml.readthedocs.io/en/latest/pyyaml.html#

Describe alternatives you've considered

No response

Additional context

cookiecutter migration to ruamel.yaml cookiecutter/cookiecutter#557

@prudhvigodithi prudhvigodithi added enhancement New Enhancement untriaged Issues that have not yet been triaged labels Sep 13, 2022
@prudhvigodithi
Copy link
Member Author

prudhvigodithi commented Sep 13, 2022

Files that need to be changed (Please add more if I have missed)

manifests/manifest.py:14:import yaml
test_workflow/integ_test/service_opensearch.py:11:import yaml
test_workflow/integ_test/service_opensearch_dashboards.py:12:import yaml
test_workflow/test_recorder/test_recorder.py:12:import yaml
test_workflow/perf_test/perf_test_runner_opensearch.py:10:import yaml

@jace
Copy link

jace commented Sep 15, 2022

The Synk profile for ruamel.yaml is showing stats from the GitHub repository of PyOxidizer. The Ruamel project is hosted on Sourceforge and does not have an up-to-date mirror on GitHub, at least for the first few results that come up. The latest release is not yet 1.0 and therefore should not be considered API stable.

While the claimed improvements from YAML 1.2 support are significant, there are a number of red flags:

  1. Weak evidence of projects migrating from PyYAML to ruamel.yaml from GitHub issues (even cookiecutter moved away to a custom YAML parser).
  2. Sub 1.0 version number, and the PyPI classifier is "4 - Beta". The documentation continues to show a 2019 date despite the latest release being in 2022.
  3. Libraries.io profiles show [PyYAML] has 14.2k dependent packages and 57.2k dependent repositories while ruamel.yaml has only 1.4k and 2.05k dependents.
  4. YAML 1.2 support requires the pure Python implementation, which is optional by default. That means code that does not mandate the pure Python implementation will parse YAML differently depending on how it is deployed.

I'm not affiliated with either parser or with opensearch. I came here trying to decide for myself which parser to use.

@prudhvigodithi
Copy link
Member Author

Hey @jace thanks for the inputs, also I found this useful https://stackoverflow.com/questions/20805418/pyyaml-dump-format/36760452#36760452, in terms of maintenance I see ruamel.yaml is being updated actively compared to PyYaml

@dblock
Copy link
Member

dblock commented Sep 15, 2022

I don't have a strong opinion about whether these red flags should prevent anyone from using ruamel.yaml. Sounds like it was added in #2602 to avoid having to explicitly lint updated .yaml files. Why wasn't it an issue with other YML files created? @prudhvigodithi do you have an example of differences when PyYaml is used vs. ruamel.yaml for the file you are modifying in #2602?

@bbarani bbarani removed the untriaged Issues that have not yet been triaged label Sep 16, 2022
@prudhvigodithi prudhvigodithi added the good first issue Good for newcomers label Sep 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New Enhancement good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

4 participants