Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate list entries after default merge-last #54

Closed
gbunt opened this issue Jun 19, 2019 · 7 comments
Closed

Duplicate list entries after default merge-last #54

gbunt opened this issue Jun 19, 2019 · 7 comments

Comments

@gbunt
Copy link

gbunt commented Jun 19, 2019

output of salt --versions:

Salt Version:
           Salt: 2018.3.3
 
Dependency Versions:
           cffi: 1.11.5
       cherrypy: Not Installed
       dateutil: 2.7.5
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: 2.19
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Oct 30 2018, 23:45:53)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 14.3.1
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 3.2.5
 
System Versions:
           dist: centos 7.6.1810 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-957.5.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.6.1810 Core

Using the default merge-last, on some nested lists we get duplicate entries in final merged Pillar like:

    root_object:
        ----------
        firewall:
            ----------
            services:
                ----------
                some_service:
                    ----------
                    main:
                        ----------
                        http:
                            ----------
                            allow:
                                - ANY
                                - ANY

Where "stack" is yaml:

root_object:
  firewall:
    services:
      some_service:
        main:
          http:
            allow:
            - ANY

and JSON merged on top:

{
  "root_object": {
    "firewall": {
      "services": {
        "some_service": {
          "main": {
            "http": {
              "allow": [
                "ANY"
              ]
            }
          }
        }
      }
    }
  }
}

This only seems to happen on deeply nested lists, lists entries higher up the chain do not seem to result in duplicates.

@bbinet
Copy link
Owner

bbinet commented Jun 19, 2019

This is expected behavior: items are appended to lists when merged.

@gbunt
Copy link
Author

gbunt commented Jun 19, 2019

mmm ok, thanks. Now that you mention the other lists i mentioned are separate Pillars, not managed by PillarStack. So we need to pick one with __: <merging strategy>, in our case we need the JSON document to overwrite whatever is in our default Pillars (the YAML) so i'd use { "__": "overwrite" }, inside every such list in the JSON version. The problem is that we have less control over that JSON as it gets generated by an application. Would there be any other way of telling PillarStack it should use overwrite for a specific path?

@bbinet
Copy link
Owner

bbinet commented Jun 20, 2019

No, there is no other way for the moment.
@mmulqueen recently has recently submitted PR #53 to be able to change the default merge strategy, but it needs to be slightly modified before it can be merged.

@gbunt
Copy link
Author

gbunt commented Jun 20, 2019

I saw that one, but globally we cannot use overwrite and do need merge-last. Looking at the code it would be a minor change to support specifying strategy on either side. Would you accept a PR for that? It may break environments where strategy is wrongly specified on the stack side and is currently just ignored. Other than that i don't see any downsides.

@bbinet
Copy link
Owner

bbinet commented Jun 21, 2019

I'm not sure I understand what you mean when you say: " support specifying strategy on either side".
But a PR can help me understand ;)

@gbunt
Copy link
Author

gbunt commented Jun 21, 2019

Basically instead of the following:

+----------------+-------------------------+-------------------------+
| ``stack``      | ``yaml_data``           | ``stack`` (after merge) |
+================+=========================+=========================+
| .. code:: yaml | .. code:: yaml          | .. code:: yaml          |
|                |                         |                         |
|     users:     |     users:              |     users:              |
|       - tom    |       - __: overwrite   |       - mat             |
|       - root   |       - mat             |                         |
+----------------+-------------------------+-------------------------+

Also allow:

+--------------------------+-------------------------+-------------------------+
| ``stack``                | ``yaml_data``           | ``stack`` (after merge) |
+==========================+=========================+=========================+
| .. code:: yaml           | .. code:: yaml          | .. code:: yaml          |
|                          |                         |                         |
|     users:               |     users:              |     users:              |
|       - __: overwrite    |       - tom             |       - tom             |
|       - mat              |                         |                         |
+--------------------------+-------------------------+-------------------------+

However, it would also need to support specifying strategy in either stack or yaml_data. In that case whatever gets merged on top, in this example yaml_data, would need to take precedence over the strategy defined in stack, in case that would be a different strategy.

As i don't think there will be that many use-cases for this feature, i'll use a modified version for now, which currently only takes a strategy defined in the stack (but is applied to the data merged in)

@gbunt
Copy link
Author

gbunt commented Jun 21, 2019

thanks so far.

@gbunt gbunt closed this as completed Jun 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants