Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vault execution module broken in pillar lookups #49671

Closed
mchugh19 opened this issue Sep 17, 2018 · 11 comments
Closed

Vault execution module broken in pillar lookups #49671

mchugh19 opened this issue Sep 17, 2018 · 11 comments
Assignees
Labels
Bug broken, incorrect, or confusing behavior fixed-pls-verify fix is linked, bug author to confirm fix P2 Priority 2 severity-critical top severity, seen by most users, serious issues severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around v2018.3.4 unsupported version
Milestone

Comments

@mchugh19
Copy link
Contributor

mchugh19 commented Sep 17, 2018

Description of Issue/Question

When using salt minions connected to a master, where the master does have access to a vault endpoint, but the minion does not, placing a vault.read_secret into a pillar should fail. As of 2018.3, this broke and those lookups now succeed.

Steps to Reproduce Issue

For the case of vault lookups in a pillar, the master should be running the vault token generation on behalf of the minion and not just connecting to vault to perform the lookup itself, this occurs in the _get_token_and_url_from_master() function. It looks like the logic in https://github.com/saltstack/salt/blob/develop/salt/utils/vault.py#L136-L138 broke this flow and the _get_token_and_url_from_master() function is never called.

The problem code is currently

    if 'vault' in __opts__ and __opts__.get('__role', 'minion') == 'master':
        # will query vault
        return _use_local_config()
    elif any((__opts__['local'], __opts__['file_client'] == 'local', __opts__['master_type'] == 'disable')):
        # will query vault
        return _use_local_config()
    else:
        log.debug('Contacting master for Vault connection details')
        # will perform lookup on behalf of the minion
        return _get_token_and_url_from_master()

but previous version of the code used to read:

if __opts__.has_key('vault') and not __opts__.get('__role', 'minion') == 'master':
    log.debug('Using Vault connection details from local config')
    try:
      return {
        'url': __opts__['vault']['url'],
        'token': __opts__['vault']['auth']['token']
      }
    except KeyError as err:
      errmsg = 'Minion has "vault" config section, but could not find key "{0}" within'.format(err.message)
      raise salt.exceptions.CommandExecutionError(errmsg)
  else:
    log.debug('Contacting master for Vault connection details')
    return _get_token_and_url_from_master()

As you can see, the logic was adjusted to remove the not in the not __opts__.get('__role', 'minion') == 'master' lookup. Fixing this just results in the elif erroring out on a missing key with __opts__['master_type']. If that is corrected to use __opts__.get('master_type'), then that conditional still succeeds because __opts__['local'] results to True.

In the case of a vault in a pillar where the master connects to vault with the vault profile of a minion, the first two conditionals in that if elif should fail. The commit messages state these lines were added to support vault lookups in salt-ssh scenarios, but since we don't have that environment to test I'm not sure of the best way to progress.

Leakages from vault are capable of being catastrophic, so this requires a complete unit test to validate all scenarios.

Setup

  • Create vault secret which only the salt master can access
  • Create pillar to lookup secret, and assign to salt minion
{%- set supersecret = salt['vault'].read_secret('secret/master_only') %}
test:
  - {{ supersecret}}
  • Perform a vault.read_secret on the minion. This should fail, but actually returns the secret data!

Versions Report

# salt --versions-report
Salt Version:
           Salt: 2018.3.0

Dependency Versions:
           cffi: 1.6.0
       cherrypy: unknown
       dateutil: 2.7.3
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.10
        libgit2: 0.26.3
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: 2.18
       pycrypto: 2.6.1
   pycryptodome: 3.4.3
         pygit2: 0.26.4
         Python: 2.7.5 (default, Jul 13 2018, 13:06:57)
   python-gnupg: Not Installed
         PyYAML: 3.13
          PyZMQ: 17.1.2
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.2.5

System Versions:
           dist: centos 7.5.1804 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-862.11.6.el7.x86_64
         system: Linux
        version: CentOS Linux 7.5.1804 Core
@garethgreenaway garethgreenaway added this to the Approved milestone Sep 17, 2018
@garethgreenaway garethgreenaway added Bug broken, incorrect, or confusing behavior severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around P2 Priority 2 severity-critical top severity, seen by most users, serious issues v2018.3.4 unsupported version labels Sep 17, 2018
@The-Loeki
Copy link
Contributor

The-Loeki commented Sep 21, 2018

also ping @garethgreenaway @gtmanfred

I've currently done the legwork on #49343 / #43288; these are about adding pillar support to the Vault config for doling out policies to the tokens (rather than the completely arbitrary grains now, see discussions there).
As noted this created the possibility of an endless loop.
The bug/undocumented feature in this issue is what prevents this from happening; I've now tested a multitude of scenarios without being able to reproduce it.
The problem is therefore that fixing both seems impossible.

@mchugh19 you're right in that the docs promise the behaviour you seek.
I however disagree that the Runner or it's peer-run should be used at all during the pillar rendering; that is a master-side operation anyway and as an operator I very strongly expect it to remain that way.
If the minions have interface to Vault, they should be using it.
In fact, I think it just about only makes sense to serve up Vault secrets through pillar, precisely because you don't want the minions to access Vault or the secrets directly or want pillar to be in control of policing access to secrets. I can't think of another reason to do so?

If the master wants to render a pillar, it at most should be using the local config & get an impersonated token.
The impersonated token would have the minion policies, which (practically) by definition are a subset of the masters'; hence everything a minion-token can read, a master-token can read (and render into a pillar).

The behaviour does therefore not introduce breakage and the only very small benefit I can see is that you're positive your pillar doesn't violate the policies on the tokens you're now no longer using.
However, it will reintroduce the infinite recursion problem mentioned in the issues I'm working on now.

All of this considered I'd like to propose to close this; I'll handle doc updates as part of my work on the other issues as I'd really much prefer it to be able to dole out policies based on pillar records and maintain pillar functionality by using the masters' token.

@mchugh19
Copy link
Contributor Author

@The-Loeki the behavior described here isn't just a theoretical implementation described in the documentation, but the way that it previously used to work. We currently keep an older version of _utils/vault.py to run salt 2018.3. We've been running the vault execution modules since before it was in upstream salt, and only noticed this breakage when we attempted to cleanup our local modules in favor of upstream.

I don't disagree that relying on grains for vault policies is a potential security issue. However, in our environment the same ops team that manages the hosts also manages vault. So while a machine could be made to spoof another, the internal folks with access and knowledge already have access to vault. I agree grains still aren't the best solution here, but it just isn't at the top of our priority list to update. If we were able to use pillars for this functionality, we'd gladly switch.

The impersonated token would have the minion policies, which (practically) by definition are a subset of the masters'; hence everything a minion-token can read, a master-token can read (and render into a pillar).

This is incorrect and is at the root of this issue. In our environment, the salt-master only has vault access to generate tokens from vault (using orphan tokens)

# vault token-create -orphan=true -display-name=salt-master -policy=salt-master-creator
# cat salt-master-creator.policy
path "auth/token/*" {
  policy = "sudo"
}

Minion policies are different from the masters, and the master itself has no other vault privileges. With the code changes in 2018.3, we would now need to grant our salt-master access to all vault secrets in order to allow a minion to access anything which would be a security leak.

The current functionality of the salt master returning _use_local_config() and not impersonating a minion to generate the resulting vault token has been changed in 2018.3 and is the problem in this issue. If it were possible to have the logic detected when a vault lookup in a pillar resources into a pillar, so that no further vault connections are attempted, that would be wonderful. If anyone has any thoughts on this issue, point me in the right direction and I'll gladly take a stab at it, but for the moment, we should unbreak the 2018.3 version.

@The-Loeki
Copy link
Contributor

The-Loeki commented Sep 21, 2018

isn't just a theoretical implementation described in the documentation, but the way that it previously used to work.

Well I only figured that one out after reading this bug and plowing through trying to get the promised infinite recursion ;) What I meant to say is I'd rather fix the docs & add pillar-based policies than revert to the old behaviour.

The impersonated token would have the minion policies, which (practically) by definition are a subset of the masters';

This is incorrect and is at the root of this issue.

I'm relatively new to Vault, but I understand that a token can only give out policies the token itself already has?
Besides, if Master has a sudo token capable of generating tokens of arbitrary policy, then uses that token to impersonate a Minion and fetches secrets from Vault (& generates a pillar w/it), is that not in effect the very same access escalation you seek to avoid?

So from what I can tell

  • Pillars must not be using (Peer)-Runners
  • Pillars should/might/must impersonate minions during secret fetch
  • Vault policies must not be given out through grains
  • Vault policies might be given out through pillar
    • -but- we can't do that if the master impersonates because it introduces the infinite recursion loop

@mchugh19
Copy link
Contributor Author

@The-Loeki

I'm relatively new to Vault, but I understand that a token can only give out policies the token itself already has?

Nope. That's the orphan functionality.

Besides, if Master has a sudo token capable of generating tokens of arbitrary policy, then uses that token to impersonate a Minion and fetches secrets from Vault (& generates a pillar w/it), is that not in effect the very same access escalation you seek to avoid?

In code yes, but not via salt. Since the salt-master has sudo to generate tokens, it could generate a token with all policies (if you knew the names), or generate a multi-use token with a 3 year ttl. However, the tokens generated by the salt vault implementation are single use and only for the requested policies.

Might it make sense to create a similar vault.read_secret_pillar_policy function which contains the desired pillar policy lookup behavior?

@The-Loeki
Copy link
Contributor

The-Loeki commented Sep 21, 2018

In code yes, but not via salt.

So what's the difference in terms of security then? Anyone with sufficient write-access to the Salt master's repos has the capability to abuse that power, with or without impersonation. So as I mentioned before, the main difference (aside from you having to make the access explicit) is the operators having to be aware which secrets they are passing on through the pillar (which is already true anyway).

That's why I vastly prefer building in pillar-based policies and updating the docs to reflect this (lack of) impersonation.

Fixxing it outright seems impossible without reintroducing aforementioned infinite recursion.
One thing I thought of is using the Vault ext_pillar; it could simply __virtual__() to False if you use {pillar} in the config and then use the impersonation. Thinking that idea further we might build an impersonate=True into the ext_pillar or something.

Other than that the only solution I currently see is a big fat WARNING in the docs ;)
Which is also fine by me btw, but I wanna know before I resubmit my PR for pillar-based policies.

@garethgreenaway
Copy link
Contributor

garethgreenaway commented Sep 27, 2018

@mchugh19 Apologies for the delay on this one, circled back to it today. Can you provide some examples of your Vault policies for both the minions and the master, as well as the relevant Salt master configuration for the Vault. Also to confirm, in your setup the Salt master does not have access to the Vault endpoint but only access to generate tokens, while the minions have access to the endpoints.
Thanks!

@mchugh19
Copy link
Contributor Author

Sure thing...

Master vault config:

vault:
  auth:
    method: token
    token: some-token
  policies:
  - saltstack/minions
  - saltstack/minion/{minion}
  - saltstack/{grains[ec2][account_id]}
  - saltstack/minion/roles/{grains[ec2_roles]}
  url: https://vault.utils.dnbaws.net

The salt master's vault token was created with

vault token-create -orphan=true -display-name=salt-master -policy=salt-master-creator

using the salt-master-creator policy of

salt-master-creator:
  vault.policy_present:
    - name: salt-master-creator
    - rules: |
        path "auth/token/create-orphan" {
          capabilities = ["update", "sudo"]
        }
        path "auth/token/create" {
          capabilities = ["update", "sudo"]
        }

This means that the salt-master can create tokens (with the sudo permission on the auth/token/create endpoints) but because the salt-master's token was created with the -orphans=true option, it is able to create tokens with policies that it is not a member of.

Policies applied to test minion:

salt-run vault.show_policies i-0baaf4036e96a8b35
- saltstack/minions
- saltstack/minion/i-0baaf4036e96a8b35
- saltstack/219829223289

As you can see, the minion is successfully getting the saltstack/{grains[ec2][account_id]} policy applied

Test Vault policy - using the saltstack/{grains[ec2][account_id]} policy

prime-devqa-firehose:
  vault.policy_present:
    - name: saltstack/219829223289
    - rules: |
        path "secret/aws-devqa/firehose_send" {
          policy = "read"
        }

This shows members of the saltstack/219829223289 policy should have read permission on the secret/aws-devqa/firehose_send endpoint.

With the initial utils/vault.py from 2017.7, the minion is able to successfully access the firehose_send endpoint

salt i-0baaf4036e96a8b35 vault.read_secret secret/aws-devqa/firehose_send
i-0baaf4036e96a8b35:
    ----------
    keyid:
        someid
    keysec:
        somevalue

While the master is not

salt-call vault.read_secret secret/prime/aws-devqa/firehose_send
[ERROR   ] Failed to read secret! HTTPError: 403 Client Error: Forbidden for url: https://vault.url/v1/secret/aws-devqa/firehose_send
Error running 'vault.read_secret': 403 Client Error: Forbidden for url: https://vault.url/v1/secret/aws-devqa/firehose_send

If that endpoint is used in a pillar, and assigned to the host, the minion can access the pillar just fine:
pillar file

{%- set supersecret = salt['vault'].read_secret('secret/aws-devqa/firehose_send') %}
test:
  - {{ supersecret}}
salt i-0baaf4036e96a8b35 pillar.item test
i-0baaf4036e96a8b35:
    ----------
    test:
        |_
          ----------
          keyid:
              somekey
          keysec:
              somevalue

Pshew! This sets up the current state. The problem comes when we remove the local copy of _utils/vault.py, and use the upstream one from 2018.3.

rm -f /var/cache/salt/master/extmods/utils/vault.py*
salt i-0baaf4036e96a8b35 saltutil.pillar_refresh
salt i-0baaf4036e96a8b35 pillar.item test
i-0baaf4036e96a8b35:
    ----------
    test:

With the following in the master log:

2018-09-27 13:05:51,696 [salt.loaded.int.module.vault:148 ][ERROR   ][5607] Failed to read secret! HTTPError: 403 Client Error: Forbidden for url: https://vault.url/v1/secret/aws-devqa/firehose_send
2018-09-27 13:05:51,697 [salt.utils.templates:181 ][ERROR   ][5607] Rendering exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/utils/templates.py", line 170, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python2.7/site-packages/salt/utils/templates.py", line 419, in render_jinja_tmpl
    tmplstr)
SaltRenderError: Problem running salt function in Jinja template: 403 Client Error: Forbidden for url: https://vault.url/v1/secret/aws-devqa/firehose_send; line 1

---
{%- set supersecret = salt['vault'].read_secret('secret/aws-devqa/firehose_send') %}    <======================
test:
  - {{ supersecret}}
---
2018-09-27 13:05:51,698 [salt.pillar      :741 ][CRITICAL][5607] Rendering SLS 'vault-test' failed, render error:
Problem running salt function in Jinja template: 403 Client Error: Forbidden for url: https://vault.url/v1/secret/aws-devqa/firehose_send; line 1

---
{%- set supersecret = salt['vault'].read_secret('secret/aws-devqa/firehose_send') %}    <======================
test:
  - {{ supersecret}}
---
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/pillar/__init__.py", line 736, in render_pstate
    **defaults)
  File "/usr/lib/python2.7/site-packages/salt/template.py", line 93, in compile_template
    ret = render(input_data, saltenv, sls, **render_kwargs)
  File "/usr/lib/python2.7/site-packages/salt/renderers/jinja.py", line 70, in render
    **kws)
  File "/usr/lib/python2.7/site-packages/salt/utils/templates.py", line 170, in render_tmpl
    output = render_str(tmplstr, context, tmplpath)
  File "/usr/lib/python2.7/site-packages/salt/utils/templates.py", line 419, in render_jinja_tmpl
    tmplstr)
SaltRenderError: Problem running salt function in Jinja template: 403 Client Error: Forbidden for url: https://vault.url/v1/secret/aws-devqa/firehose_send; line 1

---
{%- set supersecret = salt['vault'].read_secret('secret/aws-devqa/firehose_send') %}    <======================
test:
  - {{ supersecret}}
---
2018-09-27 13:05:51,822 [salt.pillar      :1019][CRITICAL][5607] Pillar render error: Rendering SLS 'vault-test' failed. Please see master log for details.

So obviously the minion is now being denied by the vault policy.

At this point, as described in the initial description of this ticket, it seems that the master, when running the vault.read_secret from the pillar, is no longer creating a vault token with the minion's policy list, but instead running the _use_local_config() function in utils/vault.py. Looking over the git history, it seems this was broken with the logic to support salt-ssh.

If I was to modify the vault policy to allow the salt master to read this data, then the pillar lookup would work for this minion.

Having the salt-master create a vault token with the minion's vault policy has been the design of the vault modules from the beginning. So it is this behavior that I think should be restored.

P.S.
As pointed out by @The-Loeki on this thread, allowing vault lookups from pillars is also the behavior which forces the use of only the potentially insecure grains to be used to determine the vault policy which is applied (like saltstack/{grains[ec2][account_id]} ). If you were to allow pillars to be used to determine vault policy (like saltstack/{pillar[role]} or something), then by putting vault.read_secret lookups into a pillar would result in the salt-master needing to to lookup the policy from the pillar which would make it render the pillars, which would find the vault.read_secret.... and thus get into an infinite loop.

@garethgreenaway
Copy link
Contributor

@mchugh19 I believe I have this figured out, can you give this patch a try:

diff --git a/salt/utils/vault.py b/salt/utils/vault.py
index 0942a1543a..b3f5c68b77 100644
--- a/salt/utils/vault.py
+++ b/salt/utils/vault.py
@@ -123,7 +123,8 @@ def get_vault_connection():
             raise salt.exceptions.CommandExecutionError(errmsg)
 
     if 'vault' in __opts__ and __opts__.get('__role', 'minion') == 'master':
-        return _use_local_config()
+        log.debug('Contacting master for Vault connection details')
+        return _get_token_and_url_from_master()
     elif any((__opts__['local'], __opts__['file_client'] == 'local', __opts__['master_type'] == 'disable')):
         return _use_local_config()
     else:

@mchugh19
Copy link
Contributor Author

Yep! That works. I just wasn't sure if that was the proper fix. If you are all good with it, than that works for me.

Thanks!

@The-Loeki
Copy link
Contributor

@garethgreenaway you do realize you're possibly regressing on #45928 doing that?
Have you considered everything @mchugh19 and I discussed here, and what is your pov?

@garethgreenaway
Copy link
Contributor

@The-Loeki Definitely. Always a concern 😄 Before sending along the patch for @mchugh19 to test, I did some manual testing with both the scenario that was described above and using the previous salt-ssh scenario, with the above change both scenarios worked as expected.

The section of code that is used when salt-ssh is used was not changed, it's continues to use _use_local_config when interacting with the Vault. The above change just has the code use _get_token_and_url_from_master when the Vault call is happening from the Salt master, that function has logic to handle the situation when the call is coming from the Salt master.

As far as the discussion above about adding pillar based policies, I like it and I think that would be a great feature to add. Definitely look forward to seeing a future PR 😄. My main focus on this particular issue was to fix the bug that we introduced in 2018.3 and not introducing any additional issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior fixed-pls-verify fix is linked, bug author to confirm fix P2 Priority 2 severity-critical top severity, seen by most users, serious issues severity-medium 3rd level, incorrect or bad functionality, confusing and lacks a work around v2018.3.4 unsupported version
Projects
None yet
Development

No branches or pull requests

4 participants