Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

salt.utils.vault.make_response Token Handling Fix #62552

Closed
wants to merge 2 commits into from

Conversation

damonearp
Copy link

What does this PR do?

Fixes a logic error where a Vault Single Use Token is used but its reference count uses isn't decremented.

What issues does this PR fix or reference?

Possibly Fixes:

Related:

Previous Behavior

If a path in a secret does not exist and the same state tries to write the secret, the write will fail with an HTTP 403.

secret = __salt__["vault.read_secret"]("secret/subpath", default=None)
if secret is None:
  __salt__["vault.write_secret"]("secret/subpath", mykey="myvalue")

In this example the vault.write_secret will get an HTTP 403.

New Behavior

secret = __salt__["vault.read_secret"]("secret/subpath", default=None)
if secret is None:
  __salt__["vault.write_secret"]("secret/subpath", mykey="myvalue")

The vault.write_secret will now succeed as it will request a new token before trying to write the secret.

Merge requirements satisfied?

I will need help with this part, but I also want someone more knowledgable to validate this is the correct approach. For example: Shouldn't all limited use tokens be decremented regardless of the HTTP response code?

Commits signed with GPG?

Yes/No

Please review Salt's Contributing Guide for best practices.

See GitHub's page on GPG signing for more information about signing commits with GPG.

@damonearp damonearp requested a review from a team as a code owner August 26, 2022 19:33
@damonearp damonearp requested review from garethgreenaway and removed request for a team August 26, 2022 19:33
@welcome
Copy link

welcome bot commented Aug 26, 2022

Hi there! Welcome to the Salt Community! Thank you for making your first contribution. We have a lengthy process for issues and PRs. Someone from the Core Team will follow up as soon as possible. In the meantime, here’s some information that may help as you continue your Salt journey.
Please be sure to review our Code of Conduct. Also, check out some of our community resources including:

There are lots of ways to get involved in our community. Every month, there are around a dozen opportunities to meet with other contributors and the Salt Core team and collaborate in real time. The best way to keep track is by subscribing to the Salt Community Events Calendar.
If you have additional questions, email us at saltproject@vmware.com. We’re glad you’ve joined our community and look forward to doing awesome things with you!

@garethgreenaway garethgreenaway added the Sulfur v3006.0 release code name and version label Sep 2, 2022
@garethgreenaway garethgreenaway added this to the Sulphur v3006.0 milestone Sep 2, 2022
@garethgreenaway garethgreenaway added the Needs-Testcase PR needs test cases written, or the issue is about a bug/feature that needs test cases label Sep 2, 2022
@garethgreenaway
Copy link
Contributor

@damonearp Thanks for the PR. Adding some tests for this change would be great, please let us know if you need assistance doing so. Thanks!

@Ch3LL
Copy link
Contributor

Ch3LL commented Sep 22, 2022

bump @damonearp did you see @garethgreenaway 's comment?

@damonearp
Copy link
Author

Hey. sorry for the delay. I need to set this back up and confirm this fix will work. In our custom states I didn't want to modify the salt python files under /usr so I attempted to just delete the token in the __context__ after each request... in practice this didn't solve our issue, as single use tokens were still being used multiple times.

To get everything working I upped the # of uses for the tokens generated, but haven't had the chance to revisit this change to see if it would have solved our issue.

Next week I'll reduce the token uses back to 1 and apply this patch to /usr before applying the state. Once I verify this patch solves the issue, I'll work on formalizing this PR.

@damonearp
Copy link
Author

I have verified that our custom states work when this change is made, and tokens are issued as single use.

Moving forward:

  1. While I'm not familiar with the project's testing setup I assume https://github.com/saltstack/salt/blob/master/tests/pytests/unit/utils/test_vault.py#L293-L310 and the following function would be a good template to follow. I can stare at it a few times to see if I can figure out how to properly mock a 404 followed by a write.

  2. I feel like this is a band-aide to a larger issue: This patch makes it so 1 use tokens are properly cleaned up if the response codes are 2xx, 403, and 404. But what about a 400, presumably the token is now "used". Would it be correct to simply move the check of response.ok to after the token count logic that follows immediately? Specifically move the if on L351-353 to L373? Or even just delete line 353, after all that'll leave the logging of the non 2xx response but still run the token cleanup.

I'm just not an expert on vault's API nor do I have the time to drill down into what is the correct approach.

lkubb added a commit to lkubb/salt-vault-formula that referenced this pull request Oct 5, 2022
This commit represents a fundamental rewrite in how Salt interacts with
Vault. The master should still be compatible with minions running the
old code. There should be no breaking changes to public interfaces and
the old configuration format should still apply.

Core:
- Issue AppRoles to minions
- Manage entities with templatable metadata for minions
- Use inbuilt Salt cache
- Separate config cache from token cache
- Cache: introduce connection-scope vs global scope

Utility module:
- Support being imported (__utils__ deprecation)
- Raise exceptions on queries to simplify response handling
- Add classes to wrap complexity, especially regarding KV v2
- Lay some groundwork for renewing tokens

Execution module:
- Add patch_secret
- Add version support to delete_secret
- Allow returning listed keys only in list_secret
- Add policy_[fetch/write/delete] and policies_list
- Add query for arbitrary API queries

State module:
- Make use of execution module
- Change output format

Docs:
- Update for new configuration format
- Correct examples
- Add configuration examples
- Add required policies

Fixes:
saltstack/salt#62552
saltstack/salt#59827
saltstack/salt#62380
saltstack/salt#58174

Probably fixes:
saltstack/salt#60779
saltstack/salt#57561

Might fix:
saltstack/salt#59846
Copy link
Contributor

@twangboy twangboy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still looking for some tests here and a changelog file

@garethgreenaway
Copy link
Contributor

@damonearp I was just reading the description of the scenario that this PR is fixing, is the count for the token(s) not being deprecated in Vault or is it Salt? If it's Vault then this seems like it might be an upstream problem?

@Ch3LL
Copy link
Contributor

Ch3LL commented Dec 5, 2022

@damonearp are you willing to come back to this and implement the feedback and answer @garethgreenaway 's question?

@damonearp
Copy link
Author

Wow so sorry I missed the notification of @garethgreenaway message.

The issue is that the salt code isn't decrementing the count on certain responses from vault. What lead me to find this is the situation when the secret does not exist. Salt returns with a 404 not a 2XX, and the code assumes only 2XX responses happen when a token's use is decremented in vault.

Here is an example custom state (this is heavily minimized and no longer a proper state):

# vaultkv.present updates or creates a secret setting the provided key value inside
def present(name, value, path):
    secret = __salt__["vault.read_secret"](path, metadata=True, default=dict())
    data = secret.setdefault("data", dict())
    val = data.get(name)
    if val is not None and deepequal(val, value):
        # we return here that the value is already the requested value
    data[name] = value
    if not __salt__["vault.write_raw"](path, data):
        # return error 
    # return success

And for argument sake let's say this is in our sls (doubt the jinja is right but it's only here as an example)

    ip:
      vaultkv.present:
        - value: {{ salt['grain.get']('network.ip_addrs')[0] }}
        - path: myapp/nodeinfo

If the kv secrets engine in vault myapp exists and does not have a secret named nodeinfo then this state would create it. However it needs either 2 1-use tokens or 1 2-use token. The vault.read_secret will get an HTTP 404 and salt thinks that the token didn't get decremented since response.ok will be false.

Does that answer your question?

@damonearp
Copy link
Author

In response to are you willing to comeback to this and implement the feedback?

  1. Yes, I've looked at Contributing link above however this is a whole new system for me to learn with little benefit so it's going to be slow.
  2. I'm still waiting for a response to what I wrote in Sept:

    I feel like this is a band-aide to a larger issue: This patch makes it so 1 use tokens are properly cleaned up if the response codes are 2xx, 403, and 404. But what about a 400, presumably the token is now "used". Would it be correct to simply move the check of response.ok to after the token count logic that follows immediately? Specifically move the if on L351-353 to L373? Or even just delete line 353, after all that'll leave the logging of the non 2xx response but still run the token cleanup.

@Ch3LL
Copy link
Contributor

Ch3LL commented Dec 6, 2022

@garethgreenaway any chance you can answer @damonearp 's questions and review his latest answer to your question since you have recently been in this code.

@Ch3LL Ch3LL added Chlorine v3007.0 and removed Sulfur v3006.0 release code name and version labels Dec 6, 2022
@damonearp
Copy link
Author

Seeing as there is no communication and I'm not working on it, I guess we close and just create a bug issue for this.

Bug: Vault Use Tokens aren't decremented on 404 and other valid HTTP statuses.

@nufissime
Copy link

I can't believe this still hasn't been fixed. The flaw in the code is obvious and fixing the bug only requires a few small changes.

@anilsil anilsil modified the milestones: Sulfur v3006.1, Sulfur v3006.2 May 1, 2023
@garethgreenaway
Copy link
Contributor

@lucasarnulphy The fix might be small but we do require tests for any pull requests that are changing any code. We do this for two main reasons, 1) to ensure that a later change doesn't break functionality that this current change is introducing and 2) to ensure that this change doesn't break existing functionality. For a long time we did not require tests and the state of the project got into a very bad state, so now they are always required regardless of the size of the change.

@dwoz
Copy link
Contributor

dwoz commented Dec 11, 2023

Closing this due to inactivity. Anyone should feel free to re-open it if they want to see it through to the end in one release cycle.

@dwoz dwoz closed this Dec 11, 2023
@dwoz dwoz added help-wanted Community help is needed to resolve this Abandoned labels Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Abandoned help-wanted Community help is needed to resolve this needs-changelog-file Needs-Testcase PR needs test cases written, or the issue is about a bug/feature that needs test cases pending-close
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants