Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACL Token Persistence and Reloading #5328

Merged
merged 8 commits into from
Feb 27, 2019
Merged

Conversation

mkeeler
Copy link
Member

@mkeeler mkeeler commented Feb 11, 2019

This PR adds two features which will be useful for operators when ACLs are in use.

  1. Tokens set in configuration files are now reloadable.
  2. If acl.enable_token_persistence is set to true in the configuration, tokens set via the v1/agent/token endpoint are now persisted to disk and loaded when the agent starts (or during configuration reload)

Note that token persistence is opt-in so our users who do not want tokens on the local disk will see no change.

Some other secondary changes:

  • Refactored a bunch of places where the replication token is retrieved from the token store. This token isn't just for replicating ACLs and now it is named accordingly.
  • Allowed better paths in the v1/agent/token/ API. Instead of paths like: v1/agent/token/acl_replication_token the path can now be just v1/agent/token/replication. The old paths remain to be valid.
  • Added a couple new API functions to set tokens via the new paths. Deprecated the old ones and pointed to the new names. The names are also generally better and don't imply that what you are setting is for ACLs but rather are setting ACL tokens. There is a minor semantic difference there especially for the replication token as again, its no longer used only for ACL token/policy replication.
  • Docs updated to reflect the API additions and to show using the new endpoints.
  • Updated the ACL CLI set-agent-tokens command to use the non-deprecated APIs. I kind of went back and forth on this one. We may want to let it continue using the deprecated ones so that it will work with 1.4.0 - 1.4.2. However people could always just used the older CLI which is probably what they will be using if they haven't upgraded things to 1.4.3 yet? If anyone has concerns about this I would be open to flipping it back to the deprecated APIs.

@mkeeler mkeeler added theme/acls ACL and token generation theme/operator-usability Replaces UX. Anything related to making things easier for the practitioner labels Feb 11, 2019
@mkeeler mkeeler added this to the 1.4.3 milestone Feb 11, 2019
@mkeeler mkeeler requested review from a team February 11, 2019 16:11
agent/agent.go Outdated Show resolved Hide resolved
}

func (a *Agent) loadTokens(conf *config.RuntimeConfig) error {
persistedTokens, persistenceErr := a.getPersistedTokens()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not hard-fail if the file is unreadable or corrupt? It seems like an unlikely event that should warrant extraordinary operational attention.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it could be a hard fail. I modeled this off of other load* functions which did similarly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a corrupt file prevent the agent from starting at all? Thats what would happen with a hard fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should corrupt persisted services prevent the agent from starting up?. I think the answer is no and we should instead just log the warning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go for worst case scenario, the agent isn't going to be terribly useful on restart if the sole store of tokens is that file, it's corrupt, and the operator is has a full belt-and-suspenders ACL setup configured.

Sure the agent will start, but it won't have any acl tokens with which to operate. If you had a fleet of hundreds of agents how can you tell the difference between an operational one and one that starts up but has no valid tokens?

But then I guess this is nearly indistinguishable from an agent having all of its acl tokens be rotated into the bit bucket already today.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but with the agent up you can remedy the situation by PUTing new tokens to the API. Which is better than needing to, remote in, blow away/fix the corrupt file and restart (maybe still need to repost up tokens depending on the corruption).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well if you were being secure about it, the API would only be bound to loopback anyway, so remoting in is still required.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that the only secure way to enable the API is bound to loopback. If using HTTPs + ACLs I would allow it on the LAN.

Additionally, even if not able to directly connect to the API I would say that doing a consul acl set-agent-token ... is much easier/reliable then having to hand edit files (which is still always a viable alternative).

@mkeeler mkeeler requested review from a team and rboyer February 22, 2019 21:20
@mkeeler mkeeler force-pushed the feature/acl-token-persistence branch 2 times, most recently from ac1f9c5 to 755801c Compare February 25, 2019 20:31
Copy link
Contributor

@kyhavlov kyhavlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, Matt. Just one comment/question but otherwise 👍

agent/agent.go Outdated Show resolved Hide resolved
mkeeler and others added 6 commits February 26, 2019 13:27
Before it sounded to ACL specific (disregarding the fact that it is an ACL token). This token also gets used for connect replication.

Also the agent API for setting tokens now recognizes simpler names in the URL for setting the tokens which match their names in the consul configuration. The API has a few new functions but left the others in place for backwards compatibility.
Co-Authored-By: mkeeler <mkeeler@users.noreply.github.com>
@mkeeler mkeeler force-pushed the feature/acl-token-persistence branch from d6a09a5 to 70b6dd2 Compare February 26, 2019 18:40
@mkeeler mkeeler requested a review from kyhavlov February 26, 2019 18:46

default:
resp.WriteHeader(http.StatusNotFound)
fmt.Fprintf(resp, "Token %q is unknown", target)
return nil, nil
}

if s.agent.config.ACLEnableTokenPersistence {
tokens := make(map[string]string)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The persistedTokens struct could be used here as well

When setting a token if a 404 is returned then we will fallback to the original names to make the request.
@mkeeler mkeeler merged commit 118adbb into master Feb 27, 2019
@mkeeler mkeeler deleted the feature/acl-token-persistence branch March 4, 2019 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
theme/acls ACL and token generation theme/operator-usability Replaces UX. Anything related to making things easier for the practitioner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants