Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Vault integration and validation #2226

Merged
merged 10 commits into from
Jan 23, 2017
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions command/agent/config-test-fixtures/basic.hcl
Original file line number Diff line number Diff line change
Expand Up @@ -121,6 +121,7 @@ vault {
key_file = "/path/to/key/file"
tls_server_name = "foobar"
tls_skip_verify = true
create_from_role = "test_role"
}
tls {
http = true
Expand Down
1 change: 1 addition & 0 deletions command/agent/config_parse.go
Original file line number Diff line number Diff line change
Expand Up @@ -710,6 +710,7 @@ func parseVaultConfig(result **config.VaultConfig, list *ast.ObjectList) error {
"ca_file",
"ca_path",
"cert_file",
"create_from_role",
"key_file",
"tls_server_name",
"tls_skip_verify",
Expand Down
1 change: 1 addition & 0 deletions command/agent/config_parse_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,7 @@ func TestConfig_Parse(t *testing.T) {
Addr: "127.0.0.1:9500",
AllowUnauthenticated: &trueValue,
Enabled: &falseValue,
Role: "test_role",
TLSCaFile: "/path/to/ca/file",
TLSCaPath: "/path/to/ca",
TLSCertFile: "/path/to/cert/file",
Expand Down
10 changes: 10 additions & 0 deletions nomad/structs/config/vault.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,13 @@ type VaultConfig struct {
// lifetime.
Token string `mapstructure:"token"`

// Role sets the role in which to create tokens from. The Token given to
// Nomad does not have to be created from this role but must have "update"
// capability on "auth/token/create/<create_from_role>". If this value is
// unset and the token is created from a role, the value is defaulted to the
// role the token is from.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? What is the use case, ever, for using a role to create a token to use the role?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backwards compatibility, we are in a point release, would break peoples configs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has anyone actually set it up like this? There has never been a need for this to be the required setup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was no other option which is what started this all 👍 If one was using role based tokens, they would be hit by the behavior change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was always an option, well before the first release with Vault integration.

Role string `mapstructure:"create_from_role"`

// AllowUnauthenticated allows users to submit jobs requiring Vault tokens
// without providing a Vault token proving they have access to these
// policies.
Expand Down Expand Up @@ -99,6 +106,9 @@ func (a *VaultConfig) Merge(b *VaultConfig) *VaultConfig {
if b.Token != "" {
result.Token = b.Token
}
if b.Role != "" {
result.Role = b.Role
}
if b.TaskTokenTTL != "" {
result.TaskTokenTTL = b.TaskTokenTTL
}
Expand Down
197 changes: 182 additions & 15 deletions nomad/vault.go
Original file line number Diff line number Diff line change
Expand Up @@ -47,11 +47,60 @@ const (
// vaultRevocationIntv is the interval at which Vault tokens that failed
// initial revocation are retried
vaultRevocationIntv = 5 * time.Minute

// vaultCapabilitiesLookupPath is the path to lookup the capabilities of
// ones token.
vaultCapabilitiesLookupPath = "/sys/capabilities-self"

// vaultTokenRenewPath is the path used to renew our token
vaultTokenRenewPath = "auth/token/renew-self"

// vaultTokenLookupPath is the path used to lookup a token
vaultTokenLookupPath = "auth/token/lookup"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add auth/token/lookup-self as a lookup possibility. Most tokens can be looked up using that path as it's in the default policy, so if auth/token/lookup doesn't work you should use that instead (and probably prefer it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this to limit the set of policies we have to document as Nomad requiring since we can reuse that for both looking our own token up and incoming tokens. But since I guard that validation (https://github.com/hashicorp/nomad/pull/2226/files#diff-87a4c76eca3de54716c55ba6b8c06475R682) I agree. Will make the change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using lookup-self can be a good secondary option. I agree that keeping lookup there is not an issue.


// vaultTokenRevokePath is the path used to revoke a token
vaultTokenRevokePath = "auth/token/revoke-accessor"

// vaultRoleLookupPath is the path to lookup a role
vaultRoleLookupPath = "auth/token/roles/%s"

// vaultRoleCreatePath is the path to create a token from a role
vaultTokenRoleCreatePath = "auth/token/create/%s"
)

var (
// vaultUnrecoverableError matches unrecoverable errors
vaultUnrecoverableError = regexp.MustCompile(`Code:\s+40(0|3|4)`)

// vaultCapabilitiesCapability is the expected capability of Nomad's Vault
// token on the the path. The token must have at least one of the
// capabilities.
vaultCapabilitiesCapability = []string{"update", "root"}

// vaultTokenRenewCapability is the expected capability Nomad's
// Vault token should have on the path. The token must have at least one of
// the capabilities.
vaultTokenRenewCapability = []string{"update", "root"}

// vaultTokenLookupCapability is the expected capability Nomad's
// Vault token should have on the path. The token must have at least one of
// the capabilities.
vaultTokenLookupCapability = []string{"update", "root"}

// vaultTokenRevokeCapability is the expected capability Nomad's
// Vault token should have on the path. The token must have at least one of
// the capabilities.
vaultTokenRevokeCapability = []string{"update", "root"}

// vaultRoleLookupCapability is the the expected capability Nomad's Vault
// token should have on the path. The token must have at least one of the
// capabilities.
vaultRoleLookupCapability = []string{"read", "root"}

// vaultTokenRoleCreateCapability is the the expected capability Nomad's Vault
// token should have on the path. The token must have at least one of the
// capabilities.
vaultTokenRoleCreateCapability = []string{"update", "root"}
)

// VaultClient is the Servers interface for interfacing with Vault
Expand Down Expand Up @@ -417,7 +466,7 @@ func (v *vaultClient) renewalLoop() {

// Set base values and add some backoff

v.logger.Printf("[DEBUG] vault: got error or bad auth, so backing off: %v", err)
v.logger.Printf("[WARN] vault: got error or bad auth, so backing off: %v", err)
switch {
case backoff < 5:
backoff = 5
Expand Down Expand Up @@ -477,8 +526,9 @@ func (v *vaultClient) renew() error {
// getWrappingFn returns an appropriate wrapping function for Nomad Servers
func (v *vaultClient) getWrappingFn() func(operation, path string) string {
createPath := "auth/token/create"
if !v.tokenData.Root {
createPath = fmt.Sprintf("auth/token/create/%s", v.tokenData.Role)
role := v.getRole()
if role != "" {
createPath = fmt.Sprintf("auth/token/create/%s", role)
}

return func(operation, path string) string {
Expand All @@ -497,7 +547,7 @@ func (v *vaultClient) getWrappingFn() func(operation, path string) string {
func (v *vaultClient) parseSelfToken() error {
// Get the initial lease duration
auth := v.client.Auth().Token()
self, err := auth.LookupSelf()
self, err := auth.Lookup(v.client.Token())
if err != nil {
return fmt.Errorf("failed to lookup Vault periodic token: %v", err)
}
Expand All @@ -516,7 +566,28 @@ func (v *vaultClient) parseSelfToken() error {
}
}

// Store the token data
data.Root = root
v.tokenData = &data

// The criteria that must be met for the token to be valid are as follows:
// 1) If token is non-root or is but has a creation ttl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tokens should never be root unless Nomad is in dev mode.

Copy link
Contributor Author

@dadgar dadgar Jan 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't restrict people to that. That is an organizational decision not one Nomad will enforce on people. We just highly discourage it in our docs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use a root token and would really like to keep doing that... we only use vault for ease of storage, not for the hardcore security and ACL level stuff

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dadgar Per our original discussions around implementation it was never supposed to be allowed outside of dev mode in the first place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jippi You should still not use root tokens, if for no other reason than that they are quite difficult to generate at this point. Just use a periodic token with policies giving all privileges to /*.

// a) The token must be renewable
// b) Token must have a non-zero TTL
// 2) Must have update capability for "auth/token/lookup/" (used to verify incoming tokens)
// 3) Must have update capability for "/auth/token/revoke-accessor/" (used to revoke unneeded tokens)
// 4) If configured to create tokens against a role:
// a) Must have read capability for "auth/token/roles/<role_name" (Can just attemp a read)
// b) Must have update capability for path "auth/token/create/<role_name>"
// c) Role must:
// 1) Not allow orphans
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing that this is copypasta from Asana but you probably mean must allow orphans. Or should allow orphans.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was not changing this behavior in this PR. Down the line we may want to relax this constraint and let the operator decide if they would like to allow orphans. But for now we are not allowing it so that when Nomad revokes a token the whole tree is revoked.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By extension this means that if Nomad's token becomes invalid all jobs also now have invalid tokens. This is not the suggested operational model that we tell people to use; we recommend orphan periodic tokens for services. There is no reason to artificially impose this upon administrators; they should be free to use their own judgement, especially since you're already requiring periodic tokens.

// 2) Must allow tokens to be renewed
// 3) Must not have an explicit max TTL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can limit job length, allowing an explicit max TTL is a decent idea as an additional check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a bad idea for those who want it but for now I am keeping it at being a periodic token with no max TTL. We can revisit in another PR/release

// 4) Must have non-zero period
// 5) If not configured against a role, the token must be root
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not configured against a role, you still shouldn't require a root token. There's no need. You could require a token that has sudo capability against auth/token/create. Or you could just try to make a token and see if it works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was based on your own suggestion:

5a) Don't require this, it's nearly the same as requiring a root token.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. Don't require a root token. You also shouldn't require a sudo token, but if you insist that you really must have something with super high privileges on that endpoint, make it a sudo token so that you aren't encouraging people to pass in root tokens.

What you should do instead is to just try making the token, and it will work or it will not. There's no reason to be running these checks.


var mErr multierror.Error
role := v.getRole()
if !root {
// All non-root tokens must be renewable
if !data.Renewable {
Expand All @@ -534,7 +605,7 @@ func (v *vaultClient) parseSelfToken() error {
}

// There must be a valid role since we aren't root
if data.Role == "" {
if role == "" {
multierror.Append(&mErr, fmt.Errorf("token role name must be set when not using a root token"))
}

Expand All @@ -548,18 +619,106 @@ func (v *vaultClient) parseSelfToken() error {
}
}

// Check we have the correct capabilities
if err := v.validateCapabilities(role, root); err != nil {
multierror.Append(&mErr, err)
}

// If given a role validate it
if data.Role != "" {
if err := v.validateRole(data.Role); err != nil {
if role != "" {
if err := v.validateRole(role); err != nil {
multierror.Append(&mErr, err)
}
}

data.Root = root
v.tokenData = &data
return mErr.ErrorOrNil()
}

// getRole returns the role name to be used when creating tokens
func (v *vaultClient) getRole() string {
if v.config.Role != "" {
return v.config.Role
}

return v.tokenData.Role
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentioned above why it is like this for now

}

// validateCapabilities checks that Nomad's Vault token has the correct
// capabilities.
func (v *vaultClient) validateCapabilities(role string, root bool) error {
// Check if the token can lookup capabilities.
var mErr multierror.Error
_, _, err := v.hasCapability(vaultCapabilitiesLookupPath, vaultCapabilitiesCapability)
if err != nil {
// Check if there is a permission denied
if vaultUnrecoverableError.MatchString(err.Error()) {
// Since we can't read permissions, we just log a warning that we
// can't tell if the Vault token will work
msg := fmt.Sprintf("Can not lookup token capabilities. "+
"As such certain operations may fail in the future. "+
"Please give Nomad a Vault token with one of the following "+
"capabilities %q on %q so that the required capabilities can be verified",
vaultCapabilitiesCapability, vaultCapabilitiesLookupPath)
v.logger.Printf("[WARN] vault: %s", msg)
return nil
} else {
multierror.Append(&mErr, err)
}
}

// verify is a helper function that verifies the token has one of the
// capabilities on the given path and adds an issue to the error
verify := func(path string, requiredCaps []string) {
ok, caps, err := v.hasCapability(path, requiredCaps)
if err != nil {
multierror.Append(&mErr, err)
} else if !ok {
multierror.Append(&mErr,
fmt.Errorf("token must have one of the following capabilities %q on %q; has %v", requiredCaps, path, caps))
}
}

// Check if we are verifying incoming tokens
if !v.config.AllowsUnauthenticated() {
verify(vaultTokenLookupPath, vaultTokenLookupCapability)
}

// Verify we can renew our selves tokens
verify(vaultTokenRenewPath, vaultTokenRenewCapability)

// Verify we can revoke tokens
verify(vaultTokenRevokePath, vaultTokenRevokeCapability)

// If we are using a role verify the capability
if role != "" {
// Verify we can read the role
verify(fmt.Sprintf(vaultRoleLookupPath, role), vaultRoleLookupCapability)

// Verify we can create from the role
verify(fmt.Sprintf(vaultTokenRoleCreatePath, role), vaultTokenRoleCreateCapability)
}

return mErr.ErrorOrNil()
}

// hasCapability takes a path and returns whether the token has at least one of
// the required capabilities on the given path. It also returns the set of
// capabilities the token does have as well as any error that occured.
func (v *vaultClient) hasCapability(path string, required []string) (bool, []string, error) {
caps, err := v.client.Sys().CapabilitiesSelf(path)
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"root" in the required slice should be conditional on Nomad being in dev mode.

return false, nil, err
}
for _, c := range caps {
for _, r := range required {
if c == r {
return true, caps, nil
}
}
}
return false, caps, nil
}

// validateRole contacts Vault and checks that the given Vault role is valid for
// the purposes of being used by Nomad
func (v *vaultClient) validateRole(role string) error {
Expand All @@ -575,10 +734,11 @@ func (v *vaultClient) validateRole(role string) error {

// Read and parse the fields
var data struct {
ExplicitMaxTtl int `mapstructure:"explicit_max_ttl"`
Orphan bool
Period int
Renewable bool
ExplicitMaxTtl int `mapstructure:"explicit_max_ttl"`
Orphan bool
Period int
Renewable bool
DisallowedPolicies []string `mapstructure:"disallowed_policies"`
}
if err := mapstructure.WeakDecode(rsecret.Data, &data); err != nil {
return fmt.Errorf("failed to parse Vault role's data block: %v", err)
Expand All @@ -602,6 +762,12 @@ func (v *vaultClient) validateRole(role string) error {
multierror.Append(&mErr, fmt.Errorf("Role must have a non-zero period to make tokens periodic."))
}

for _, d := range data.DisallowedPolicies {
if d == "default" {
multierror.Append(&mErr, fmt.Errorf("Role can not disallow allow default policy"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure it can.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good catch! That was for me while I was testing something!

}
}

return mErr.ErrorOrNil()
}

Expand Down Expand Up @@ -679,12 +845,13 @@ func (v *vaultClient) CreateToken(ctx context.Context, a *structs.Allocation, ta
// token or a role based token
var secret *vapi.Secret
var err error
if v.tokenData.Root {
role := v.getRole()
if v.tokenData.Root && role == "" {
req.Period = v.childTTL
secret, err = v.auth.Create(req)
} else {
// Make the token using the role
secret, err = v.auth.CreateWithRole(req, v.tokenData.Role)
secret, err = v.auth.CreateWithRole(req, v.getRole())
}

// Determine whether it is unrecoverable
Expand Down
Loading