Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New Resource: aws_kms_grant #3038

Merged
merged 7 commits into from
Mar 12, 2018
Merged

New Resource: aws_kms_grant #3038

merged 7 commits into from
Mar 12, 2018

Conversation

jammerful
Copy link
Contributor

@jammerful jammerful commented Jan 17, 2018

Hey,

Basically I think this is close to being ready to merge, but I see two flaws with it:

  • Dealing with the InvalidArnException upon creation of a grant that depends on a newly created role. I have a retry in my code, but it will just retry in the case that the arn is actually invalid. I don't see a way to distinguish between eventual consistency and a truly invalid arn.
    • I think this maybe okay behavior.
  • In a similar vein, if you refresh the state quickly after creating a role and grant (as is the case with the acceptance tests currently). The read function will sometimes get an internal identifier (AROAXXXXX instad of arn:aws:XXX) to the grantee principal instead of an ARN, which would if you naively take the value be written to the state file and cause a new resource to be created since grants have no way of being updated.
    • What I did, which I saw done in another resource file, is if the read doesn't return valid grant or retiree principals then don't update the state file for those attributes and log a warning. This will only happen in two cases ~1 minute after iam principal creation (aws acknoleged eventual consistency problem) and if one of the principals has been deleted (in cases where it is managed outside of terraform).

If merged this resolves #641

I have run the acceptance tests in their current form, and they pass:

$ make testacc TESTARGS='-run=TestAWSKmsGrant'
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./... -v -run=TestAWSKmsGrant -timeout 120m
?   	github.com/terraform-providers/terraform-provider-aws	[no test files]
=== RUN   TestAWSKmsGrant_Basic
--- PASS: TestAWSKmsGrant_Basic (39.08s)
=== RUN   TestAWSKmsGrant_withConstraints
--- PASS: TestAWSKmsGrant_withConstraints (30.75s)
=== RUN   TestAWSKmsGrant_withRetiringPrincipal
--- PASS: TestAWSKmsGrant_withRetiringPrincipal (39.39s)
=== RUN   TestAWSKmsGrant_bare
--- PASS: TestAWSKmsGrant_bare (38.95s)
PASS
ok  	github.com/terraform-providers/terraform-provider-aws/aws	148.205s

I have also run the unit tests and they pass:

$ make test
==> Checking that code complies with gofmt requirements...
go test ./... -timeout=30s -parallel=4
?   	github.com/terraform-providers/terraform-provider-aws	[no test files]
ok  	github.com/terraform-providers/terraform-provider-aws/aws	1.765s

@bflad bflad added new-resource Introduces a new resource. service/kms Issues and PRs that pertain to the kms service. labels Jan 17, 2018
@radeksimko radeksimko changed the title Add AWS KMS Grant Resource New Resource: aws_kms_grant Jan 18, 2018
// an InvalidArnException to be thrown. TODO: Possibly change the aws_iam_role code?
if awsErr.Code() == "DependencyTimeoutException" ||
awsErr.Code() == "InternalException" ||
awsErr.Code() == "InvalidArnException" {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the first issue I mentioned.

// Confirmed with AWS that this is expected behavior.
// TODO: Handle upstream or here?
if grant != nil {
if !strings.HasPrefix(*grant.GranteePrincipal, "arn:aws") {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the second issue I mentioned.

@ghost ghost added the size/XL Managed by automation to categorize the size of a PR. label Feb 9, 2018
@jammerful
Copy link
Contributor Author

jammerful commented Feb 9, 2018

@radeksimko @bflad @paddycarver @Ninir @Puneeth-n Can you please take a look at this soon?

@kikitux
Copy link
Contributor

kikitux commented Feb 27, 2018

Sorry for the delay, finding someone for a review

Copy link
Contributor

@paddycarver paddycarver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in reviewing this. Thanks for opening the PR! It looks nice overall, but I had some feedback.

ValidateFunc: validateArn,
},
"operations": {
Type: schema.TypeList,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does order matter here? Can the API return items in a different order without breaking its contract? And does it make sense to make the elements of this list available for interpolation?

If the answer to all of these is "no", then this should probably be a Set. If the answer to the first two is no but the third is yes, that should probably be a conversation we have and come to a decision about how to make this work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The answer to all of these is no. Will make a set, I think the real benefit here is the set avoids duplication which could cause errors upon creation.

ForceNew: true,
},
"constraints": {
Type: schema.TypeList,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions here, re: making it a Set possibly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will make a set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm almost done with my revision, so I'll update the PR tomorrow. Sorry for the delay, the hash function for constraints now that it is a set is causing me a little heartache.

Type: schema.TypeMap,
Optional: true,
Elem: schema.TypeString,
ConflictsWith: []string{"constraints.0.encryption_context_subset"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure that ConflictsWith works this way. E.g., I don't know if setting the encryption context subset on the first element will allow the second element to have encryption context equals set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It worked it my testing, I'm trying not allow both encryption_context_subset and encryption_context_equals both set, since the AWS KMS Grant API does not allow that. I'll have to change how that looks with the move to sets. Make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I get what you were trying to do here, the syntax is just a bit :/ It may be worthwhile to just implement that logic manually to throw an error in the Create function.

ValidateFunc: validateArn,
},
"grant_creation_tokens": {
Type: schema.TypeList,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same questions here about whether this should be a Set or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will move to a set, as the tokens should be unique. I don't see this a generating a lot of value, but I'll move it to a set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main value is that if the API ever changes the order, or the user changes the order, the grant doesn't need to be recreated for no real purpose.

}
if grant != nil {
// The grant sometimes contains principals that identified by their unique id: "AROAJYCVIVUZIMTXXXXX"
// instead of "arn:aws:...", in this case don't update the state file
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic worries me, as not setting things in state allows for drift between state and reality, which can be the source of some gnarly bugs. What conditions make principals return as their unique IDs? Is there any way to translate the unique IDs into arn:aws:... format?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, you acknowledged this in the PR body. In this case, I think it'd be better to retry until we get an ID in the arn:aws: format (or a max threshold is hit) than to not set state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is one of the big question that I had.

I tried putting it in the retry earlier, the real problems is that if the IAM Principal/Role is managed outside of terraform and it is deleted before the grant is deleted the refresh step will fail, and if the person wants to delete the grant even if we let the error through it will be stuck on refreshing the state for a few minutes for nothing which seems like pretty un-optimal behavior.

So there are two times the internal ID will be not be resolved:

  1. The IAM role is newly created and hasn't propagated through the system (eventual consistency)
  2. The IAM role has been deleted, and no amount of retries will help here.

That mapping from internal ID to arn:aws:... is exactly what AWS itself is trying to do on the backend, I wouldn't try to re-implement that here.

Let me know what you think is the best path, what I've written above is what makes the most sense to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, the second case does give me pause. Another silly question: is it worth storing the internal ID? What's the bad stuff that happens, in that case? That would protect us from server drift while side-stepping the eventual consistency problem. And if the IAM role has been deleted, I think it'd be better for Terraform to make some noise about that, rather than pretending everything is fine, right?

Copy link
Contributor Author

@jammerful jammerful Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could store the internal ID.
But think of this case: the only thing is that typically Create returns the Read function, which will in cases of new IAM roles cause internal IDs to be stored. Then when lets say some just applies terraform again, the change from internal ID to ARN will cause new resources to be created. So I think it may cause unnecessary resource churn.

Let me ensure that there isn't something clever I'm missing, I'll ask a few people I know that have dealt with these eventual consistency issues. Would it be possible for you to ask a few co-workers as well?

m["encryption_context_subset"] = pointersMapToStringList(constraint.EncryptionContextSubset)
}

return []interface{}{m}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we only have one constraint? Why does the expand version loop over it then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one of encryption_context_subset or encrytpion_context_equals can be set, but that itself can be a map. Will have to refactor this as part of the move to sets.


// Custom error, so we don't have to rely on
// the content of an error message
type KmsGrantMissingError struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could just be a string. type KmsGrantMissingError string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix.

}

resource "aws_kms_grant" "%s" {
name = "%s"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to be unique?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No grant names don't need to be unique, they are optional. I just put names incase of orphaned resources it will be easier to identify them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks! I just wanted to be sure we wouldn't run into an error if the resource got orphaned, which we usually try to avoid using randomized names. Sounds like we don't need a randomized name, here, though.


if _, ok := validTypes[value]; !ok {
es = append(es, fmt.Errorf(
"%s must be one of the following: [ Decrypt, Encrypt, GenerateDataKey, GenerateDataKeyWithoutPlaintext, ReEncryptFrom, ReEncryptTo, CreateGrant, RetireGrant, DescribeKey ]", k))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we programmatically generate this list, to prevent drift?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will pull from the above map above thanks. By the way this kind of duplication is pretty common in this file, is probably worth cleaning up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets follow the suggestion in this PR, your suggestion is welcome but lets focus the conversation on the PR it self

alvaro

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, was just bringing a small possible issue to your attention.

func validateAwsKmsGrantOperation(v interface{}, k string) (ws []string, es []error) {
value := v.(string)

validTypes := map[string]bool{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a map[string]struct{} with each value set to struct{}{}, just to denote that the value of the map has no meaning.

Copy link
Contributor Author

@jammerful jammerful Feb 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix thanks.

@jammerful
Copy link
Contributor Author

jammerful commented Feb 28, 2018

@paddycarver Thanks for the review. I left some comments, most of your concerns will be fixed. Except for the one about not loading in IAM principals into terraform state if it doesn't resolve to an ARN--I left some reasons on why retrying isn't optimal (maybe if we do something like if d.isNewResource but even then you really want that to be on the IAM principal not if the grant is new).

I did have one other concern, about the scalability of this approach. I was stress testing my code, and it can handle about ~200 grants managed in one resource file, if it is more the resource refreshing takes longer than the read timeout. The reason for the scale limit, is that for each resource/grant we are calling list grants, which will get throttled by AWS. Would it make sense to have a global list of grants? Have you seen other resources implement this (resources that don't have a Describe* call)? I could see this getting messy.

I will upload a new commit when finished tonight or tomorrow.

Will squash commits when ready to merge.
@paddycarver
Copy link
Contributor

Thanks for the work on this and the careful explanations! For the scale part... The only solution I can think of at the moment is to introduce user-specified timeouts, like this: https://github.com/terraform-providers/terraform-provider-aws/blob/177a19851fbbf8ec6a54f6733312ae205d7327d3/aws/resource_aws_ami_from_instance.go#L30-L34

But I'm toying with the idea of just waiting to see if people hit that timeout case, and then adding the custom timeouts if that turns into an issue people have. I don't know how common it is, unfortunately, to create 200 grants.

There were murmurs and talks of eventually adding some sort of request batching capabilities to Terraform, but I don't think that's something people were talking about for the near future. Though that would ideally be the "real" solution to this, reusing a single List call across resources.

Unfortunately, I think trying to build batching in using a single List call at the moment would be buggy and error-prone, so I'd advise against it unless we discover we absolutely must implement it.

@jammerful
Copy link
Contributor Author

jammerful commented Mar 2, 2018

I may add user specific timeouts, is that fine? I think I'll have a use for them myself. I'll update my PR tomorrow, sorry for the delay.

Will squash commits when ready to merge.
@ghost ghost added the size/XL Managed by automation to categorize the size of a PR. label Mar 2, 2018
log.Print -> log.Printf
Will squash
@jammerful
Copy link
Contributor Author

@paddycarver Addressed your comments. Please take a look.

Regarding sometimes not setting the roles in state, the only thing I could think of is to always store the role id in the state file. However, I this won't work if the role is from another account since the caller won't have GetRole permissions on that role from another account.


// The hash needs to encapsulate what type of constraint it is
// as well as the keys and values of the constraint.
func resourceKmsGrantConstraintsHash(v interface{}) int {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to fix this up. As of now, I know I at least need to sort the maps so that the constraint gets consistent hashes. I couldn't find any common utilities for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@ghost ghost added the size/XL Managed by automation to categorize the size of a PR. label Mar 6, 2018
@jammerful
Copy link
Contributor Author

@paddycarver I think all of the initial comments have been resolved. Please take another look.

I still don't have a sense if not setting the roles in state is a non-starter/blocker. Keep in mind there is no way to modify a grant, so I wouldn't be extremely concerned about drift between the state file and what is in AWS.

input.Name = aws.String(v.(string))
}
if v, ok := d.GetOk("constraints"); ok {
if !kmsGrantConstraintsIsValid(v.(*schema.Set)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how much value this is delivering, I really liked when we used conflicts with in the schema because it validates this before going out to AWS. Right now this is caught right before AWS would let us know of the conflict, after other resources have already been created.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love this to fail at plan time, too, but in the interest of moving this forward, I wanted to keep our scope here simple and not introduce complicated things like CustomDiff or ConflictsWith on a Set, which I've never used on a Set before, and don't know the intricacies of how it works. @bflad or @paultyng may have more insight on that.

It's almost certainly possible to have this fail at plan time. I wouldn't consider making that happen a blocker, but if you'd like it to be in the first pass, I'm happy to have it included.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address in a follow up PR, as I mentioned in another comment. Might make the most sense to move this back to a list? The set adds a lot of additional code, as well as not adding much value I think given that there can only be one element.

@jammerful
Copy link
Contributor Author

@paddycarver @kikitux If you can't look at this soon could I please get an indication of whether you are inclined to merge this, or are there major issues preventing this from being merged?

@copumpkin
Copy link

Unfortunately, I think trying to build batching in using a single List call at the moment would be buggy and error-prone, so I'd advise against it unless we discover we absolutely must implement it.

@paddycarver if I understand correctly, without batching we get basically quadratic (actually O(m * n) where m is number of TF grant resources and n is number of grants in the account, which after the first time is always >= m). My concern here is that unlike other resources with similar issues (I think KMS aliases), grants are often quite numerous. One of my keys today has several hundred grants on it, the default limit per CMK is 2500 (and I've had it raised by AWS support in the past), and I'd like to manage many of those grants with KMS. That seems like it'll become untenable pretty quickly, unless you have a small handful of grants. But with a small number of grants, you might as well just encode it in the key policy.

I get that Terraform doesn't have a proper generic solution for this problem yet, but it doesn't seem awful to maintain a bit of state across grant resource calls so they don't each have to list grants from scratch.

Copy link
Contributor

@bflad bflad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jammerful! 👋 I'm just leaving a drive-by code review for a few things since this PR popped up on my radar, minus looking at the eventual consistency code. Architecturally, I do think this resource fits into the Terraform model if we can figure out the eventual consistency issues (I'll let @paddycarver keep working with you on that angle unless it gets handed off).

Please reach out with any questions!

Set: schema.HashString,
Elem: &schema.Schema{
Type: schema.TypeString,
ValidateFunc: validateAwsKmsGrantOperation,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid a custom validation function and associated unit test by using Terraform core's validation.StringInSlice() 😄 (it should also use SDK provided constants)

ValidateFunc: validation.StringInSlice([]string{
  kms.GrantOperationCreateGrant,
  kms.GrantOperationDecrypt,
  kms.GrantOperationDescribeKey,
  kms.GrantOperationEncrypt,
  kms.GrantOperationGenerateDataKey,
  kms.GrantOperationGenerateDataKeyWithoutPlaintext,
  kms.GrantOperationReEncryptFrom,
  kms.GrantOperationReEncryptTo,
  kms.GrantOperationRetireGrant,
}, false),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

out, err = conn.CreateGrant(&input)

if err != nil {
if awsErr, ok := err.(awserr.Error); ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can avoid dealing with awserr directly by using the isAWSErr() helper function we have. 😄 (similarly the SDK usually provides constants for the error codes)

if isAWSErr(err, kms.ErrCodeDependencyTimeoutException, "") {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if grant == nil {
log.Printf("[WARN] %s KMS grant id not found for key id %s, removing from state file", grantId, keyId)
d.SetId("")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return nil so the if grant != nil { below (and its nesting) is unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return nil
}

// Retiring grants requires special permissions (i.e. the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this behavior should be noted in the documentation. Better yet, it would probably be best to provide an attribute flag to control the behavior (e.g. retire_on_delete) that defaults to true so we can always return an error if an expected deletion (retire or just revoke) is not possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. However, I inverted the default to false. Functionally retiring and revoking are the same, they don't show up in the ListGrants call, however retiring requires special permissions. It seemed to me that defaulting to that would be noisy and cause confusion.

return err
}

d.SetId("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: d.SetId("") is no longer required in delete functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Didn't realize that.

_, err := conn.RevokeGrant(&input)

if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should allow non-existent grants to skip returning an error so its not required to do a terraform state rm:

if isAWSErr(err, kms.ErrCodeNotFoundException, "") {
  return nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch. Done.


// Custom error, so we don't have to rely on
// the content of an error message
type KmsGrantMissingError string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating our own error class, can we instead utilize the upstream SDK's awserr.Error and set to a code like kms.ErrCodeNotFoundException?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kms.ErrCodeNotFoundException is a string, and I wanted a type or code to use so I don't have to key off the error message. Might be missing something here?

}

log.Printf("[DEBUG] Created new KMS Grant: %s", *out.GrantId)
d.SetId(*out.GrantId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that grant IDs are really under a key ID, we might want to consider making the resource ID in Terraform contain both, e.g. KeyID:GrantID. This will allow us to import them.

Saving as part of the resource identifier:

// Using keyID := d.Get("key_id").(string) above input
d.SetId(fmt.Sprintf("%s:%s", keyID, aws.StringValue(out.GrantId)))

Parsing anytime necessary:

keyID, grantID, err := decodeKmsGrantId(d.Id())
if err != nil {
  return err
}
// ...
func decodeKmsGrantId(id string) (string, string, error) {
	parts := strings.Split(id, ":")
	if len(parts) != 2 {
		return "", "", fmt.Errorf("Unexpected format of ID (%q), expected KeyID:GrantID", id)
	}
	return parts[0], parts[1], nil
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ghost ghost added the size/XL Managed by automation to categorize the size of a PR. label Mar 9, 2018
@paultyng
Copy link
Contributor

paultyng commented Mar 9, 2018

@copumpkin I think the issue is what if there is a grant that depends on a grant, etc, things like that? If you needed the token to create another grant for example then that cache starts to work against you, which is why it hasn't been used in other places.

I think since the grants are listed by key you could just create an aws_kms_grants resource that worked in bulk potentially where the ID was the key? And it could just share code here? Just a thought at least, I'm not very familiar with KMS, and that may also not be able to leverage count as effectively, etc.

Edit: here are some example bulk resources:

@copumpkin
Copy link

copumpkin commented Mar 9, 2018

@paultyng that's an idea that occurred to us, but is there precedent for it in Terraform? Edit: I see you've amended your comment to give some examples, thanks!

Also, is there a good place (like a TF contributors Slack) to chat live about this sort of thing? Might help sort through these concerns more quickly. There's a few options that have occurred to us but it's hard to factor in what might work vs. trying to guess what HashiCorp might be willing to merge.

Copy link
Contributor

@paddycarver paddycarver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @copumpkin / @jammerful, apologies for not getting back to you last week--my week got away from me with other providers. :/ I'm glad @paultyng and @bflad were able to jump in, because they know way more about this than I do. I took another pass on this, but I defer to either @bflad or @paultyng about the batching question. My instinct is to not borrow trouble, and address it if it pops up, but I'm not the most familiar with KMS, so that may be a sub-optimal stance. I do agree with @paultyng that the separate, plural resource has been a successful pattern for batching things in the past where there are too many resources to handle 1-by-1. I'm, unfortunately, not enough of a domain expert to be of much use in deciding whether it will be useful in this case or not. :(

As for the eventual consistency, I'm willing to accept the WARN log, and conditionally not setting it if we have an issue there. I don't like it, I'm not super happy with it, but the risk seems relatively small, and I think it's probably premature optimisation on my part to try and get rid of it. If @bflad disagrees, however, and wants it handled, I will happily propose workarounds and try to make that work.

I left a couple minor comments, but I don't think there's anything I see here that I'd hold the merge over. I'll circle up with @bflad in a couple of hours and make sure he and I are on the same page and get this merged, or report back if necessary.

As for a Slack, we only have a Slack channel for maintainers, at the moment, unfortunately. I apologise for the slow-motion code review and guesswork involved here. My general rule of thumb about what I'll accept (@bflad may have a different rule of thumb, and as he's a lead for the AWS provider, I'll defer to him) is:

  • Will this work in unexpected ways for users or break on users?
  • Does this use the framework in unusual or unorthodox ways, which may trigger an edge case in the future? If so, can we avoid that for a more mainstream approach? I'm generally more of a "follows the semantics" than a "technically works" reviewer, personally.
  • Is there an example of a similar resource that's already in Terraform? Does it do something differently? Why?
  • If we need to change this in the future, can we do it without breaking backwards compatibility? If not, how likely is it that we'll need to change this in the future?

I know that's not an enormously helpful list, but a good test is: does something like this already exist in Terraform? If yes, that's a good indicator it'll probably be acceptable. If no, then there may be some back and forth about it.

I'll report back in a few hours when the Eastern US is awake and I can make sure you're not going to get three separate answers on this.

input.Name = aws.String(v.(string))
}
if v, ok := d.GetOk("constraints"); ok {
if !kmsGrantConstraintsIsValid(v.(*schema.Set)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love this to fail at plan time, too, but in the interest of moving this forward, I wanted to keep our scope here simple and not introduce complicated things like CustomDiff or ConflictsWith on a Set, which I've never used on a Set before, and don't know the intricacies of how it works. @bflad or @paultyng may have more insight on that.

It's almost certainly possible to have this fail at plan time. I wouldn't consider making that happen a blocker, but if you'd like it to be in the first pass, I'm happy to have it included.

return nil, NewKmsGrantMissingError(fmt.Sprintf("[DEBUG] Grant %s not found for key id: %s", grantId, keyId))
}

// Can't have both constraint options set:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll admit I'm confused by this part. You're only allowed to have 1 constraint, right? If you're only allowed 1 constraint, is the only reason constraints is a TypeSet for the block syntax? If that's the case, we could use MaxItems: 1 on constraints to make sure no more than 1 block is specified. Doesn't obviate the need for this function, but can simplify it a little.

Note: This is not a blocker for merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting MaxItems won't work given the way the constraint set is being hashed, it ends up being one item with two different sub-attributes. Maybe I should have left it as a list, mostly had moved it on your suggestion.

Copy link
Contributor

@paddycarver paddycarver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked with @bflad about this, and we got on the same page. This is totally good to go and ready to merge, with one caveat: it'd be great to see the ARN eventual consistency behavior documented. Feel free to ping me when you do that (or if you don't have time, reply here and I'll do it) and we'll get this merged.


* `name` - (Optional, Forces new resources) A friendly name for identifying the grant.
* `key_id` - (Required, Forces new resources) The unique identifier for the customer master key (CMK) that the grant applies to.
* `grantee_principal` - (Required, Forces new resources) The principal that is given permission to perform the operations that the grant permits.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we document here the eventual consistency problem and workaround we have in place, with the ARN weirdness?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@bflad bflad added this to the v1.12.0 milestone Mar 12, 2018
@ghost ghost added the size/XL Managed by automation to categorize the size of a PR. label Mar 12, 2018
@jammerful
Copy link
Contributor Author

@paddycarver Thank you for all the help and comments. I updated the documentation, and replied to your comments. Should be ready to merge. Let me know if you want me to squash the commits.

@paddycarver
Copy link
Contributor

Looks great! Thanks for contributing to Terraform's AWS provider!

@paddycarver paddycarver merged commit 525fd78 into hashicorp:master Mar 12, 2018
@copumpkin
Copy link

Sweet, thanks! I also have an idea for how to get rid of the O(m * n) behavior that I'll propose in a separate issue and tag you on, if you don't mind.

@paddycarver
Copy link
Contributor

Feel free to tag me, and I'll do my best to get to it, but to set expectations appropriately: I am responsible for other providers, so it may take me a bit to get to it; this week is already looking unlikely. There's no harm in tagging me on it, but your best bet for getting eyes on it is to open it up to the maximum number of maintainers. :)

@bflad
Copy link
Contributor

bflad commented Mar 23, 2018

This has been released in version 1.12.0 of the AWS provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

@ghost
Copy link

ghost commented Apr 7, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked and limited conversation to collaborators Apr 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
new-resource Introduces a new resource. service/kms Issues and PRs that pertain to the kms service. size/XL Managed by automation to categorize the size of a PR.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support management of AWS KMS grants
6 participants