-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New resource: aws_network_interface_sg_attachment #860
New resource: aws_network_interface_sg_attachment #860
Conversation
This is a transfer of work from hashicorp/terraform#15167. This adds the aws_security_group_attachment resource, allowing one to attach security groups to ENIs outside of an aws_instance or aws_network_interface resource. Use cases for this would include more granular management of security groups, or attachment of security groups to instances that are managed out-of-band from Terraform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for supporting such resource.
Besides the comments I left there:
Is there a reason to (or not to) support management of multiple SGs, i.e. security_group_ids
instead of security_group_id
? I'm not suggesting either way - genuine question - as I'm curious if you thought about that. 😃
Do I understand correctly this resource won't work in EC2 Classic?
I'd think about alternative name that can fit better in the context of other resources which accept SGs too (RDS instance, Lambda, ELB, ALB, ElastiCache, EFS MT, RedShift, LaunchConfiguration).
How about aws_network_interface_sg_attachment
? I know we tend to stick to full names, but aws_network_interface_security_group_attachment
seems way too long and the ENI is more important (than SG) in the overall context IMO.
and the [`aws_network_interface`][2] resources. Using this resource in | ||
conjunction with security groups provided in-line in these resources will cause | ||
conflicts, and will lead to spurious diffs and undefined behavior - please use | ||
one or the other. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
func delSGFromENI(conn *ec2.EC2, sgID string, iface *ec2.NetworkInterface) error { | ||
old := iface.Groups | ||
var new []*string | ||
for _, v := range iface.Groups { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 Isn't it dangerous to delete groups which we do not manage in this context by default? I don't mind having force_delete
bool field which enables such behaviour but I think this is kind of breaking a promise to never touch infrastructure it didn't create or doesn't manage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, let me see if there's a safer way to do this. The main issue is that I don't think ModifyNetworkInterfaceAttribute
allows you to individually add/remove security groups, but that's something I need to double check. So to effectively remove the security group we need to loop over the groups, skip the group we are removing, and build a set off the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, that's a bit annoying (the API limitation) 😢 Let us know if you find a better way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't find a better way I think we'll need to use mutex to avoid modifying the same ENI from different resources in parallel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack - good catch 😨
Do we have a provider-level mutex for this kind of thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like mutexkv
is my answer. :) Will roll this in if there's no other way (I did check the SDK docs though and could not see anything more granular).
EDIT: Is the global awsMutexKV
safe to use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safety just depends on the name of the mutex you pick - if we assume ENI IDs are globally unique (across regions) then that should be sufficiently safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All done adding in the locks and yeah it does look like it was racing, but not anymore! 😀
|
||
func attachSecurityGroupToInterface(d *schema.ResourceData, meta interface{}) error { | ||
sgID := d.Get("security_group_id").(string) | ||
interfaceID := d.Get("network_interface_id").(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we rely on this field, shouldn't it be Required
in the schema then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated - this was like this because this resource supported both ENIs and actual instances at one point in time.
return baseConfig + fmt.Sprintf(optionalConfig, externalResPre, externalDataPre, externalAttrPre) | ||
} | ||
return baseConfig | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a while to understand the logic here (+ imagine what exactly is the sprintf going to do) and what's the reason for the condition.
It may be just me, but I personally think for a test case this logic is overly complicated - having two, nearly identical tests with similar configs would IMO work better. Duplication isn't always bad. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And after all the two use cases are sufficiently different (data source vs resource) that we should test them separately anyway, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-coupled the configuration generation functions for the most part... I still parameterize on enabling the attachment so that we can test removal on the second step, but things should hopefully be a lot more clear now. Let me know if it isn't though. 🙂
Hey @radeksimko! I'll carry out the updates/changes when I have a chance here - just wanted to address the mainline comment about singular vs multiple security groups: I think my opinion on that is that I prefer the singular form better, on part of that in light of things like The only thing I could think of when writing this that could be a problem would be graph sprawl, but that's kind of what automation is for - and also the one-to-one mapping might be easier to read for that matter. The biggest win is I think much less complicated code - the code only has to support the simplest use case, which ultimately should mean safer code in the long run. The requested changes make sense - just need to check if there's any way we can make the SG modification step safer - but otherwise everything else seems straightforward. Will update soon! |
That makes total sense 👌 Looks like you thought that through, good. 👍 😃 |
Renamed as pre review comments in hashicorp#860. This should help differentiate it between the other kinds of security groups available in the AWS provider.
This attribute was set to optional back when this resource allowed either an instance or network interface specified. Now that this is no longer the case, there's no reason to keep it this way.
Make the test configs a bit easier to understand. Each case (via resource or data source) now has its own config, but we still parameterize on enabling/disabling the security group resource for the removal check.
The resource was actually racing when there was multiple attachments trying to work with the same network interface. This is fixed now with locks added in create and delete. The added test checks the race in a couple of steps, switching up the resource names on the second run for the security groups and security group attachments to get a good mix of creation and deletion events to really test the effectiveness of the serialization. Also a small cosmetic re-refactoring of test names and configuration generation functions.
Alright, all updated! Definitely good we caught the locking too as it was for sure racing. I put in a good test for that I think that should generate enough concurrency to ensure that the locking is working. Anything else that needs fixing let me know 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now looking functionally good 👍
Besides my comments in code I think it would be good if the interfaces of functions referenced from inside the CRUD weren't so greedy 😄
e.g. attachSecurityGroupToInterface
surely doesn't need the whole schema and certainly not all metadata.
Let me know what you think.
resource.TestStep{ | ||
Config: testAccAwsNetworkInterfaceSGAttachmentRaceCheckConfig("step2"), | ||
Check: checkSecurityGroupAttachmentRace("step2"), | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we're actually testing the scenario of multiple attachments at the same time here. 🤔
I think we just create & destroy and create & destroy another set afterwards.
Wouldn't it be easier to just have a single config with 4 aws_network_interface_sg_attachment
referencing the same network_interface_id
? Then Terraform will create all 4 resources at the same time by default without us having to explicitly do something in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Step 2 should basically destroy all aws_network_interface_sg_attachment_step1.*
instances and create aws_network_interface_sg_attachment_step2.*
. This should have the effect of sufficient chaos as 4 detachment and 4 attachment operations are going on at the same time in step 2 (4 resources being destroyed and 4 being created).
I think what might need work is aws_security_group.sg_step1|step2
- I don't know if maintaining these separately is getting the desired above effect and maybe a central group of SGs should be created instead.
} | ||
} | ||
|
||
func checkSecurityGroupAttachment(checkPrimaryInterfaceAttr bool, expected bool) resource.TestCheckFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a nitpick, but where possible string arguments are usually better in readability than boolean ones as it's more obvious how do they affect the behaviour without having to read the body of the function (related to checkPrimaryInterfaceAttr
).
func resourceAwsNetworkInterfaceSGAttachmentCreate(d *schema.ResourceData, meta interface{}) error { | ||
mk := "network_interface_sg_attachment_" + d.Get("network_interface_id").(string) | ||
awsMutexKV.Lock(mk) | ||
defer awsMutexKV.Unlock(mk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great if the mutexes were inside attachSecurityGroupToInterface
& detachSecurityGroupFromInterface
so we could encapsulate the logic of detaching/attaching SG and avoid forgetting mutexes when these functions get called elsewhere in the codebase.
That said I understand you probably put them here to keep it locked until we read the data back in resourceAwsNetworkInterfaceSGAttachmentRead
below, which is reasonable due to eventual consistency issues, so I guess you can ignore me. 🤷♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, these are on the outer CRUD functions because I really don't want any dependent read operations getting messed up by concurrency (there are possibly ones in the create/delete operations too).
With that said, I think the overly functional nature of this code in its current form (again a relic of when I had this supporting instances too), so I think on the next refactor run I will probably just flatten most of this functionality to where it makes sense (at the very least, on attachSecurityGroupToInterface
and detachSecurityGroupFromInterface
).
NetworkInterfaceIds: aws.StringSlice([]string{interfaceID}), | ||
} | ||
|
||
dniResp, err := conn.DescribeNetworkInterfaces(dniParams) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we can't use fetchNetworkInterface
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't and I don't know why I missed this - will update 👍
Simplified the interface check function so that the test case directly takes the attribute that we are deriving the interface ID from, rather than taking a bool. This actually uncovered the fact my attribute logic was reversed (the bool logic was giving primary_network_interface_id for a false value passed to checkPrimaryInterfaceAttr instead of a true value, and this error was propagated to the test cases). So this is fixed now as well.
Still working on the re-factor here, will knock out the rest tomorrow morning 👍 |
Race check needed simplifying as well, in addition to being reduced from two steps to one. Reason for the latter is once security groups were modified so that the were operating off the same set of groups in both steps (so step1 -> step2), it was discovered that there was no way we could reasonably expect the deletion/creation order would never favour a situation where the new SG attachment would be ordered after the old SG attachment was removed (as both step1 destroys and step2 creations would be happening at the same level in the graph and without any dependencies).
Removed a bunch of the single-use functions, which has moved most logic to the main CRUD functions. The old functions served a purpose when this resource was designed to support both instance IDs and network interface IDs, but just adds more cruft now that only network interface IDs are supported. Also moved all the messages to the DEBUG level as TF_LOG=info does not mean anything at a provider level, currently.
Alright, last little bit added - should address all of the new review items Let me know if there's more 😉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just left you some nitpicky comments there - really low importance.
This now looks good to me - merge away! 🎉 👍
func resourceAwsNetworkInterfaceSGAttachmentDelete(d *schema.ResourceData, meta interface{}) error { | ||
// Get a lock to prevent races on other SG attachments/detatchments on this | ||
// interface ID. This lock is released when the function exits, regardless of | ||
// success or failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when the lock is released is obvious from the code below, so the second sentence seems a bit redundant.
} | ||
|
||
func resourceAwsNetworkInterfaceSGAttachmentCreate(d *schema.ResourceData, meta interface{}) error { | ||
// Get a lock to prevent races on other SG attachments/detatchments on this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detatchments
-> detachments
func resourceAwsNetworkInterfaceSGAttachmentCreate(d *schema.ResourceData, meta interface{}) error { | ||
// Get a lock to prevent races on other SG attachments/detatchments on this | ||
// interface ID. This lock is released when the function exits, regardless of | ||
// success or failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when the lock is released is obvious from the code below, so the second sentence seems a bit redundant. Why (the 1st sentence) is more important then what.
} | ||
|
||
func resourceAwsNetworkInterfaceSGAttachmentDelete(d *schema.ResourceData, meta interface{}) error { | ||
// Get a lock to prevent races on other SG attachments/detatchments on this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
detatchments
-> detachments
I forgot you don't have merge perms 🤦♂️ merging now! |
Haha, no prob... and I was about to remove those comments too 😜 I'll put in another PR! |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks! |
This is a transfer of work from hashicorp/terraform#15167.
This adds the aws_security_group_attachment resource, allowing one to
attach security groups to ENIs outside of an aws_instance or
aws_network_interface resource.
Use cases for this would include more granular management of security
groups, or attachment of security groups to instances that are managed
out-of-band from Terraform.