-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
provider/aws: Update Security Group Rules to Version 2 #3019
Conversation
I think this is the very same issue we've been running into recently. I'll try your branch and see if it resolves our issues. Update: It seems to have solved the issue when testing on a fresh stack. It did not help recovering from a situation where the state file had already been "corrupted", though (which it might not have intended to do anyway). |
This patch solves my issue as well. Waiting to see it in the released version. |
continue | ||
} | ||
|
||
remaining = len(p.UserIdGroupPairs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to add a comment here, mentioning that it's for cases of EC2-Classic/default-VPC.
If you're already changing the ID/hash, I'm thinking it would make sense to make one more tiny change to make things a little bit more obvious, when looking into the state: return fmt.Sprintf("sg-%d", hashcode.String(buf.String())) return fmt.Sprintf("sg-rule-%d", hashcode.String(buf.String())) one would almost think that it's an actual security group ID, while it's a hash. |
@@ -256,6 +256,92 @@ func TestAccAWSSecurityGroupRule_SelfReference(t *testing.T) { | |||
}) | |||
} | |||
|
|||
// testing partial match implementation | |||
func TestAccAWSSecurityGroupRule_PartialMatching_Basic(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the convention (at least based on quick grep
) is _basic
(small caps).
This definitely seems to fix the problem we are having with rules not being written to state. To fix rules in an existing infrastructure we had to delete/clean up the rules on the amazon side, and then re-apply with this terraform branch. I'd like to see this merged as soon as possible! |
The "cleanest" solution is to taint these security groups. Also I was trying to find out how to make the transition smooth, but I don't think there's any easy way, since there was no state previously -> we can't find out which rule comes from where and we cannot guess it either. |
720562e
to
c4034e4
Compare
Computed: true, | ||
Type: schema.TypeString, | ||
Optional: true, | ||
Computed: true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this whitespace diff separately? Sorry to nit - I'm a big git blame
spelunker 😀
LGTM! |
317c3e8
to
9f3a17e
Compare
provider/aws: Update Security Group Rules to Version 2
At long last, we have merged this PR. I sincerely apologize to everyone who suffered through this and waited on the merge. Please give it a spin and let us know if there's anything else hanging here. Thanks! |
@catsby we're still seeing the same issues within the SGs where rules order is changing, using latest released version. We tried to taint as advised by @radeksimko but unfortunately while doing an apply we can see it live within AWS console, the rules order is changing non stop. Initially: Couple seconds later: It repeats multiple times. Diff in state file:
During that shuffling time across all our SGs, we also lose traffic due to SG changes (some rules I'm assuming are being removed and re-added couple secs later). |
@scalp42 Just to double check - have you tried tainting security group rules or security groups themselves? You should be tainting security groups. |
@radeksimko I tried to taint the SGs (not the rules). Is this PR supposed to help with #2843 ? |
Is this actually fixed? We run |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
This patch aims to resolve a lingering issue with Security Group Rules,
where different resources generate rules that are consolidated on Amazon's
side, as a result resulting in loss of local resources in the state file
and duplication errors on AWS's side.
Example:
After applying this rule, our security group
nat
looks like this:This rule, with 2
cidr_block
entires, is a single IPPermission rule according to the API.At present, the Security Group Rule resource hashes it's values (ports, cidrs, security groups)
to generate a key. This breaks down however, when another rule is added with a new
cidr_block
.Now, adding a second rule that also uses a
cidr_block
like so, will reveal the bug:The IPPermissions now look like this:
The API has consolidated this new
cidr_block
into an existing rule, based on the port andprotocol it applies to. Because of the way Terraform hashes the rules locally, it can no longer find
either resource, as it hashes the entire IPPermission looking for a match.
This pull request resolves this by iterating through
IpPermissions
and looking for matching criteria,ensuring that each
cidr_block
orsecurity_group_id
is found in a set.This patch also fixes another issue, where similar rules applied to different security groups caused the
same kind of conflict, because we were not hashing the security group id itself into the id of the rule.
We now consider the id in the hash for the rule to avoid this.
Refs: