Add aws_security_group_rules resource #9032
@bflad Just curious if there's any sort of ETA for review? |
Hi @jakauppila, I came across this PR while looking for a solution to the issue where Terraform couldn't detect diffs on the |
Hi @igozali, in my own testing of inline rules of This PR adds an additional Are you defining your rules inline within the |
Thanks for responding! After reading your description, I think your PR solves a different issue. My issue is the following: if I have a security group as follows (to answer your question: yes, the rules are defined inline):
provider "aws" {
version = "~> 2.19"
region = "us-west-1"
}
terraform {
required_version = "~> 0.12.0"
}
resource aws_security_group "test" {
name = "please_delete_me"
vpc_id = "vpc-xxxxxxxx"
ingress {
from_port = 2222
to_port = 2222
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 2222
to_port = 2222
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
If I then manually add an ingress rule in the AWS console, Terraform seems to detect the drift properly. However, if I remove the last ingress rule in Terraform:
provider "aws" {
version = "~> 2.19"
region = "us-west-1"
}
terraform {
required_version = "~> 0.12.0"
}
resource aws_security_group "test" {
name = "please_delete_me"
vpc_id = "vpc-xxxxxxxx"
egress {
from_port = 2222
to_port = 2222
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
Terraform won't be able to detect changes on the ingress rules on the Sorry for the noise due to an unrelated issue! |
@bflad - I am also eagerly awaiting this new feature for the AWS provider plugin, do you think this will get included with the 2.23.0 release? |
What remains to get this merged in? The PR has been sitting for 2 months. |
Progress update please? Desperately in need of this. |
Hi everyone, thanks for your patience on this. As you probably noticed, we have a large backlog of PRs to look at right now. In addition, most of the team will be at HashiConf this week. Given the nature of this resource, I think we'll need to discuss it internally before continuing -- it may overlap with other product plans. We really appreciate the work you've put in, and I hope we can give you a more detailed answer soon. |
Should |
@aeschright Just curious if there's any timeline that could be provided to set expectations? |
@aeschright Any chance we could get an update on this? |
Progress update please? |
Waiting eagerly... |
@aeschright @bflad Maybe we can have a progress update? This is the second most 👍 PR and is opened for almost one year :( |
@bflad During the AWS Provider Office Hours, you mentioned that more thought needs to go into the use-cases before this PR could possibly be merged. Could you express your concerns concretely here so we can address them? |
Hi @jakauppila 👋 Thank you for bringing this up during our office hours and we apologize that this continues to be a frustrating experience with the Terraform AWS Provider.
To be fully transparent upfront: A lot of the silence on this matter stems from the fact that we (the HashiCorp maintainers of the project) do not have what we feel is an acceptable path forward for this particular issue that does not potentially increase confusion or potentially burden operators with existing configurations. Most importantly, this has nothing to do with the quality or proposal of this particular contribution. We should certainly be more upfront about that and I apologize since that may come across as not appreciating contributions or potentially raise other negative feelings. That is certainly not something we want to foster. We want to open a larger dialogue.
Going forward with this particular issue, it would be great if we can come together as a community to discuss these points, but likely in a different forum than comments on a proposed change request since this warrants a higher bandwidth discussion and likely a lengthy write up about the Terraform design decisions impacting this area.
Below I will briefly try to summarize the problem and the recommended guidance on the problem, outline some of the historical and internal Terraform context to set the stage, present some of the conflicting and confusing design decisions with any proposed change to this area, and finally offer some potential paths forward on the matter.
To begin, here is a summary of this issue in a Terraform configuration from my understanding. Please let me know if this is incorrect. While the below only shows
resource "aws_security_group" "a" {
name = "a"
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
security_groups = [aws_security_group.b.id]
}
}
resource "aws_security_group" "b" {
name = "b"
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
security_groups = [aws_security_group.a.id]
}
}
Effectively, the desire is to allow each of the EC2 Security Groups to cross-communicate. However, when this configuration is applied, Terraform will return a cycle error. The current recommended guidance on this situation is to switch from using inline rules to separate aws_security_group_rule resources:
resource "aws_security_group" "a" {
name = "a"
}
resource "aws_security_group_rule" "a_from_b" {
security_group_id = aws_security_group.a.id
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
source_security_group_id = aws_security_group.b.id
}
resource "aws_security_group" "b" {
name = "b"
}
resource "aws_security_group_rule" "b_from_a" {
security_group_id = aws_security_group.b.id
type = "ingress"
from_port = 22
to_port = 22
protocol = "tcp"
source_security_group_id = aws_security_group.a.id
}
We will discuss why this configuration change may not be desirable later on.
Terraform's core logic is based on a directed acyclic graph. All operations that are to be implemented in Terraform must abide by being composed into well-defined nodes and edges. Each Terraform resource is a self-contained node and is generally wholly separate from other resources (nodes) except for connecting edges created by configuration references. (Aside: There is a special exception of provisioners, which have completely separate rules and handling with their own class of bugs and potentially confusing behaviors in comparison to resources). That graph determines operation ordering and concurrency, built from practitioner configurations. For more information about this particular design, see also the Terraform Internals section of the documentation. This implementation detail does not typically matter to most practitioners or contributions, however it is relevant to this discussion so it feels worth mentioning for anyone without that context.
When Terraform was initially being designed a few years ago, the schema that defines a Terraform resource was written around two conceptual implementation details. The first being standard read-write/read-only attributes and the second being "sub" resources. This can be seen with the Go types used to represent the resource schema. Example of this implementation in the aws_security_group resource:
// ... other schema omitted for brevity ...
"ingress": {
Type: schema.TypeSet,
Optional: true,
Computed: true,
ConfigMode: schema.SchemaConfigModeAttr, // Please note: this in and of itself is a special implementation detail for this attribute, but is unrelated to this discussion
Elem: &schema.Resource{
Schema: map[string]*schema.Schema{
"from_port": {
Type: schema.TypeInt,
Required: true,
},
// ... other schema omitted for brevity ...
With the equivalent Terraform configuration:
resource "aws_security_group" "example" {
# ... other configuration omitted for brevity ...
ingress {
from_port = 22
# ... other configuration omitted for brevity ...
}
}
Back then, this concept of "sub" resources was presented as a potential future enhancement to Terraform to allow these inner pieces of schema to be handled separately in configurations (e.g. separate referencing) and the directed acyclic graph logic (e.g. ordering / cycle detection). Implementing these "sub" resources as separate graph nodes and edges, which would help prevent cycle errors like the one in this situation, proved to be an exhaustive challenge, however, and has never come to fruition. Even through today (Terraform 0.12, 0.13, and the foreseeable pre-1.0 future), this internal feature of Terraform has become less and less likely to be implemented. While there have been glimmers of potential hope with updates to some of the major underlying logic such as
Another important Terraform concept within this discussion is Terraform's drift detection abilities. Given a previous state of a resource and its underlying schema attributes, Terraform and the Terraform Plugin SDK build a difference between that state and a desired configuration. A nice writeup of this concept and its internal implementation details within Terraform can be found in the Resource Instance Change Lifecycle document within the Terraform CLI repository. Terraform resources cannot support a partial configuration of an individual attribute as this is core to how Terraform represents a difference between the configuration and state. The exceptions to enabling drift detection are encoded by provider developers adding
From a practitioner perspective, this drift detection means that any Terraform configuration will report a difference via
Given the above constraints and when an API has a parent-child relationship between components, Terraform resources must be designed in one of two manners.
The first is all child components being exclusively managed by a parent resource; the second is all child components being individually managed by a child resource, without the ability to detect when extraneous child components exist. It is possible to allow both of these resources to exist for different use cases (and we have plenty of cases of this in the Terraform AWS Provider), but they cannot coexist in the configuration of a single parent, otherwise a perpetual difference will occur between the two. This behavior cannot be avoided by provider developers coding the resources differently or by practitioner configurations without an
Separate child resources allow for greater composition (separate configurations can worry only about the components they need) while solely having a parent resource with exclusive management provides greater drift detection (forcing configurations to worry about all components). Across the Terraform ecosystem, there is no definitive methodology applied to this situation as it's ambiguous whether one implementation may be better than the other. The Wholly separate, the There are two unique differences for the
Given that background, we can hopefully lay out some of the design decisions we need to consider:
# This example would introduce perpetual differences
# without Terraform providing any user interface warnings.
# Practitioners would be required to do one of the following to learn it's not supported:
# * (Re-)Read resource documentation
# * Ask colleagues or in a forum
# * Report a GitHub issue
resource "aws_security_group" "a" {
name = "a"
}
resource "aws_security_group_rules" "a-ingress-ssh" {
security_group_id = aws_security_group.a.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["10.0.0.0/8"]
}
# ... potentially others ...
}
# Potentially in another Terraform configuration, managed by some other team
resource "aws_security_group_rules" "a-ingress-https" {
security_group_id = aws_security_group.a.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# ... potentially others ...
}
# This example would introduce perpetual differences
# without Terraform providing any user interface warnings.
# The ingress/egress attributes do not have Computed: true
resource "aws_security_group" "a" {
name = "a"
}
resource "aws_security_group_rules" "a-ingress" {
security_group_id = aws_security_group.a.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["10.0.0.0/8"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# ... potentially others ...
}
# Potentially in another Terraform configuration, managed by some other team
# aws_security_group_rules.a-ingress will remove egress
# aws_security_group_rules.a-egress will try to re-add
resource "aws_security_group_rules" "a-egress" {
security_group_id = aws_security_group.a.id
egress {
from_port = 0
to_port = 65535
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# ... potentially others ...
}
# This example would introduce perpetual differences
# without Terraform providing any user interface warnings.
# The ingress/egress attributes do not have Computed: true
resource "aws_security_group" "a" {
name = "a"
}
resource "aws_security_group_rules" "a-ingress" {
security_group_id = aws_security_group.a.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["10.0.0.0/8"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# ... potentially others ...
}
# Potentially in another Terraform configuration, managed by some other team
# aws_security_group_rules.a-ingress will always try to remove this rule, while this tries to add it
resource "aws_security_group_rule" "a-egress" {
security_group_id = aws_security_group.a.id
type = "egress"
from_port = 0
to_port = 65535
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# This example would introduce perpetual differences
# without Terraform providing any user interface warnings.
# The ingress/egress attributes do not have Computed: true
# aws_security_group_rules.a-ingress will always try to remove this rule, while this tries to add it
resource "aws_security_group" "a" {
name = "a"
egress {
from_port = 0
to_port = 65535
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
resource "aws_security_group_rules" "a-ingress" {
security_group_id = aws_security_group.a.id
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["10.0.0.0/8"]
}
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# ... potentially others ...
}
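The perpetual differences in the examples above can be sketched with a small simulation. This is only an illustration under stated assumptions, not how Terraform or the provider is implemented: `ruleSet`, `plan`, and `apply` are hypothetical names, and each "manager" is assumed to enforce its own rules exclusively on the same security group, as the proposed plural resource would.

```go
package main

import (
	"fmt"
	"sort"
)

// ruleSet is a set of rule descriptions attached to one security group.
// (Hypothetical type for illustration only.)
type ruleSet map[string]bool

// plan reports what an exclusive manager would change so that the rules
// on the security group (actual) match its own configuration (desired).
func plan(desired, actual ruleSet) (add, remove []string) {
	for r := range desired {
		if !actual[r] {
			add = append(add, r)
		}
	}
	for r := range actual {
		if !desired[r] {
			remove = append(remove, r)
		}
	}
	sort.Strings(add)
	sort.Strings(remove)
	return add, remove
}

// apply enforces desired exclusively: the result holds exactly those rules.
func apply(desired ruleSet) ruleSet {
	actual := make(ruleSet, len(desired))
	for r := range desired {
		actual[r] = true
	}
	return actual
}

func main() {
	ssh := ruleSet{"ingress tcp 22 10.0.0.0/8": true}
	https := ruleSet{"ingress tcp 443 0.0.0.0/0": true}
	managers := []struct {
		name    string
		desired ruleSet
	}{
		{"a-ingress-ssh", ssh},
		{"a-ingress-https", https},
	}

	// Both managers target the same security group, so they share "actual".
	actual := ruleSet{}
	for round := 1; round <= 3; round++ {
		for _, m := range managers {
			add, remove := plan(m.desired, actual)
			fmt.Printf("round %d %s: add=%v remove=%v\n", round, m.name, add, remove)
			actual = apply(m.desired)
		}
	}
}
```

In this sketch, after the first apply every later plan from either manager is non-empty: each one removes the other's rule and re-adds its own, so the two configurations never converge.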
All these put us in a rough position with the current proposal, since there is additional burden somewhere. We would prefer to not have a single resource that operates differently than the majority of other resources. While the above configurations may seem obvious when the resources are declared next to each other, varying team structures lead to varying configuration layouts and ownership. For example: an application team, separate from an operations team which handles core networking management, may reach first for the plural rules resource since they wish to add multiple rules to a security group. They don't want to manage the security group itself (or at least if they tried, it'd error and require a manual import step, which is our warning mechanism in that scenario), so they avoid the
Ideally this issue would be fixed upstream to either support these sub resources so we do not need a separate resource or we would be provided a mechanism for providers to provide warnings/errors if there are conflicting resources managing the same infrastructure to reduce confusion and reliance on purely documentation warnings. It is worth noting though that any of these warning enhancements would not work across separate Terraform states (usually split by team function), which is often the cause of these sorts of problems, so they are really only helpful in the more "obvious" same configuration scenario.
All of the above gives us pause to change the current situation, given that Terraform is still able to correctly provision this infrastructure, just without drift detection capabilities. To reiterate, this has nothing to do with this particular code submission, but rather an overall Terraform design problem.
Since this design problem only exists in very few real world cases (in the amount of affected infrastructure sense), the broader Terraform teams with their limited resources have opted to spend time working on larger initiatives in the Terraform ecosystem such as configuration language improvements that positively impact more of the community. We would encourage a few actions to move forward here, if this issue is drastically affecting your environment:
Please reach out with any questions and thank you for your time and consideration. |
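To illustrate the cycle described in the comment above, here is a minimal sketch of cycle detection on a dependency graph. It assumes a simplified model in which each resource is a node and each configuration reference is an edge; `graph` and `hasCycle` are hypothetical names and this is not Terraform's actual graph code. The two graphs mirror the inline configuration (security groups a and b referencing each other) and the recommended configuration with separate aws_security_group_rule resources.

```go
package main

import "fmt"

// graph maps a resource address to the addresses it references.
// (Hypothetical type for illustration only.)
type graph map[string][]string

// hasCycle reports whether the dependency graph contains a cycle,
// using a depth-first search that tracks nodes on the current path.
func hasCycle(g graph) bool {
	const (
		inProgress = iota + 1 // node is on the current DFS path
		done                  // node and its dependencies are fully explored
	)
	state := make(map[string]int)

	var visit func(n string) bool
	visit = func(n string) bool {
		switch state[n] {
		case inProgress:
			return true // reached a node already on the path: cycle
		case done:
			return false
		}
		state[n] = inProgress
		for _, dep := range g[n] {
			if visit(dep) {
				return true
			}
		}
		state[n] = done
		return false
	}

	for n := range g {
		if visit(n) {
			return true
		}
	}
	return false
}

func main() {
	// Inline rules: each security group references the other directly.
	inline := graph{
		"aws_security_group.a": {"aws_security_group.b"},
		"aws_security_group.b": {"aws_security_group.a"},
	}
	// Separate rule resources: the groups reference nothing, and each
	// rule references both groups.
	separate := graph{
		"aws_security_group.a":             {},
		"aws_security_group.b":             {},
		"aws_security_group_rule.a_from_b": {"aws_security_group.a", "aws_security_group.b"},
		"aws_security_group_rule.b_from_a": {"aws_security_group.b", "aws_security_group.a"},
	}
	fmt.Println("inline rules have a cycle:", hasCycle(inline))
	fmt.Println("separate rules have a cycle:", hasCycle(separate))
}
```

Moving the rules into their own nodes removes the edges between the two security groups, which is why the separate-resource layout can be ordered while the inline layout cannot.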
@bflad I appreciate the breadth and depth of your response; it really helps illustrate the problem and issues that could arise if merged as-is. Thank you. I have created a post over on the forums to continue the discussion further: https://discuss.hashicorp.com/t/discussion-of-aws-security-group-rules-for-absolute-management-while-avoiding-cyclical-dependencies/9647
As included at the bottom of that post, I boiled the ask down to 2 user stories which most succinctly capture what we're trying to accomplish.
User Story #1
As a
User Story #2
As a
@bflad If you wouldn't mind, feel free to create the issue to track this, I don't think there was one that clearly captured it. |
Another limitation of using |
@tmccombs you can use |
@bflad it's been a while since I had to deal with this, but I think that only works if the changes result in a rule that doesn't conflict with the original. So, for example, if you change the port range, or add or remove a cidr block from |
Progress update please, can this be merged? |
I think it looks like this was the intention, but #9032 (comment) seems to indicate that discussion is being moved to https://discuss.hashicorp.com/t/discussion-of-aws-security-group-rules-for-absolute-management-while-avoiding-cyclical-dependencies/9647 |
Pull request #21306 has significantly refactored the AWS Provider codebase. As a result, most PRs opened prior to the refactor now have merge conflicts that must be resolved before proceeding. Specifically, PR #21306 relocated the code for all AWS resources and data sources from a single We recognize that many pull requests have been open for some time without yet being addressed by our maintainers. Therefore, we want to make it clear that resolving these conflicts in no way affects the prioritization of a particular pull request. Once a pull request has been prioritized for review, the necessary changes will be made by a maintainer -- either directly or in collaboration with the pull request author. For a more complete description of this refactor, including examples of how old filepaths and function names correspond to their new counterparts: please refer to issue #20000. For a quick guide on how to amend your pull request to resolve the merge conflicts resulting from this refactor and bring it in line with our new code patterns: please refer to our Service Package Refactor Pull Request Guide. |
Sorry if my question was naive, but could it solve the compatibility concerns if there was a new boolean argument |
Keeping fingers crossed for this PR! |
@m00lecule Given this PR is three years old, I doubt it is going to be merged. |
Yup @GoodMirek, it has been slept on for too long. This resource might enhance EC2 network security by providing drift detection without the circular dependency issue. I am also willing to help resolve the PR conflicts. |
I'm happy to rebase this PR if/when the team commits to merging; but their reluctance is detailed in #9032 (comment). As far as I am aware, this stance has not changed. |
Problem to Be Solved
Problem Now
Going forward
The problem I see here is that providing what is requested is not technically possible while letting Terraform itself manage the diff/drift of the rule set. Doing so would lead to cyclic errors and/or perpetual diffs. Within the resource / provider we would need to manage changes to provide
Thus, I suggest that we close this issue by Aug 16, 2022 unless:
|
For me, the problem to be solved is a valid and common use case.
|
@YakDriver first of all, I think your description is backwards. Secondly, from what I understand, it is technically possible, with the proposed new |
@YakDriver I would re-iterate the user stories I had posted in #9032 (comment) as to why this change is still needed.
The justification of not merging this change from @bflad in #9032 (comment) seems to focus on the user experience, and that introducing another method to manage security group rules will cause confusion. Might I suggest that the |
Hello everyone - We want to thank you all for continued patience regarding this feature request. At this point the considerations outlined in our previous response have not changed, and we are choosing to close this pull request. We want to reiterate that this decision is in no way a reflection on the quality of the proposed resource itself, but rather to avoid placing an additional burden on practitioners using security group resources.
At times we have to make difficult judgment calls as maintainers, and always attempt to weigh impact to the broader community with support for specific cases. The fact that Terraform is still able to correctly provision this infrastructure (albeit without drift detection capabilities) heavily factored into the decision to avoid impacting all users for this specific case. Adding a third method for managing security group rules without an obvious deprecation path for the existing two patterns also factored in.
With all of this said, we recognize the importance of detecting drift on such a critical resource, and acknowledge we are leaving a gap for this particular case with this trade-off. As AWS now supports resource identifiers and tags for security group rules, we have begun exploring how this could improve the provider implementation. Effort thus far has focused on the unregistered (i.e., not externally visible or documented) aws_vpc_security_group_[ingress|egress]_rule resources. These are in the very early design stages, and we have no deprecation plans or changes to recommended patterns at this moment. While we can make no guarantees, we will continue to keep this use case (circular rule dependencies with drift detection) in consideration as we weigh potential changes to security group rule patterns and to these resources under active development.
We understand that this will be a frustrating outcome for those who have invested time and effort into developing this feature or advocating for its inclusion.
Thank you again for your effort, participation and patience. |
@jar-b The addition of the |
Hi @jakauppila, thanks for your response. I should have provided some additional context around the While the existing I also want to reiterate that these resources are not yet enabled, and that appropriate documentation and deprecation plans (if applicable) would be part of that effort. Apologies for any confusion references to these new resources may have caused in my previous comment. By including them I meant to convey that we are continuing to consider how to improve the workflow around security groups, and not abandoning improvements to this resource group as closure of this pull request might imply. |
Hi @jar-b, appreciate the additional context. So the creation of the new Adding additional plural |
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Takes the changes in #1824 and applies them on top of master with a couple required tweaks.
I do not have an account with EC2 Classic to test with, thus the single failed test.
Community Note
Release note for CHANGELOG:
Output from acceptance testing: