Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lifecycle configuration to disable field validation #105

Open
fahrradflucht opened this issue May 11, 2018 · 8 comments
Open

lifecycle configuration to disable field validation #105

fahrradflucht opened this issue May 11, 2018 · 8 comments
Labels
enhancement New feature or request upstream-terraform

Comments

@fahrradflucht
Copy link

Feature Description

Terraform should provide a way for the user to disable the ValidationFunc for a specific field via a lifecycle configuration.

Example (from @bflad)

resource "XXX" "example" {
  # ... potentially other configuration ...
  example_attribute = "incorrectly-validated-value"

  lifecycle {
    ignore_validation = ["example_attribute"]
  }
}

Reasoning

While discussing an MR about upgrading the validation on S3 bucket names the need of a feature like this came up due to the fact that there could be old resources that would fail the validation even though no new ones not passing the ValidateFunc can be created. The proposed feature would provide a way for providers to adapt to new requirements and still provide a way for users with grandfathered resources to upgrade.

As @bflad mentioned in the linked MR this would have the additional use case of being able to disable validations in cases where a service changed in a way that a ValidateFunc is unnecessary strict and the provider didn't update yet.

Possible other solutions:

Initially I proposed to solve the problem of grandfathered resources with having a way for providers to declare ValidateFunc's that only get executed against resources that will be created and not against existing once, but I think this will be unevenly more complex and of less general use then this proposal.

References

@apparentlymart
Copy link
Contributor

apparentlymart commented May 11, 2018

Hi @fahrradflucht! Thanks for sharing this problem and potential solution.

I think I see the use-case here: the validation rules in the remote API have changed over time, possibly with a transitional period where old objects that violate the rules can still exist, and so a provider developer would like to implement the new validation rule but this would cause configurations that manage older objects to erroneously fail. (Please correct me if I didn't quite get that right...)

I'm a little concerned about this specific solution to the problem because providers tend to rely on the automatic enforcement of the validation rules done by the SDK -- assuming that given values will always conform to them -- and so may behave badly (e.g. crash) if it were suddenly possible to bypass those rules.

Another interesting challenge here is that the validation functions actually happen inside the provider plugin itself, rather than in Terraform Core. This means that the part of the code actually doing the validation can't "see" the lifecycle settings in order to know that disabling the validation is desired.

With all of this said, I think this is a problem better solved within the provider itself, since it can then be prepared to accept potentially-invalid values and it has all the information needed to decide whether to enforce the rule or not.

I'm not totally sure of the best strategy here, but I think you could achieve the desired goal here using the CustomizeDiff mechanism on the resource in question. If you perform the validation check in e.g. a ValidateChange callback, rather than in a ValidateFunc, then you can compare the new and old values and bypass the check if they are both the same. This will then accepting any unchanged value but apply the validation to any changed value.

That would look something like this, inside the &schema.Resource{ ... } block:

    CustomizeDiff: customdiff.All(
        customdiff.ValidateChange("bucket", func (old, new, meta interface{}) error {
            if old == new {
                // Existing buckets are not subject to the new validation rules implemented below,
                // so we'll allow them to pass until they are renamed.
                return nil
            }
            // (then call something like your validateS3BucketName function here)
        }),
    ),

@fahrradflucht
Copy link
Author

@apparentlymart
Thanks for the fast feedback. I will have a look at using CustomizeDiff here.

But just to clarify

Another interesting challenge here is that the validation functions actually happen inside the provider plugin itself, rather than in Terraform Core. This means that the part of the code actually doing the validation can't "see" the lifecycle settings in order to know that disabling the validation is desired.

While the ValidateFunc is actually declared in the provider, isn't terraform core the one actually calling it and could therefore just skip doing this?

@apparentlymart
Copy link
Contributor

@fahrradflucht, the responsibility here actually belongs to a part of the system that isn't obvious at first glance. Perhaps a diagram will help show what's going on here...

provider_plugin_rpc_barrier 2

I originally made this diagram for a different reason, so it's not a perfect fit for what we're discussing here, but the darker blue boxes show the main subsystems at play. The interesting non-obvious detail here is that the "provider framework" is the package helper/schema in this core repository but it actually runs inside the provider plugin process rather than in Terraform Core. This is why each provider codebase has a copy of github.com/hashicorp/terraform/helper/schema in its vendor directory: it is linked in to the plugin binary, and all the code runs there.

Where "schema.Provider ApplyResourceChange" is indicated on this diagram is also where schema.Provider.ValidateResource and schema.Provider.Diff run, and those functions are responsible for running the ValidateFuncs and the CustomizeDiff function (respectively). These both involve running arbitrary code from the provider codebase and so this must run inside the Plugin Process rather than the Terraform Core Process. (we can only send data over the RPC barrier, not code.)

The contents of the lifecycle block never cross the RPC barrier into the Plugin Process, so it's not possible for either of these functions to access it. The lifecycle flags affect graph construction in Terraform Core -- deciding which actions are taken in which order -- but cannot affect the details of exactly what those actions do, implemented inside the provider itself.

The tricky part here is that while many of the schema.Provider methods map obviously onto a single function in the provider, the ValidateResource and Diff functions actually have most of their work done inside helper/schema, calling into the provider's own functions only for small details like the validation and diff customization functions. Nonetheless, this functionality is still implemented inside the provider from Terraform Core's perspective: Terraform Core doesn't "know" about the distinction between the provider framework and the provider's own codebase.

I hope that helps clarify my earlier comment! This is probably more detail than you wanted 😀 but I hope this background information is interesting/useful.

@bflad
Copy link
Contributor

bflad commented Mar 25, 2019

Just to further document some additional use cases or edge cases where the Terraform Provider SDK is not currently able to meet operator or developer needs:


An incorrect implementation of an attribute ValidateFunc either due to implementation error or API documentation error. In this situation, it prevents operators from using the resource entirely. Provider maintainers must either fix the resource implementation or remove the ValidateFunc entirely. In either case, operators are forced to wait until an upgraded provider version is available to use the resource (if they can upgrade).

Example Reference: hashicorp/terraform-provider-aws#8065


New values are introduced to the attribute which are wholly compatible with an existing resource implementation. This may be due to API updates that occur publicly over time. Operators are forced to wait until an upgraded provider version is available to use the new functionality.

Example Reference: hashicorp/terraform-provider-aws#7951


I cannot provide additional public details about this specific use case, but this may also be due to internal usage within a Terraform Provider's service teams for functionality that is only internal (they would not want to improperly expose internal values and we would not want to confuse operator by having these values in the validation). Internal teams are not able to use the Terraform resource in these situations.


I can provide additional references but these are not uncommon situations, at least in the Terraform AWS Provider.

bflad referenced this issue in hashicorp/terraform-provider-aws Mar 25, 2019
…Mixed Instances Policy allocation strategy

Reference: https://github.com/hashicorp/terraform/issues/18027

Output from acceptance testing:

```
--- PASS: TestAccAWSAutoScalingGroup_ALB_TargetGroups (113.98s)
--- PASS: TestAccAWSAutoScalingGroup_ALB_TargetGroups_ELBCapacity (345.50s)
--- PASS: TestAccAWSAutoScalingGroup_autoGeneratedName (52.22s)
--- PASS: TestAccAWSAutoScalingGroup_basic (367.73s)
--- PASS: TestAccAWSAutoScalingGroup_classicVpcZoneIdentifier (42.77s)
--- PASS: TestAccAWSAutoScalingGroup_emptyAvailabilityZones (52.13s)
--- PASS: TestAccAWSAutoScalingGroup_enablingMetrics (193.69s)
--- PASS: TestAccAWSAutoScalingGroup_initialLifecycleHook (234.93s)
--- PASS: TestAccAWSAutoScalingGroup_launchTemplate (47.23s)
--- PASS: TestAccAWSAutoScalingGroup_LaunchTemplate_IAMInstanceProfile (52.54s)
--- PASS: TestAccAWSAutoScalingGroup_launchTemplate_update (108.56s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy (55.62s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_InstancesDistribution_OnDemandAllocationStrategy (50.51s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_InstancesDistribution_OnDemandBaseCapacity (73.60s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_InstancesDistribution_OnDemandPercentageAboveBaseCapacity (80.82s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_InstancesDistribution_SpotAllocationStrategy (49.38s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_InstancesDistribution_SpotInstancePools (83.90s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_InstancesDistribution_SpotMaxPrice (148.61s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_LaunchTemplate_LaunchTemplateSpecification_LaunchTemplateName (74.23s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_LaunchTemplate_LaunchTemplateSpecification_Version (178.65s)
--- PASS: TestAccAWSAutoScalingGroup_MixedInstancesPolicy_LaunchTemplate_Override_InstanceType (79.11s)
--- PASS: TestAccAWSAutoScalingGroup_namePrefix (55.81s)
--- PASS: TestAccAWSAutoScalingGroup_serviceLinkedRoleARN (75.65s)
--- PASS: TestAccAWSAutoScalingGroup_suspendingProcesses (264.75s)
--- PASS: TestAccAWSAutoScalingGroup_tags (243.86s)
--- PASS: TestAccAWSAutoScalingGroup_terminationPolicies (115.67s)
--- PASS: TestAccAWSAutoScalingGroup_VpcUpdates (76.76s)
--- PASS: TestAccAWSAutoScalingGroup_WithLoadBalancer (447.50s)
--- PASS: TestAccAWSAutoScalingGroup_withMetrics (189.36s)
--- PASS: TestAccAWSAutoScalingGroup_withPlacementGroup (174.11s)
```
@hashibot hashibot transferred this issue from hashicorp/terraform Sep 26, 2019
@hashibot hashibot added the enhancement New feature or request label Oct 2, 2019
@bflad
Copy link
Contributor

bflad commented Nov 20, 2019

Additional references just from today for "incorrect"/outdated ValidateFunc implementations:

Personally, I think the operator explicitly signaling to disable the validation via configuration is a pragmatic compromise for any potential unexpected behavior such as errors/crashes.

@Haegin
Copy link

Haegin commented Dec 16, 2019

I'd also find this useful for working around hashicorp/terraform-provider-aws#10958 where the AWS API now accepts new values with different behaviour and the terraform provider hasn't been updated but they're needed for our application. Currently I have to configure those settings manually and use ignore_changes to prevent overwriting them. Being able to disable the validation would mean I could remove this manual step from my provisioning process and reduce the chance of errors.

@bflad
Copy link
Contributor

bflad commented Dec 18, 2019

Additional recent references in the Terraform AWS Provider:

@k0eff
Copy link

k0eff commented Mar 26, 2021

Additional use case regarding not-so-recently announced multi-attach option to the AWS EBS type "io2":

  • terraform-provider-aws/issues/16847: AWS API enhanced availability of the "multi_attach_enabled" option (to include ebs type "io2"), but internal provider validation causes error "Error: 'multi_attach_enabled' must not be set when 'type' is 'io2'"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request upstream-terraform
Projects
None yet
Development

No branches or pull requests

7 participants