[REGRESSION] v3.13.0 causes perpetual diff of resources containing lambda version number #15952

Closed
lijok opened this issue Oct 31, 2020 · 13 comments
Labels

  • bug: Addresses a defect in current functionality.
  • service/lambda: Issues and PRs that pertain to the lambda service.
  • upstream-terraform: Addresses functionality related to the Terraform core binary.

Comments

@lijok
lijok commented Oct 31, 2020

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Affected Resource(s)

  • aws_lambda_function

I'm constantly getting diffs for Lambdas that have a VPC config. No matter how many times I run apply, the diff looks like this (trimmed a bit; the only changes are always version and qualified_arn):

 # module.a326_us_east_1.module.summary.aws_lambda_function.this will be updated in-place
  ~ resource "aws_lambda_function" "this" {
        arn                            = "arn:aws:lambda:us-east-1:272785785785:function:A326_summary"
        description                    = "Managed by Terraform"
        function_name                  = "A326_summary"
        handler                        = "function.handler"
        id                             = "A326_summary"
        invoke_arn                     = "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:272785785785:function:A326_summary/invocations"
        last_modified                  = "2020-10-31T03:33:04.656+0000"
        layers                         = []
        memory_size                    = 512
        publish                        = true
      ~ qualified_arn                  = "arn:aws:lambda:us-east-1:272785785785:function:A326_summary:3" -> (known after apply)
        reserved_concurrent_executions = 10
        role                           = "arn:aws:iam::272785785785:role/A326_summary20201006163630280800000003"
        runtime                        = "python3.8"
        timeout                        = 900
      ~ version                        = "3" -> (known after apply)
    }

Code (you will need to provide your own S3 bucket and key)

terraform {
  backend local {}
}

terraform {
  required_version = "= 0.13.5"

  required_providers {
    aws = {
      version = "= 3.13.0"
      source  = "hashicorp/aws"
    }
  }
}

provider aws { region = "us-east-1" }

resource aws_vpc this {
  cidr_block                       = "10.1.0.0/16"
  instance_tenancy                 = "default"
  enable_dns_support               = true
  enable_dns_hostnames             = true
  enable_classiclink               = false
  enable_classiclink_dns_support   = false
  assign_generated_ipv6_cidr_block = false
}

resource aws_subnet this {
  vpc_id                  = aws_vpc.this.id
  cidr_block              = "10.1.0.0/24"
  availability_zone       = "us-east-1a"
}

resource aws_default_security_group default {
  vpc_id                 = aws_vpc.this.id
  revoke_rules_on_delete = true

  ingress {
    protocol    = -1 # ALL
    self        = true
    from_port   = 0
    to_port     = 0
  }

  egress {
    protocol    = -1 # ALL
    cidr_blocks = ["0.0.0.0/0"]
    from_port   = 0
    to_port     = 0
  }
}

resource aws_lambda_function this {
  function_name                  = "test"
  s3_bucket                      = "packages"
  s3_key                         = "A365/16793c285953c0f0e34e28d1858f19599f519cd3.zip"
  handler                        = "test"
  role                           = aws_iam_role.this.arn
  memory_size                    = 128
  runtime                        = "python3.8"
  timeout                        = 5
  reserved_concurrent_executions = 10
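  publish                        = true # publish a new version whenever the function changes

  # The perpetual diff only appears when this block is present.
  vpc_config {
    security_group_ids = [aws_default_security_group.default.id]
    subnet_ids         = [aws_subnet.this.id]
  }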
}

data aws_iam_policy_document permissions {
  statement {
    effect = "Allow"
    actions = [
      "ec2:DescribeNetworkInterfaces",
      "ec2:CreateNetworkInterface",
      "ec2:DeleteNetworkInterface"
    ]
    resources = [
      "*"
    ]
  }
}

data aws_iam_policy_document assume_role {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]
    principals {
      type = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource aws_iam_policy this {
  name_prefix = "test"
  policy      = data.aws_iam_policy_document.permissions.json
}

resource aws_iam_role this {
  name_prefix          = "test"
  assume_role_policy   = data.aws_iam_policy_document.assume_role.json
}

resource aws_iam_role_policy_attachment this {
  role       = aws_iam_role.this.name
  policy_arn = aws_iam_policy.this.arn
}

Steps

  1. terraform init
  2. terraform apply
  3. terraform plan

Note

This only happens when the Lambda has a vpc_config block.

@ghost added the service/apigatewayv2 and service/lambda labels Oct 31, 2020
@github-actions bot added the needs-triage label Oct 31, 2020
@bill-rich added the bug and good first issue labels and removed the needs-triage and service/apigatewayv2 labels Nov 2, 2020
@gdavison self-assigned this Nov 3, 2020

@gdavison
Contributor

gdavison commented Nov 3, 2020

Hi @lijok, thanks for raising this issue. Can you please provide the Terraform configuration that is causing the diffs? It will make it easier to reproduce the problem.

@lijok
Author

lijok commented Nov 4, 2020

Thanks for having a look at this, @gdavison.
The config is fairly nested (multiple levels of external module calls).
I'll see if I can recreate this in a new environment and post the code tomorrow.

@lijok
Author

lijok commented Nov 6, 2020

@gdavison luckily we've got a ton of Lambdas running, so I noticed that this only happens with Lambdas that have a VPC config.
I've added a minimal Terraform config example to the original post. You will, however, need to provide an S3 bucket and key for a Lambda package.
Let me know if there's anything else I can help with on this.

@maciejp-ro

I experience the same problem. I also have deeply nested modules and it would be hard to extract a minimal example, but I did some debugging. Not sure if it helps with the fix, but here's what I found:

After looking at #15121 I noticed that (both here and for me) qualified_arn and version are reported as changed, but last_modified is not. This would suggest that in updateComputedAttributesOnPublish, configChanged is true but functionCodeUpdated is not (and that nothing actually gets pushed on apply, because the version number does not increase afterwards).
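
The reasoning above can be sketched as a small standalone Go model. This is a simplified illustration inferred from this comment, not the provider's actual source: planInputs, anyChanged and computedAttributes are hypothetical names, and the rule that a change to publish also forces a new version is an assumption based only on the publishChanged flag that shows up in the debug output.

```go
package main

import "fmt"

// planInputs holds the four flags discussed in this thread.
// Hypothetical type, for illustration only.
type planInputs struct {
	publish             bool
	publishChanged      bool
	configChanged       bool
	functionCodeUpdated bool
}

// anyChanged models the idea behind hasConfigChanges: one attribute
// reporting a change is enough to flag the whole configuration.
func anyChanged(changed map[string]bool) bool {
	for _, c := range changed {
		if c {
			return true
		}
	}
	return false
}

// computedAttributes lists the attributes that would be planned as
// "(known after apply)" under the assumed rules.
func computedAttributes(in planInputs) []string {
	var out []string
	if in.functionCodeUpdated {
		out = append(out, "last_modified")
	}
	if in.publish && (in.publishChanged || in.configChanged || in.functionCodeUpdated) {
		out = append(out, "version", "qualified_arn")
	}
	return out
}

func main() {
	// A false positive on vpc_config alone is enough to trigger the diff.
	changed := map[string]bool{"description": false, "vpc_config": true}
	in := planInputs{publish: true, configChanged: anyChanged(changed)}
	fmt.Println(computedAttributes(in)) // [version qualified_arn]
}
```

Under these assumptions, a spurious change on a single attribute marks version and qualified_arn as unknown while leaving last_modified alone, which matches the plan in the original post.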

I added some debug logs in updateComputedAttributesOnPublish and hasConfigChanges, rebuilt the provider locally, and here are the results:

2020-11-06T21:18:14.489+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:18:14 [DEBUG] LAMBDA15121 hasConfigChanges: description=%!s(bool=false) handler=%!s(bool=false) file_system_config=%!s(bool=false) memory_size=%!s(bool=false) role=%!s(bool=false) timeout=%!s(bool=false) kms_key_arn=%!s(bool=false) layers=%!s(bool=false) dead_letter_config=%!s(bool=false) tracing_config=%!s(bool=false) vpc_config=%!s(bool=true) runtime=%!s(bool=false) environment=%!s(bool=false)
2020-11-06T21:18:14.490+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:18:14 [DEBUG] LAMBDA15121 updateComputedAttributesOnPublish: publish=%!s(bool=true) publishChanged=%!s(bool=false) configChanged=%!s(bool=true) functionCodeUpdated=%!s(bool=false)
2020-11-06T21:18:53.134+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:18:53 [DEBUG] LAMBDA15121 hasConfigChanges: description=%!s(bool=false) handler=%!s(bool=false) file_system_config=%!s(bool=false) memory_size=%!s(bool=false) role=%!s(bool=false) timeout=%!s(bool=false) kms_key_arn=%!s(bool=false) layers=%!s(bool=false) dead_letter_config=%!s(bool=false) tracing_config=%!s(bool=false) vpc_config=%!s(bool=true) runtime=%!s(bool=false) environment=%!s(bool=false)
2020-11-06T21:18:53.135+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:18:53 [DEBUG] LAMBDA15121 updateComputedAttributesOnPublish: publish=%!s(bool=true) publishChanged=%!s(bool=false) configChanged=%!s(bool=true) functionCodeUpdated=%!s(bool=false)
2020-11-06T21:18:53.141+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:18:53 [DEBUG] LAMBDA15121 hasConfigChanges: description=%!s(bool=false) handler=%!s(bool=false) file_system_config=%!s(bool=false) memory_size=%!s(bool=false) role=%!s(bool=false) timeout=%!s(bool=false) kms_key_arn=%!s(bool=false) layers=%!s(bool=false) dead_letter_config=%!s(bool=false) tracing_config=%!s(bool=false) vpc_config=%!s(bool=true) runtime=%!s(bool=false) environment=%!s(bool=false)

As suspected, hasConfigChanges returns true because it registers a change in vpc_config. The VPC config does not actually change (and is not marked as changed in the plan). The block is generated dynamically:

variable "subnet_ids" { default = [] }
variable "security_group_ids" { default = [] }

resource "aws_lambda_function" "main" {
  # …
  dynamic "vpc_config" {
    for_each = length(var.subnet_ids) + length(var.security_group_ids) > 0 ? [""] : []
    content {
      subnet_ids         = var.subnet_ids
      security_group_ids = var.security_group_ids
    }
  }
}

In this case, both subnet_ids and security_group_ids are provided. Instances of this module where both variables are left empty (so the dynamic "vpc_config" block is not generated) do not show this diff.

FWIW, extra debug when subnet_ids and security_group_ids are not provided looks like this:

2020-11-06T21:26:23.706+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:26:23 [DEBUG] LAMBDA15121 hasConfigChanges: description=%!s(bool=false) handler=%!s(bool=false) file_system_config=%!s(bool=false) memory_size=%!s(bool=false) role=%!s(bool=false) timeout=%!s(bool=false) kms_key_arn=%!s(bool=false) layers=%!s(bool=false) dead_letter_config=%!s(bool=false) tracing_config=%!s(bool=false) vpc_config=%!s(bool=false) runtime=%!s(bool=false) environment=%!s(bool=false)
2020-11-06T21:26:23.706+0100 [DEBUG] plugin.terraform-provider-aws: 2020/11/06 21:26:23 [DEBUG] LAMBDA15121 updateComputedAttributesOnPublish: publish=%!s(bool=true) publishChanged=%!s(bool=false) configChanged=%!s(bool=false) functionCodeUpdated=%!s(bool=false)
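
A side note for reading these logs: the %!s(bool=false) noise is what Go's fmt package prints when the %s verb is given a bool, so the values themselves are still trustworthy. A minimal sketch of the difference, with hypothetical function names (the value is passed as interface{} only so the verb mismatch is left to run time):

```go
package main

import "fmt"

// withStringVerb formats a flag with %s, the verb that produces the
// "%!s(bool=...)" output seen in the debug logs.
func withStringVerb(v interface{}) string {
	return fmt.Sprintf("vpc_config=%s", v)
}

// withBoolVerb formats the same flag with %t, the verb meant for booleans.
func withBoolVerb(v bool) string {
	return fmt.Sprintf("vpc_config=%t", v)
}

func main() {
	fmt.Println(withStringVerb(true)) // vpc_config=%!s(bool=true)
	fmt.Println(withBoolVerb(true))   // vpc_config=true
}
```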

I'm afraid my ability to debug the plugin ends here; I hope this helps. If there's any other debugging info I can add that would help with diagnosing or fixing this, let me know.

@maciejp-ro

Looking closely through the logs I also found the following:

2020/11/06 21:18:53 [WARN] Provider "localhost.localdomain/tmp/aws" produced an invalid plan for module.staging.module.api.module.rotate-secrets-lambda.module.lambda.aws_lambda_function.main, but we are tolerating it because it is using the legacy plugin SDK.
    The following problems may be the cause of any confusing errors from downstream operations:
      - .tracing_config: block count in plan (1) disagrees with count in config (0)

@grahamhar
Contributor

Could this also cause issues with other nested config blocks, such as dead_letter_config and file_system_config?

@gdavison
Contributor

Thanks to @maciejp-ro's PR, I've identified the problem: when we test an attribute for changes, the check always incorrectly reports a change if the attribute is a nested structure containing a TypeSet element. This is tracked in hashicorp/terraform-plugin-sdk#617.

@gdavison added the upstream-terraform label and removed the good first issue label Nov 11, 2020

@lijok
Author

lijok commented Jan 6, 2021

@gdavison does anyone know when this might be fixed?
This bug has us version-locked to 3.12.0.

@gdavison
Contributor

gdavison commented Jan 8, 2021

@lijok, this is due to an issue in the Terraform Plugin SDK that the AWS Provider uses. The issue to track in that project is hashicorp/terraform-plugin-sdk#617. Once that issue is resolved and we merge that version of the Plugin SDK, this issue will be resolved as well.

@deyvsh

deyvsh commented Apr 28, 2021

@gdavison the comment threads on hashicorp/terraform-plugin-sdk#617 and hashicorp/terraform-plugin-sdk#643 now suggest that this might be best fixed in this repo after all, by @grahamhar's #17610.

mskrajnowski added a commit to codequest-eu/terraform-modules that referenced this issue Jun 10, 2021
* feat(lambda): added image input

This reverts commit 746c740.

* ci: lambda requires aws provider 3.19

* fix(lambda): set aws_lambda_function package_type

* ci: lambda requires aws provider 3.19

* fix(lambda): don't pass layers, handler or runtime when using a container image

* refactor(lambda): add vpc_config block only when needed

minimizes the impact of hashicorp/terraform-provider-aws#15952

* feat(lambda): added publish variable

so you can disable publishing versions, eg. as a workaround for the perpetual diff problem

* feat(rds/postgres/management_lambda): disabled lambda version publishing

as a workaround for the perpetual diff problem

* docs: example lambdas should no longer be affected by perpetual diffs

@gdavison
Contributor

gdavison commented Jul 6, 2022

This is resolved in the latest version of the provider, so I'm going to close this issue.

@gdavison closed this as completed Jul 6, 2022

@github-actions

github-actions bot commented Aug 5, 2022

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 5, 2022
6 participants