
Terraform forcing replacement of ECS task definition - (only on task definitions with mount points) #11526

Open
ghost opened this issue Jan 8, 2020 · 24 comments
Labels
bug Addresses a defect in current functionality. service/ecs Issues and PRs that pertain to the ecs service.

Comments

@ghost

ghost commented Jan 8, 2020

This issue was originally opened by @im-lcoupe as hashicorp/terraform#23780. It was migrated here as a result of the provider split. The original body of the issue is below.


Summary

Hi there,

So this only seems to have become a problem since upgrading my code to the latest version - and strangely it only seems to happen on the task definitions with mount points (even though their format hasn't changed).

Terraform will constantly try to replace the two task definitions regardless of whether any changes have been made to them.

Any guidance on this would be greatly appreciated, as it means the task definition revision changes on every run (which of course is not ideal).

Terraform Version

0.12.18

Terraform Configuration Files - first problematic task definition, followed by the second

[
  {
    "name": "${container_name_nginx}",
    "image": "${container_image_nginx}",
    "memory": ${container_memory},
    "cpu": ${container_cpu},
    "networkMode": "awsvpc",
    "volumesFrom": [],
    "essential": true,
    "portMappings": [
      {
        "containerPort": 443,
        "hostPort": 443,
        "protocol": "tcp"
      }
    ],
    "mountPoints": [
     {
    "readOnly": false,
    "containerPath": "/var/www/symfony/var/log",
    "sourceVolume": "shared_symfony_logs"
     },
     {
     "readOnly": false,
     "containerPath": "/var/log/nginx",
     "sourceVolume": "shared_nginx_logs"
     }
    ],
    "environment": [
      {
        "name": "${env_var_1_name}",
        "value": "${env_var_value_1}"
      },
      {
        "name": "${env_var_2_name}",
        "value": "${env_var_value_2}"
      },
      {
        "name": "${env_var_3_name}",
        "value": "${env_var_value_3}"
      }
    ],
  "logConfiguration" : {
    "logDriver" : "awslogs",
    "options" :{
      "awslogs-create-group": "true",
      "awslogs-group": "${container_name_nginx}",
      "awslogs-region": "${platform_region}",
      "awslogs-stream-prefix": "ecs"
    }
  }
},

{
  "name": "${container_name_php}",
  "image": "${container_image_php}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "portMappings": [
    {
      "containerPort": 9000,
      "hostPort": 9000,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_symfony_logs"
   },
   {
   "readOnly": false,
   "containerPath": "/var/log/nginx",
   "sourceVolume": "shared_nginx_logs"
   }
  ],
  "environment": [
    {
      "name": "${env_var_4_name}",
      "value": "${env_var_value_4}"
    },
    {
      "name": "${env_var_5_name}",
      "value": "${env_var_value_5}"
    },
    {
      "name": "${env_var_6_name}",
      "value": "${env_var_value_6}"
    },
    {
      "name": "${env_var_7_name}",
      "value": "${env_var_value_7}"
    },
    {
      "name": "${env_var_8_name}",
      "value": "${env_var_value_8}"
    },
    {
      "name": "${env_var_10_name}",
      "value": "${env_var_value_10}"
    },
    {
      "name": "${env_var_11_name}",
      "value": "${env_var_value_11}"
    },
    {
      "name": "${env_var_12_name}",
      "value": "${env_var_value_12}"
    },
    {
      "name": "${env_var_13_name}",
      "value": "${env_var_value_13}"
    },
    {
      "name": "${env_var_14_name}",
      "value": "${env_var_value_14}"
    },
    {
      "name": "${env_var_15_name}",
      "value": "${env_var_value_15}"
    },
    {
      "name": "${env_var_16_name}",
      "value": "${env_var_value_16}"
    },
    {
      "name": "${env_var_17_name}",
      "value": "${env_var_value_17}"
    },
    {
      "name": "${env_var_18_name}",
      "value": "${env_var_value_18}"
    },
    {
      "name": "${env_var_19_name}",
      "value": "${env_var_value_19}"
    },
    {
      "name": "${env_var_20_name}",
      "value": "${env_var_value_20}"
    },
    {
      "name": "${env_var_21_name}",
      "value": "${env_var_value_21}"
    },
    {
      "name": "${env_var_22_name}",
      "value": "${env_var_value_22}"
    },
    {
      "name": "${env_var_1_name}",
      "value": "${env_var_value_1}"
    },
    {
      "name": "${env_var_28_name}",
      "value": "${env_var_value_28}"
    },
    {
      "name": "${env_var_29_name}",
      "value": "${env_var_value_29}"
    },
    {
      "name": "${env_var_30_name}",
      "value": "${env_var_value_30}"
    },
    {
      "name": "${env_var_31_name}",
      "value": "${env_var_value_31}"
    },
    {
      "name": "${env_var_32_name}",
      "value": "${env_var_value_32}"
    },
    {
      "name": "${env_var_33_name}",
      "value": "${env_var_value_33}"
    },
    {
      "name": "${env_var_34_name}",
      "value": "${env_var_value_34}"
    },
    {
      "name": "${env_var_35_name}",
      "value": "${env_var_value_35}"
    },
    {
      "name": "${env_var_36_name}",
      "value": "${env_var_value_36}"
    },
    {
      "name": "${env_var_37_name}",
      "value": "${env_var_value_37}"
    },
    {
      "name": "${env_var_38_name}",
      "value": "${env_var_value_38}"
    }
  ],
  "secrets":[
    {
      "name":"${sensitive_var_1}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_1}"
    },
    {
      "name":"${sensitive_var_2}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_2}"
    },
    {
      "name":"${sensitive_var_3}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_3}"
    },
    {
      "name":"${sensitive_var_4}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_4}"
    },
    {
      "name":"${sensitive_var_5}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_5}"
    },
    {
      "name":"${sensitive_var_6}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_6}"
    },
    {
      "name":"${sensitive_var_7}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_7}"
    },
    {
      "name":"${sensitive_var_8}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_8}"
    },
    {
      "name":"${sensitive_var_9}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_9}"
    },
    {
      "name":"${sensitive_var_10}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_10}"
    },
    {
      "name":"${sensitive_var_11}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_11}"
    },
    {
      "name":"${sensitive_var_12}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_12}"
    },
    {
      "name":"${sensitive_var_13}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_13}"
    }
  ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_php}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
  }
}
},


{
  "name": "${container_name_logstash}",
  "image": "${container_image_logstash}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "portMappings": [
    {
      "containerPort": 9600,
      "hostPort": 9600,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_symfony_logs"
   },
   {
   "readOnly": false,
   "containerPath": "/var/log/nginx",
   "sourceVolume": "shared_nginx_logs"
   }
 ],
 "environment": [
   {
     "name": "${env_var_23_name}",
     "value": "${env_var_value_23}"
   },
   {
     "name": "${env_var_1_name}",
     "value": "${env_var_value_1}"
   }
 ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_logstash}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
    }
  }
}
]

Second Task definition

[
{
  "name": "${container_name_php}",
  "image": "${container_image_php}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "command": ["/usr/bin/supervisord", "-c", "/etc/supervisor/conf.d/supervisord.conf"],
  "portMappings": [
    {
      "containerPort": 9001,
      "hostPort": 9001,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_worker_logs"
   }
 ],
  "environment": [
    {
      "name": "${env_var_4_name}",
      "value": "${env_var_value_4}"
    },
    {
      "name": "${env_var_5_name}",
      "value": "${env_var_value_5}"
    },
    {
      "name": "${env_var_6_name}",
      "value": "${env_var_value_6}"
    },
    {
      "name": "${env_var_7_name}",
      "value": "${env_var_value_7}"
    },
    {
      "name": "${env_var_8_name}",
      "value": "${env_var_value_8}"
    },
    {
      "name": "${env_var_10_name}",
      "value": "${env_var_value_10}"
    },
    {
      "name": "${env_var_11_name}",
      "value": "${env_var_value_11}"
    },
    {
      "name": "${env_var_12_name}",
      "value": "${env_var_value_12}"
    },
    {
      "name": "${env_var_13_name}",
      "value": "${env_var_value_13}"
    },
    {
      "name": "${env_var_14_name}",
      "value": "${env_var_value_14}"
    },
    {
      "name": "${env_var_15_name}",
      "value": "${env_var_value_15}"
    },
    {
      "name": "${env_var_16_name}",
      "value": "${env_var_value_16}"
    },
    {
      "name": "${env_var_17_name}",
      "value": "${env_var_value_17}"
    },
    {
      "name": "${env_var_18_name}",
      "value": "${env_var_value_18}"
    },
    {
      "name": "${env_var_19_name}",
      "value": "${env_var_value_19}"
    },
    {
      "name": "${env_var_20_name}",
      "value": "${env_var_value_20}"
    },
    {
      "name": "${env_var_21_name}",
      "value": "${env_var_value_21}"
    },
    {
      "name": "${env_var_22_name}",
      "value": "${env_var_value_22}"
    },
    {
      "name": "${env_var_1_name}",
      "value": "${env_var_value_1}"
    },
    {
      "name": "${env_var_28_name}",
      "value": "${env_var_value_28}"
    },
    {
      "name": "${env_var_29_name}",
      "value": "${env_var_value_29}"
    },
    {
      "name": "${env_var_30_name}",
      "value": "${env_var_value_30}"
    },
    {
      "name": "${env_var_31_name}",
      "value": "${env_var_value_31}"
    },
    {
      "name": "${env_var_32_name}",
      "value": "${env_var_value_32}"
    },
    {
      "name": "${env_var_33_name}",
      "value": "${env_var_value_33}"
    },
    {
      "name": "${env_var_34_name}",
      "value": "${env_var_value_34}"
    },
    {
      "name": "${env_var_35_name}",
      "value": "${env_var_value_35}"
    },
    {
      "name": "${env_var_36_name}",
      "value": "${env_var_value_37}"
    },
    {
      "name": "${env_var_37_name}",
      "value": "${env_var_value_37}"
    },
    {
      "name": "${env_var_38_name}",
      "value": "${env_var_value_38}"
    }
  ],
  "secrets":[
    {
      "name":"${sensitive_var_1}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_1}"
    },
    {
      "name":"${sensitive_var_2}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_2}"
    },
    {
      "name":"${sensitive_var_3}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_3}"
    },
    {
      "name":"${sensitive_var_4}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_4}"
    },
    {
      "name":"${sensitive_var_5}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_5}"
    },
    {
      "name":"${sensitive_var_6}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_6}"
    },
    {
      "name":"${sensitive_var_7}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_7}"
    },
    {
      "name":"${sensitive_var_8}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_8}"
    },
    {
      "name":"${sensitive_var_9}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_9}"
    },
    {
      "name":"${sensitive_var_10}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_10}"
    },
    {
      "name":"${sensitive_var_11}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_11}"
    },
    {
      "name":"${sensitive_var_12}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_12}"
    },
    {
      "name":"${sensitive_var_13}",
      "valueFrom": "arn:aws:ssm:${platform_region}:${aws_account_number}:parameter/${product}/${service_1}/${environment}/${sensitive_var_13}"
    }
  ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_php}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
  }
}
},


{
  "name": "${container_name_logstash}",
  "image": "${container_image_logstash}",
  "memory": ${container_memory},
  "cpu": ${container_cpu},
  "networkMode": "awsvpc",
  "volumesFrom": [],
  "essential": true,
  "portMappings": [
    {
      "containerPort": 9600,
      "hostPort": 9600,
      "protocol": "tcp"
    }
  ],
  "mountPoints": [
   {
  "readOnly": false,
  "containerPath": "/var/www/symfony/var/log",
  "sourceVolume": "shared_worker_logs"
   }
 ],
 "environment": [
   {
     "name": "${env_var_23_name}",
     "value": "${env_var_value_23}"
   },
   {
     "name": "${env_var_1_name}",
     "value": "${env_var_value_1}"
   }
 ],
"logConfiguration" : {
  "logDriver" : "awslogs",
  "options" :{
    "awslogs-create-group": "true",
    "awslogs-group": "${container_name_logstash}",
    "awslogs-region": "${platform_region}",
    "awslogs-stream-prefix": "ecs"
    }
  }
}
]

Example plan output - to me it isn't clear what exactly it needs to change that requires a forced replacement. It appears to remove vars that are already there and then re-add them, and in other places it seems to reorder them. It also appears to be adding the network mode (awsvpc), which is already defined in the task definition.

      ~ container_definitions    = jsonencode(
          ~ [ # forces replacement
              ~ {
                    cpu              = 512
                  ~ environment      = [
                      - {
                          - name  = "PHP_FPM_PORT"
                          - value = "9000"
                        },
                      - {
                          - name  = "PHP_FPM_HOST"
                          - value = "localhost"
                        },
                        {
                            name  = "APP_ENV"
                            value = "prod"
                        },
                      + {
                          + name  = "PHP_FPM_HOST"
                          + value = "localhost"
                        },
                      + {
                          + name  = "PHP_FPM_PORT"
                          + value = "9000"
                        },
                    ]
                    essential        = true
                    image            = "###########################:latest"
                    logConfiguration = {
                        logDriver = "awslogs"
                        options   = {
                            awslogs-create-group  = "true"
                            awslogs-group         = "###############"
                            awslogs-region        = "eu-west-1"
                            awslogs-stream-prefix = "ecs"
                        }
                    }
                    memory           = 1024
                    mountPoints      = [
                        {
                            containerPath = "/var/www/symfony/var/log"
                            readOnly      = false
                            sourceVolume  = "shared_symfony_logs"
                        },
                        {
                            containerPath = "/var/log/nginx"
                            readOnly      = false
                            sourceVolume  = "shared_nginx_logs"
                        },
                    ]
                    name             = "##################"
                  + networkMode      = "awsvpc"
                    portMappings     = [
                        {
                            containerPort = 443
                            hostPort      = 443
                            protocol      = "tcp"
                        },
                    ]
                    volumesFrom      = []
                } # forces replacement,

Expected Behavior

Terraform should not try to replace the task definitions on every plan.

Actual Behavior

Terraform forces replacement of the task definition on every plan.

Steps to Reproduce

Terraform plan

@moyuanhuang
Contributor

I'm having the same issue. I think it's ordering the custom environment variables, plus adding some default configurations if you don't already have them in your task definition. In my case I had to add these default options and reorder the environment variables according to the diff output.

It'd be nice if Terraform could

  1. compare the environment variables as a real hash (so that the order doesn't matter)
  2. avoid updating the task definitions because of the absence of some default variables.

@reedflinch

Going off of @moyuanhuang I also suspect the issue is the ordering of environment variables. I do NOT see the issue with secrets. One thing to note for my use case is I am changing the image of the task definition. So I do expect a new task definition to be created with the new image, but I do not expect to see a diff for unchanging environment variables.

This makes evaluating diffs for task definitions extremely difficult.

I notice that the AWS API + CLI do return these arrays in a consistent order (from what I can see), so perhaps this is something that Terraform or the provider itself is doing.

@LeComptoirDesPharmacies

LeComptoirDesPharmacies commented Mar 27, 2020

Hi, I'm having the same issue without mount points.
In addition to the reordering of custom environment variables, I have some variables set to null which cause Terraform to recreate the task definition.

Example with a Docker health check:

~ healthCheck      = {
                        command     = [
                            "CMD-SHELL",
                            "agent health",
                        ]
                        interval    = 15
                        retries     = 10
                        startPeriod = 15
                      - timeout     = 5 -> null
                    }

@moyuanhuang
Contributor

@LeComptoirDesPharmacies That probably means there's a default value of 5 for this particular config. However, because you don't specify that config in your task definition, Terraform thinks you're trying to set it to null (which is never going to happen, since there is an enforced default). Add

timeout = 5

...to your task definition and you should be able to avoid terraform recreating the task.

@LeComptoirDesPharmacies

LeComptoirDesPharmacies commented Mar 27, 2020

@moyuanhuang Yes, thanks.
But I have the problem with custom environment variables too.
I found this fix which is waiting to be merged:

@jonesmac

I found that alphabetizing my env variables by name seems to keep it out of the plan. I noticed that the ECS task definition stores them that way in the JSON output in the console.
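
For anyone who wants to avoid hand-sorting, a minimal sketch (variable names and values here are made up for illustration) of building the environment list from a map so it always renders in the alphabetized order ECS stores:

locals {
  # Hypothetical plain-text environment variables for one container.
  app_env = {
    APP_ENV      = "prod"
    PHP_FPM_HOST = "localhost"
    PHP_FPM_PORT = "9000"
  }

  # sort(keys(...)) makes the alphabetical ordering explicit, so the rendered
  # JSON matches what the ECS API returns and the diff stays empty.
  app_environment = [
    for name in sort(keys(local.app_env)) : {
      name  = name
      value = local.app_env[name]
    }
  ]
}

# local.app_environment can then be passed into jsonencode() when building the
# container definition, instead of maintaining the order by hand.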

@DrFaust92 DrFaust92 added the service/ecs Issues and PRs that pertain to the ecs service. label May 21, 2020
@Drewster727

I can confirm that, as a workaround, alphabetizing as @jonesmac mentioned and adding in the items with the default values that Terraform thinks have changed will resolve this.

@aashitvyas

I have also been hit by this.
The workarounds suggested in this thread (1. ensure environment variables are in alphabetical order; 2. ensure all default values are filled in with their null/empty values) worked for us for now.
I still believe this is a Terraform AWS provider issue and a bug that should be addressed.

@gRizzlyGR

gRizzlyGR commented Jul 1, 2021

The same thing happened using the FluentBit log router. After adding its container definition, Terraform was forcing a replacement on every plan, even without touching anything:

~ {
    - cpu = 0 -> null
    - environment           = [] -> null
    - mountPoints           = [] -> null
    - portMappings          = [] -> null
    - user= "0" -> null
    - volumesFrom           = [] -> null
      # (6 unchanged elements hidden)
  },

After explicitly setting these values in the code, no more changes. Here's the FluentBit container definition:

{
  essential = true,
  image = var.fluentbit_image_url,
  name  = "log_router",
  firelensConfiguration = {
    type = "fluentbit"
  },
  logConfiguration : {
    logDriver = "awslogs",
    options = {
      awslogs-group = "firelens-container",
      awslogs-region= var.region,
      awslogs-create-group  = "true",
      awslogs-stream-prefix = "firelens"
    }
  },
  memoryReservation = var.fluentbit_memory_reservation
  cpu   = 0
  environment   = []
  mountPoints   = []
  portMappings  = []
  user  = "0"
  volumesFrom   = []
}

@justinretzolk
Member

Hey y'all 👋 Thank you for taking the time to file this issue, and for the continued discussion around it. Given that there's been a number of AWS provider releases since the last update here, can anyone confirm if you're still experiencing this behavior?

@justinretzolk justinretzolk added waiting-response Maintainers are waiting on response from community or contributor. and removed needs-triage Waiting for first response or review from a maintainer. labels Nov 18, 2021
@camway

camway commented Nov 30, 2021

@justinretzolk I can tell you this is still an issue. We've been hitting it for almost a year now, and I've made a concerted effort over the last week or so to address it. These are our versions (should be the latest at the time of posting this):

Terraform v1.0.11
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.64.2
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/time v0.7.2
+ provider registry.terraform.io/hashicorp/tls v3.1.0

Still actively going through the workarounds above trying to get this to work correctly, but so far no dice.

@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Nov 30, 2021
@aashitvyas

@justinretzolk This issue still persists in the latest AWS provider version and definitely still needs to be addressed.

@camway

camway commented Dec 29, 2021

It's been a little while since I last posted. I've attempted to fix this a few times since my last post, but I've had no success so far. To provide a little more information, this is one of the ECS tasks that's being affected (sorry for the sanitization):

-/+ resource "aws_ecs_task_definition" "task" {
      ~ arn                      = "arn:aws:ecs:AWS_REGION:AWS_ACCOUNT_ID:task-definition/AWS_ECS_TASK_NAME:317" -> (known after apply)
      ~ container_definitions    = jsonencode(
            [
              - {
                  - cpu                   = 0
                  - dockerLabels          = {
                      - traefik.frontend.entryPoints    = "https"
                      - traefik.frontend.passHostHeader = "true"
                      - traefik.frontend.rule           = "Host:MY_DNS_NAME"
                      - traefik.protocol                = "https"
                    }
                  - environment           = [
                      - {
                          - name  = "APP_PORT"
                          - value = "54321"
                        },
                    ]
                  - essential             = true
                  - image                 = "DOCKER_REPOR_URL:DOCKER_TAG"
                  - logConfiguration      = {
                      - logDriver = "awslogs"
                      - options   = {
                          - awslogs-group         = "AWS_LOGS_GROUP"
                          - awslogs-region        = "AWS_REGION"
                          - awslogs-stream-prefix = "AWS_STREAM_PREFIX"
                        }
                    }
                  - mountPoints           = [
                      - {
                          - containerPath = "/PATH/IN/CONTAINER/"
                          - sourceVolume  = "EFS_NAME"
                        },
                      - {
                          - containerPath = "/PATH/IN/CONTAINER"
                          - sourceVolume  = "EFS_NAME"
                        },
                    ]
                  - name                  = "SERVICE_NAME"
                  - portMappings          = [
                      - {
                          - containerPort = 443
                          - hostPort      = 443
                          - protocol      = "tcp"
                        },
                    ]
                  - repositoryCredentials = {
                      - credentialsParameter = "arn:aws:secretsmanager:AWS_REGION:AWS_ACCOUNT_ID:secret:SECRET_VERSION"
                    }
                  - startTimeout          = 120
                  - stopTimeout           = 120
                  - volumesFrom           = []
                },
            ]
        ) -> (known after apply) # forces replacement

This is part of the plan output immediately after an apply. One attempt I made recently was focused on getting 'cpu' above to stop appearing. Adding "cpu": 0 to the JSON for the container definition and reapplying has zero effect on the diff for future plans/applies.

Not sure what I'm doing wrong, but at this point we've begun dancing around the issue by using -target= during terraform applies so that it doesn't update all the time.

@nutakkimurali

nutakkimurali commented Jan 20, 2022

Hi Everyone,

While upgrading from TF version 0.11.x to version 1.0.10, we ran into a similar issue. Though alphabetizing and setting the default values to null/empty values worked, it's a cumbersome process to refactor task definitions with so many parameters.
I believe that the revision number of the task definition is responsible for this behavior if we provision the service without pinning a revision number. Consequently, I made a few changes to the ECS service's task definition reference to capture the revision number, which helped resolve the issue.

ECS Task Definition JSON File

[
  {
    "secrets": [
      {
        "name": "NRIA_LICENSE_KEY",
        "valueFrom": "arn:aws:ssm:${xxxxxx}:${xxxxxx}:xxxxxx/portal/${xxxxx}/NewrelicKey"        
      }
    ],
    "portMappings": [],
    "cpu": 200,
    "memory": ${ram},
    "environment": [
      {
        "name": "NRIA_OVERRIDE_HOST_ROOT",
        "value": "/host"
      },
      {
        "name": "ENABLE_NRI_ECS",
        "value": "true"
      },
      {
        "name": "NRIA_PASSTHROUGH_ENVIRONMENT",
        "value": "ECS_CONTAINER_METADATA_URI,ENABLE_NRI_ECS"
      },
      {
        "name": "NRIA_VERBOSE",
        "value": "0"
      },
      {
        "name": "NRIA_CUSTOM_ATTRIBUTES",
        "value": "{\"nrDeployMethod\":\"downloadPage\"}"
      }
    ],
    "mountPoints": [
      {
        "readOnly": true,
        "containerPath": "/host",
        "sourceVolume": "host_root_fs"
      },
      {
        "readOnly": false,
        "containerPath": "/var/run/docker.sock",
        "sourceVolume": "docker_socket"
      }
    ],
    "volumesFrom": [],
    "image": "${image}",
    "essential": true,
    "readonlyRootFilesystem": false,
    "privileged": true,
    "name": "${name}",
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "${awslogs_group}",
        "awslogs-region": "${aws_region}",
        "awslogs-stream-prefix": "${name}"
      }
    }
  }
]


resource "aws_ecs_task_definition" "newrelic_infra_agent" {
  family                   = "${var.workspace}-newrelic-infra-${var.env}"
  requires_compatibilities = ["EC2"]
  network_mode             = "host"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = var.ecs_task_role_arn
  container_definitions    = data.template_file.newrelic_infra_agent.rendered
  #tags                     = "${local.tags}"

  volume  {
    name      = "host_root_fs"
    host_path = "/"
  }

  volume  {
    name      = "docker_socket"
    host_path = "/var/run/docker.sock"
  }

resource "aws_ecs_task_definition" "newrelic_infra_agent" {
  family                   = "${var.workspace}-newrelic-infra-${var.env}"
  requires_compatibilities = ["EC2"]
  network_mode             = "host"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = var.ecs_task_role_arn
  container_definitions    = data.template_file.newrelic_infra_agent.rendered

  volume  {
    name      = "host_root_fs"
    host_path = "/"
  }

  volume  {
    name      = "docker_socket"
    host_path = "/var/run/docker.sock"
  }

}
data "aws_ecs_task_definition" "newrelic_infra_agent" {
  task_definition = "${aws_ecs_task_definition.newrelic_infra_agent.family}"
  depends_on      = [aws_ecs_task_definition.newrelic_infra_agent]
}

resource "aws_ecs_service" "newrelic_infra_agent" {
  name = "${var.workspace}-newrelic-infra-${var.env}"
  cluster = aws_ecs_cluster.ecs-cluster.id
  task_definition = "${aws_ecs_task_definition.newrelic_infra_agent.family}:${max("${aws_ecs_task_definition.newrelic_infra_agent.revision}", "${data.aws_ecs_task_definition.newrelic_infra_agent.revision}")}"
  scheduling_strategy = "DAEMON"
  #tags = "${local.tags}"
  propagate_tags  = "TASK_DEFINITION"

  depends_on = [aws_ecs_task_definition.newrelic_infra_agent]
}

@justinretzolk justinretzolk added the bug Addresses a defect in current functionality. label Jan 21, 2022
@camway

camway commented Apr 15, 2022

Recently re-tested on the latest AWS provider 4.10. Issue still seems to be present.

I did find another workaround for this though. It's not great, but I think it's better than what we'd been doing. Essentially it boils down to this:

  • Add lifecycle ignores to all ECS task container definitions.
  • Add a pipeline stage that taints the ECS task definitions you want to redeploy (sketch below).

This way a standard plan/apply never causes the containers to restart (unless some other attribute changed). If you need to force a restart/redeploy, taint and then reapply.
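
A minimal sketch of that pattern (resource and file names here are invented for illustration), assuming the rendered JSON otherwise only changes when you intend it to:

resource "aws_ecs_task_definition" "app" {
  family                = "app"
  container_definitions = file("${path.module}/container-definitions.json")

  # Ignore drift in the rendered JSON so routine plans stop proposing a
  # replacement of the task definition.
  lifecycle {
    ignore_changes = [container_definitions]
  }
}

# When a redeploy is actually wanted, mark the resource for recreation from
# the pipeline and apply again, e.g.:
#   terraform taint aws_ecs_task_definition.app
#   terraform apply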

This is by no means a request for this issue, but I'm beginning to wish there was a way to override the check for resource replacement. So you could provide a sha or something, and if that changes it will update. Would make this a lot easier.

@eyalch

eyalch commented Aug 2, 2022

The same thing happened using the FluentBit log router. After adding its container definition, Terraform was forcing a replacement on every plan, even without touching anything:

~ {
    - cpu = 0 -> null
    - environment           = [] -> null
    - mountPoints           = [] -> null
    - portMappings          = [] -> null
    - user= "0" -> null
    - volumesFrom           = [] -> null
      # (6 unchanged elements hidden)
  },

After explicitly setting these values in the code, no more changes. Here's the FluentBit container definition:

{
  essential = true,
  image = var.fluentbit_image_url,
  name  = "log_router",
  firelensConfiguration = {
    type = "fluentbit"
  },
  logConfiguration : {
    logDriver = "awslogs",
    options = {
      awslogs-group = "firelens-container",
      awslogs-region= var.region,
      awslogs-create-group  = "true",
      awslogs-stream-prefix = "firelens"
    }
  },
  memoryReservation = var.fluentbit_memory_reservation
  cpu   = 0
  environment   = []
  mountPoints   = []
  portMappings  = []
  user  = "0"
  volumesFrom   = []
}

For me, adding just user = "0" to the container definition resolved this. Here's the full container definition:

{
  essential = true
  image     = "public.ecr.aws/aws-observability/aws-for-fluent-bit:stable"
  name      = "log-router"

  firelensConfiguration = {
    type    = "fluentbit"
    options = {
      enable-ecs-log-metadata = "true"
      config-file-type        = "file"
      config-file-value       = "/fluent-bit/configs/parse-json.conf"
    }
  }

  logConfiguration = {
    logDriver = "awslogs"
    options   = {
      awslogs-group         = aws_cloudwatch_log_group.api_log_group.name
      awslogs-region        = local.aws_region
      awslogs-create-group  = "true"
      awslogs-stream-prefix = "firelens"
    }
  }

  memoryReservation = 50

  user = "0"
}

@mo-saeed

I have the same issue with the latest AWS provider: 4.27.0

@ghost

ghost commented Sep 22, 2022

We also have this issue on Terraform 1.2.7 and AWS provider 4.31.0. The plan output only marks arn, container_definitions, id and revision with ~ ('known after apply'), but after container_definitions it says # forces replacement, even though the content has not changed at all. We tried sorting the JSON keys and adding default parameters, to no avail. Do we also need to format it exactly like the plan says? Because in the container_definitions it shows all the JSON keys as being deleted.

With redactions:

~ container_definitions    = jsonencode(
            [
              - {
                  - cpu               = 64
                  - environment       = [
                      - ...
                    ]
                  - essential         = true
                  - image             = ...
                  - logConfiguration  = {
                      - logDriver = ...
                      - options   = {
                          - ...
                        }
                    }
                  - memoryReservation = 64
                  - mountPoints       = []
                  - name              = ...
                  - portMappings      = [
                      - ...
                    ]
                  - volumesFrom       = []
                },
              - {
                  - cpu               = 512
                  - environment       = [
                      - ...
                    ]
                  - essential         = true
                  - image             = ...
                  - logConfiguration  = {
                      - logDriver = ...
                      - options   = {
                          - ...
                        }
                    }
                  - memoryReservation = ...
                  - mountPoints       = []
                  - name              = ...
                  - portMappings      = [
                      - ...
                    ]
                  - secrets           = [
                      - ...
                    ]
                  - volumesFrom       = []
                },
            ]

No ENV variables changed, no secrets changed and no other configuration keys have been changed.
If there is any way for me to help with debugging, please let me know.

@ghost

ghost commented Nov 25, 2022

A follow-up to my previous comment: the replacement was actually not caused by Terraform getting the diff wrong. It was caused by a variable that depended on a file (data resource) which had a depends_on on a null_resource. Even the documentation for depends_on states that Terraform is more conservative and plans to replace more resources than may strictly be needed. So in the end, the ordering and filling in all values with their defaults did work.

Our null_resource had its trigger set to always run. This of course makes the null_resource 'dirty' on every Terraform run, and I suspect that the dependent resources then also get flagged as dirty transitively.
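
For reference, a minimal sketch of the kind of dependency chain described above (resource and file names are illustrative): because the null_resource is re-created on every run, everything downstream of the depends_on is treated as unknown until apply, and the task definition gets planned for replacement.

resource "null_resource" "build" {
  # timestamp() changes on every run, so this resource is always "dirty".
  triggers = {
    always_run = timestamp()
  }
}

data "local_file" "container_definitions" {
  filename = "${path.module}/container-definitions.json"

  # With this depends_on, Terraform defers reading the file until after the
  # null_resource is re-created, so its content is "(known after apply)".
  depends_on = [null_resource.build]
}

resource "aws_ecs_task_definition" "app" {
  family = "app"
  # An unknown value here shows up in the plan as forcing replacement even
  # though the rendered JSON ends up identical.
  container_definitions = data.local_file.container_definitions.content
}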

@trallnag

Still an issue with 4.63.0. Setting values to the defaults or null helps.

@OscarGarciaF

OscarGarciaF commented Sep 8, 2023

In my case I had:

 portMappings = [
    {
      containerPort = "27017"
      protocol      = "TCP"
      hostPort      = "27017"
    },

"TCP" was being stored by ECS as "tcp", and on the second run Terraform was not smart enough to recognize that "TCP" would be normalized to "tcp", so it kept trying to replace the definition. I changed "TCP" to "tcp" and it stopped trying to replace it.
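
If the value comes from elsewhere in the configuration, normalizing it up front avoids the same perpetual diff; a small sketch, with made-up values:

locals {
  # ECS stores the protocol lowercased, so render it lowercase before encoding.
  mongo_port_mappings = [
    {
      containerPort = 27017
      hostPort      = 27017
      protocol      = lower("TCP") # renders as "tcp"
    }
  ]
}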

@adamdepollo

Still an issue with 5.16.1

@AlbertCintas

AlbertCintas commented Apr 10, 2024

In my case, the recreation was caused by the healthcheck definition. I added the default values for interval, retries and so on to the config block, and the problem was solved:

healthcheck = {
        command     = ["CMD-SHELL", "curl -f http://127.0.0.1/ || exit 1"]
        interval    = 30
        retries     = 3
        startPeriod = 5
        timeout     = 5
      }

@t0yv0
Contributor

t0yv0 commented Aug 14, 2024

I've submitted a PR for the healthcheck defaults normalization specifically: #38872
