data.aws_ecs_task_definition: Failed getting task definition #1274

Closed
sychevsky opened this issue Jul 28, 2017 · 28 comments
Labels
bug, service/ecs, stale

Comments

@sychevsky

sychevsky commented Jul 28, 2017

Terraform Version

0.9.11.

  • aws_ecs_task_definition

Terraform Configuration Files

data "aws_ecs_task_definition" "my-service" {
  task_definition = "${aws_ecs_task_definition.my-service.family}"
}

resource "aws_ecs_task_definition" "my-service" {
  family                = "${var.environment_name}-${var.service_name}-${var.instance_name}"
  network_mode          = "bridge"
  container_definitions = "${data.template_file.my-service.rendered}"
}

resource "aws_ecs_service" "my-service" {
 ...
  #Track the latest ACTIVE revision
  task_definition = "${aws_ecs_task_definition.my-service.family}:${max("${aws_ecs_task_definition.my-service.revision}", "${data.aws_ecs_task_definition.my-service.revision}")}"
...
}

Expected Behavior

If the resource does not exist, create a new aws_ecs_task_definition; otherwise, use the latest aws_ecs_task_definition revision.

This code works fine in Terraform v0.9.2.

Actual Behavior

: Failed getting task definition ClientException: Unable to describe task definition.
status code: 400, request id: "my-service"

Steps to Reproduce

  1. terraform apply
radeksimko added the bug label Jul 28, 2017
@sychevsky
Author

Also reproduced in Terraform 1.0.

@nathanielks
Contributor

I'm also experiencing the same issue! What's curious is that when attempting the search using a vanilla state (completely empty), the plan and apply work as expected. It's only when I have an existing state file that it doesn't work.

@nathanielks
Contributor

Even more curious, the resources don't exist in the statefile anyhow, and yet it fails? 🤔

@nathanielks
Contributor

Diving into debugging... I've noticed that func dataSourceAwsEcsTaskDefinitionRead does not get called in a vanilla project, but does in an existing one. This appears to be a terraform pattern. I was able to reproduce this by creating a simple resource first (a security group) then trying to perform a lookup. The plan failed when a resource was already present in a statefile (the security group in this case). I verified my hypothesis by also creating a different data source which looked up a non-existent security group. The plan for this also failed.
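
A minimal sketch of the reproduction described above, assuming the AWS provider is already configured. The resource labels and security group names are hypothetical. The first block is applied on its own so that the state file is no longer empty; the data source is then added and `terraform plan` is run again.

```hcl
# Step 1: apply this block alone so the state file contains at least one resource.
resource "aws_security_group" "existing" {
  name        = "repro-existing" # hypothetical name
  description = "Only here to make the state file non-empty"
}

# Step 2: add a lookup for something that does not exist, then run `terraform plan`.
# Per the comment above, the data source is read during the plan and the plan fails.
data "aws_security_group" "missing" {
  name = "repro-does-not-exist" # hypothetical name; no such group is assumed to exist
}
```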

@nathanielks
Contributor

nathanielks commented Aug 28, 2017

If the arguments of a data instance contain no references to computed values, such as attributes of resources that have not yet been created, then the data instance will be read and its state updated during Terraform's "refresh" phase, which by default runs prior to creating a plan. This ensures that the retrieved data is available for use during planning and the diff will show the real values obtained.

Data instance arguments may refer to computed values, in which case the attributes of the instance itself cannot be resolved until all of its arguments are defined. In this case, refreshing the data instance will be deferred until the "apply" phase, and all interpolations of the data instance attributes will show as "computed" in the plan since the values are not yet known.

This is doubly interesting to me. Based on the above docs, OP's config shouldn't be failing because data.aws_ecs_task_definition.my-service depends on aws_ecs_task_definition.my-service.family, but it's failing in the plan* phase (my problem as well). Perhaps this is a terraform-level bug and not a provider-level?

  • Edit: incorrectly said it failed in the apply phase instead of the plan phase.
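
To make the two cases in the quoted docs concrete, here is a sketch with hypothetical names. Which attributes Terraform treats as "computed" before the resource exists is an assumption here, and looking up by ARN is used only for illustration (it pins the lookup to the resource's own revision rather than the latest ACTIVE one).

```hcl
resource "aws_ecs_task_definition" "example" {
  family                = "example-family"               # hypothetical; a literal in the config
  container_definitions = "${file("containers.json")}" # hypothetical file
}

# The argument resolves to a value that is written in the configuration, so
# (assumption) the data source is read before the plan, when no task
# definition may exist yet.
data "aws_ecs_task_definition" "read_early" {
  task_definition = "${aws_ecs_task_definition.example.family}"
}

# The ARN is assigned by AWS when the resource is created, so it is a computed
# value and the read should be deferred until the apply phase.
data "aws_ecs_task_definition" "read_deferred" {
  task_definition = "${aws_ecs_task_definition.example.arn}"
}
```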

@nathanielks
Contributor

@radeksimko could we get your eyes on this? I don't want to spam the main repo if it's not a terraform issue.

@dpnova

dpnova commented Aug 29, 2017

I'm seeing this issue as well.

@dpnova

dpnova commented Aug 29, 2017

I actually don't need data and resource for the same thing in the same file. I commented out the data and now it seems to be working better.

@nathanielks
Contributor

Related: #632
Explanation: #13 (comment)

@parruda

parruda commented Sep 22, 2017

I was able to get around this issue by adding a "depends_on" to the data source:

resource "aws_ecs_task_definition" "task" {
...
}
data "aws_ecs_task_definition" "task" {
  depends_on = [ "aws_ecs_task_definition.task" ]
  ...
}

Hope it helps.

@KIVagant

KIVagant commented Oct 27, 2017

It's not really a bug; the solution from @parruda is correct. Both the aws_ecs_service resource and the aws_ecs_task_definition data source expect the related aws_ecs_task_definition resource to already exist.

@kazeshini178

@KIVagant that makes sense, as I was also experiencing the same issue.

That said, the Terraform docs that show the data source and the resource being used together should be updated to reflect this. As it stands, the docs imply that nothing should fail if the resource doesn't exist yet.

Otherwise @parruda's solution makes sense to me.

@kazeshini178

Yeah, I probably should have tried the fix before replying. It works, but it causes a change to be detected on every run, which is not the expected or desired result.

bflad added the service/ecs label Jan 28, 2018
@dendrochronology

@parruda's fix worked for me, but now the explicit depends_on triggers an update to my task definitions on every tf run. Is there a best practice to prevent that? I'm using Terraform v0.11.5 and provider.aws v1.10.0.

@KIVagant

KIVagant commented Apr 4, 2018

@dendrochronology, I use something like this:

data "aws_ecs_task_definition" "blabla" {
  task_definition = "${aws_ecs_task_definition.blabla.family}"
  depends_on = [ "aws_ecs_task_definition.blabla" ]
}


resource "aws_ecs_task_definition" "..." {
  family                = "..."
  task_role_arn         = "${aws_iam_role.blabla.arn}"

  container_definitions = "${data.template_file.task_definition.rendered}"

  depends_on = [
    "data.template_file.task_definition",
  ]

  lifecycle {
    ignore_changes = [
      "container_definitions" # if template file changed, do nothing, believe that human's changes are source of truth
    ]
  }
}


resource "aws_ecs_service" "blabla" {
  name            = "blabla"
  cluster         = "${aws_ecs_cluster.cluster_name.id}"
  task_definition = "${aws_ecs_task_definition.blabla.family}:${max("${aws_ecs_task_definition.blabla.revision}", "${data.aws_ecs_task_definition.blabla.revision}")}"
  desired_count   = 1
  iam_role        = "${aws_iam_role.ecs_service.name}"

// Not compatible with placement_constraints:distinctInstance, commented
//  placement_strategy {
//    type  = "binpack"
//    field = "cpu"
//  }

  placement_constraints {
    type  = "distinctInstance"
  }

  load_balancer {
    elb_name       = "${aws_elb.blabla.name}"
    container_name = "internal"
    container_port = "${var.blabla_port}"
  }

  depends_on = [
    "aws_iam_role.ecs_service",
    "aws_elb.blabla",
    "aws_iam_role.blabla",
    "aws_ecs_task_definition.blabla"
  ]

  lifecycle {
    ignore_changes = ["task_definition"] # the same here, do nothing if it was already installed
  }
}

@nathanielks
Contributor

@KIVagant ahhh, I'm going to play with the ignore_changes lifecycle hook!

@dendrochronology

Ah, nice, I'll play with that, too. Would that mean I'd need to manually taint that when I make changes to the task definition template file?

@KIVagant

KIVagant commented Apr 5, 2018

It depends on your goals. In our case the template contains empty placeholders for secrets, which are filled in after the first install by Terraform, and we don't want Terraform to change existing task definitions. We manage them manually after the first install.

@parruda

parruda commented May 2, 2018

@dendrochronology sorry for the lack of response. I actually never noticed the problem because we do want to update the task definition on every run. I hope you found a solution.

@jaysonsantos
Contributor

This still seems to be a problem. If you just use what is in the docs, you will get this:

Error: Error running plan: 1 error(s) occurred:

* module.frontshop_staging.data.aws_ecs_task_definition.frontshop: 1 error(s) occurred:

* module.frontshop_staging.data.aws_ecs_task_definition.frontshop: Resource 'aws_ecs_task_definition.frontshop' not found for variable 'aws_ecs_task_definition.frontshop.family'

The only differences are that this is inside a module and the name is frontshop. Could it be related to the module?
I also tried with depends_on and it doesn't work. I am thinking of first applying a version that only creates the resource, and then using the data source with max to get the latest revision.

@jaysonsantos
Contributor

Actually, what I said is wrong. It looks like the problem occurs when the JSON for the container definitions is invalid. Mine does not use the heredoc syntax but a JSON file rendered from a template; it should be an array of containers, and I had only a single top-level object.
Here is where I found out about it: #2026
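
As a sketch of the point above: `container_definitions` takes a JSON array of container objects, even when there is only one container. The family, container name, image, and memory values below are hypothetical.

```hcl
resource "aws_ecs_task_definition" "frontshop" {
  family = "frontshop" # hypothetical family name

  # The top-level JSON value is an array. A single bare object such as
  # {"name": "frontshop", ...} is the mistake described above.
  container_definitions = <<DEFINITION
[
  {
    "name": "frontshop",
    "image": "nginx:latest",
    "memory": 128,
    "essential": true
  }
]
DEFINITION
}
```

The same rule applies when the JSON comes from a template file: the rendered output has to start with `[` and end with `]`.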

@nanorc

nanorc commented Dec 6, 2018

Nice one, @jaysonsantos. In my case, the error was caused by a JSON syntax error.

@wjburn

wjburn commented Mar 4, 2019

With a provider upgrade to 1.59 and terraform 11.11, I am still seeing this error.

If terraform destroy completes with no errors, it works fine without a depends_on.

However, if terraform destroy fails on something else for instance:

 Error removing user, role, or group list from IAM Policy Detach bootstrap-iam-group-attach1:
– NoSuchEntity

This is unrelated to the ECS service, and it is something that running terraform destroy a second time would otherwise resolve. On the second pass, the

Failed getting task definition ClientException: Unable to describe task definition.

error resurfaces and the state file is corrupt.

duduribeiro pushed a commit to duduribeiro/terraform_ecs_fargate_example that referenced this issue Jun 20, 2019
@ldiaz2-chwy

This issue isn't very clear to me. It seems some folks claim that we should NOT use a depends_on in the data source for the task definition, but on the first run it always fails because the resource doesn't exist.

skorfmann pushed a commit to skorfmann/terraform-provider-aws that referenced this issue Sep 25, 2019
This is working around the issue of not having a task definition when the resources are initially rolled out.

Background:

The documentation example of directly referencing "task_family" doesn't work and exits with an error when initially applying it. See also this issue hashicorp#1274

The reason is that data sources don't handle missing data gracefully. Unfortunately, that's not going to be addressed, as stated here: hashicorp/terraform#16380 (comment)
One of the suggested workarounds is to add an explicit `depends_on`. However, this causes a potential change in the terraform plan output, even though nothing is actually going to change. Furthermore, it's discouraged by the Terraform documentation itself.

This thread mentions a few other workarounds, but none of them seem to be suitable hashicorp/terraform#16380

`aws_ecs_task_definition.self.revision` can only be referenced, once the resource is created (in contrast to family, which is already present in code)
Apparently, this allows Terraform to correctly resolve the dependencies and makes the data source behave as expected.
@bentolor

bentolor commented Oct 8, 2020

FYI for everybody else stumbling over the issue: @skorfmann illustrated in this MR #10247 a better workaround using aws_ecs_task_definition.self.revision and explains why the discussed depends_on approach is not what you want!

This is working around the issue of not having a task definition when the resources are initially rolled out. The documentation example of directly referencing "task_family" doesn't work and exits with an error when initially applying it. See also this issue #1274

The reason is that data sources don't handle missing data gracefully. Unfortunately, that's not going to be addressed, as stated here: hashicorp/terraform#16380 (comment). One of the suggested workarounds is to add an explicit depends_on. However, this causes a potential change in the terraform plan output, even though nothing is actually going to change. Furthermore, it's discouraged by the Terraform documentation itself.

This thread mentions a few other workarounds, but none of them seem to be suitable hashicorp/terraform#16380

aws_ecs_task_definition.self.revision can only be referenced, once the resource is created (in contrast to family, which is already present in code). Apparently, this allows Terraform to correctly resolve the dependencies and makes the data source behave as expected.

@alvarogmj

alvarogmj commented Dec 30, 2020

@bentolor Which version of Terraform is that solution valid for? At my company we are running on 0.12 and the suggested solution with the conditional on .revision causes an error, as Terraform complains about it not being a boolean value.

Since anyway both sides of the conditional end up referencing the same value, as a quick fix I used "revision >0" in the conditional just to force it to be a boolean.
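
For readers who want to see the shape of this workaround, here is a sketch in Terraform 0.12 syntax, reconstructed from the descriptions in this thread rather than copied from #10247. The variable, family, service name, and file name are hypothetical.

```hcl
variable "cluster_id" {} # hypothetical: ID of an existing ECS cluster

resource "aws_ecs_task_definition" "self" {
  family                = "example-family" # hypothetical
  container_definitions = file("${path.module}/containers.json") # hypothetical file
}

data "aws_ecs_task_definition" "self" {
  # Both branches return the same family. The condition exists only so that the
  # expression references `revision`, which is unknown until the resource has
  # been created; `> 0` turns it into the boolean that 0.12 requires.
  task_definition = aws_ecs_task_definition.self.revision > 0 ? aws_ecs_task_definition.self.family : aws_ecs_task_definition.self.family
}

resource "aws_ecs_service" "self" {
  name          = "example-service" # hypothetical
  cluster       = var.cluster_id
  desired_count = 1

  # Track the latest ACTIVE revision, as in the original report.
  task_definition = "${aws_ecs_task_definition.self.family}:${max(aws_ecs_task_definition.self.revision, data.aws_ecs_task_definition.self.revision)}"
}
```

Unlike an explicit depends_on, this creates the ordering through an ordinary attribute reference, which is the reason given in the commit message for why the data source read waits for the resource.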

@github-actions

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

github-actions bot added the stale label Dec 21, 2022
github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Jan 21, 2023
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Feb 21, 2023