Provider is initialised with stale data source attribute #16559

Closed
camjackson opened this issue Nov 4, 2017 · 3 comments


camjackson commented Nov 4, 2017

What I'm trying to do

In my project I have a module that creates the base infrastructure (instance, networking, etc.), and then a separate one that deploys Docker images/containers onto the instance. Something like:

  • main.tf: wires the two modules together
      module "base" {
        source = "./base"
      }
      module "apps" {
        source          = "./apps"
        aws_instance_id = "${module.base.aws_instance_id}"
      }
  • base/main.tf: creates the instance and exposes its ID
      resource "aws_instance" "instance" {
        # ...
      }
      output "aws_instance_id" {
        value = "${aws_instance.instance.id}"
      }
  • apps/main.tf: takes in the ID, looks up the instance, and gives its IP to the Docker provider
      variable "aws_instance_id" {
        type = "string"
      }
      data "aws_instance" "instance" {
        instance_id = "${var.aws_instance_id}"
      }
      provider "docker" {
        host = "tcp://${data.aws_instance.instance.public_ip}:2376"
      }
      # Docker resources: image, container, etc.

The reason I pass the ID around, rather than the IP directly, is that the apps module needs several attributes of the instance. Rather than pass them all in one by one, I can save a lot of code by passing just the ID and deriving everything else from the data source.
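
For illustration, the kind of duplication this avoids looks roughly like the following inside apps/main.tf (a sketch only; the locals block and the particular attributes are illustrative, not part of my real config):

  # apps/main.tf: one input (the instance ID), everything else derived
  # from the single data source (attribute choices are just examples)
  locals {
    instance_public_ip  = "${data.aws_instance.instance.public_ip}"
    instance_private_ip = "${data.aws_instance.instance.private_ip}"
    instance_az         = "${data.aws_instance.instance.availability_zone}"
  }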

The problem

The instance has been recreated, so its ID and IP have changed, but the Docker provider is still using the old IP address.

When I do terraform apply --target module.base and get it to output the instance ID at the top level, I can see the new, fresh ID. But when I run just terraform plan or terraform apply, as soon as it tries to initialise the Docker provider, I get an error message like:

Error refreshing state: 1 error(s) occurred:
* module.apps.provider.docker: Error pinging Docker server: Get https://<OLD IP HERE>:2376/_ping: dial tcp <OLD IP HERE>:2376: i/o timeout

I have verified that the IP shown in the console is the IP of the previous, terminated instance.
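
For reference, the top-level pass-through output I used for that check is roughly this (a sketch; the output name is just illustrative):

  # main.tf (root module), alongside the module blocks
  output "aws_instance_id" {
    value = "${module.base.aws_instance_id}"
  }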

I'm guessing that this is a similar chicken-and-egg problem to #2430. I was already aware that all providers have to be initialised before any resources can be created, which is why I implemented this particular module structure in the first place. What's new to me here is that providers also have to be initialised before any data sources can be refreshed, which would seem to make it a very bad idea for providers to depend on data sources.

Terraform Version

Terraform v0.10.8

Debug Output

The debug output and state file both contain sensitive data, so I won't post them online. However I can see that both the old and new IDs and IPs are mentioned there. It does look like a DescribeInstances action is done on the new ID, and the new IP is returned. It's just not making its way through to the Docker provider.

Workaround

I think I have two options here: one easy, one robust. The easiest path forward is to hard-code the new IP address into the Docker provider for a single terraform apply. That gets me out of the rut I'm in and I can move on, but the problem will come back whenever the instance changes again. A more robust solution would be to pass the IP address in directly, rather than looking it up from the data source, which is annoying but perhaps necessary in this case.
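
The robust option would look roughly like this (a sketch, assuming the base module also exposes the IP; the variable and output names are illustrative):

  # base/main.tf: expose the IP as well as the ID
  output "aws_instance_public_ip" {
    value = "${aws_instance.instance.public_ip}"
  }

  # main.tf: pass it straight through to the apps module
  module "apps" {
    source                 = "./apps"
    aws_instance_id        = "${module.base.aws_instance_id}"
    aws_instance_public_ip = "${module.base.aws_instance_public_ip}"
  }

  # apps/main.tf: the provider no longer depends on the data source
  variable "aws_instance_public_ip" {
    type = "string"
  }

  provider "docker" {
    host = "tcp://${var.aws_instance_public_ip}:2376"
  }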

camjackson added a commit to rabblerouser/infra that referenced this issue Nov 4, 2017

@apparentlymart (Contributor)

Hi @camjackson! Sorry for this confusing behavior.

As some context for what's going on here, a normal terraform apply run has a few different phases:

  • Validate
  • Refresh
  • Plan
  • Apply

Data sources are dealt with during the "refresh" step where possible (that is, when their configuration doesn't depend on something that is <computed>), and this is also the step where Terraform reads out information about any existing resources in order to detect drift.

The intended behavior is that the usual graph machinery in Terraform will ensure that the data source is refreshed before the provider is instantiated, since the provider refers to a result from the data source. Therefore the behavior you've seen here seems like a bug. It may be the case that the docker provider is being optimistically configured during the "validate" phase, since its configuration seems already available, and then it's being re-used for the refresh.

To determine if that theory holds water it'd be necessary to see the sequence of operations in the debug log, but I understand that you don't want to share it in its entirety. If you'd be willing, it'd be useful to see the log lines of the following type, in their relative order within your log:

  • Logs containing [INFO] terraform: building graph:, which indicate the start of each of the phases (you will probably also see the "input" phase, which does nothing in most cases, but it'd be useful to see how that fits in too)
  • Logs containing Entering eval tree:, which indicate when a particular graph node is being visited
  • Logs containing Exiting eval tree:, which indicate the end of a graph node being visited (these may be interleaved due to the parallel graph walk)
  • Logs containing plugin: starting plugin:, which indicate that a plugin process is being launched (this is, however, separate from configuring the provider)
  • Logs containing eval: *terraform.EvalInitProvider, which indicate that a plugin is being initialized (unfortunately, which one is inferable only from context, by looking at the most recent Entering eval tree)
  • Logs containing eval: *terraform.EvalInterpolateProvider, which indicate that the configuration for a provider is being evaluated
  • Logs containing eval: *terraform.EvalConfigProvider, which indicate the configuration actually being passed to a provider
  • Logs containing eval: *terraform.EvalReadDataDiff, which indicate preparation to read a data resource
  • Logs containing root: eval: *terraform.EvalReadDataApply, which indicate actually reading a data source previously prepared

(Some of these are [TRACE] level, so TF_LOG=trace will be necessary to see all of them.)

I know that's a lot 😖. I put together this grep invocation to catch the above, and it seemed to do what I expected for a log I made while trying to repro:

grep -P '(terraform: building graph|Entering eval tree|Exiting eval tree|plugin: starting plugin:|terraform.EvalInitProvider|terraform.EvalInterpolateProvider|terraform.EvalConfigProvider|terraform.EvalReadData)'

This result will include the names of modules and resources as part of their resource addresses, but should otherwise just be Terraform-specific details that are hopefully not concerning to share.

@apparentlymart (Contributor)

Hi @camjackson,

In the meantime I did some design work around a potential solution to this over in #17034. There is still some more work to do there to finalize that design, but rather than having two issues that cover the same problem, I'm going to close this one out just to consolidate. We're not working on #17034 directly at this time due to attention being elsewhere, but we'll post updates there when we have them.

Thanks for reporting this, and sorry for the long silence.


ghost commented Mar 31, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

ghost locked and limited conversation to collaborators Mar 31, 2020