Provider is initialised with stale data source attribute #16559

Closed
camjackson opened this issue Nov 4, 2017 · 3 comments


camjackson commented Nov 4, 2017

What I'm trying to do

In my project I have a module that creates the base infrastructure (instance, networking, etc.), and then a separate one that deploys Docker images/containers onto the instance. Something like:

  • main.tf: wires the two modules together
      module "base" {
        source = "./base"
      }
      module "apps" {
        source          = "./apps"
        aws_instance_id = "${module.base.aws_instance_id}"
      }
  • base/main.tf: creates the instance and exposes its ID
      resource "aws_instance" "instance" {
        # ...
      }
      output "aws_instance_id" {
        value = "${aws_instance.instance.id}"
      }
  • apps/main.tf: takes in the ID, looks up the instance, and gives its IP to the Docker provider
      variable "aws_instance_id" {
        type = "string"
      }
      data "aws_instance" "instance" {
        instance_id = "${var.aws_instance_id}"
      }
      provider "docker" {
        host = "tcp://${data.aws_instance.instance.public_ip}:2376"
      }
      # Docker resources: image, container, etc.

The reason I pass the ID around, rather than the IP directly, is that the apps module needs several attributes of the instance. Rather than pass them all in one by one, I can save a lot of code by passing just the ID and deriving everything else from the data source.
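
For illustration, the kind of duplication this avoids looks roughly like the following inside apps/main.tf (a sketch only; the locals block and the particular attributes are illustrative, not part of my real config):

  # apps/main.tf: one input (the instance ID), everything else derived
  # from the single data source (attribute choices are just examples)
  locals {
    instance_public_ip  = "${data.aws_instance.instance.public_ip}"
    instance_private_ip = "${data.aws_instance.instance.private_ip}"
    instance_az         = "${data.aws_instance.instance.availability_zone}"
  }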

The problem

The instance has been recreated, so its ID and IP have changed, but the Docker provider is still using the old IP address.

When I do terraform apply --target module.base and get it to output the instance ID at the top level, I can see the new, fresh ID. But when I run just terraform plan or terraform apply, as soon as it tries to initialise the Docker provider, I get an error message like:

Error refreshing state: 1 error(s) occurred:
* module.apps.provider.docker: Error pinging Docker server: Get https://<OLD IP HERE>:2376/_ping: dial tcp <OLD IP HERE>:2376: i/o timeout

I have verified that the IP shown in the console is the IP of the previous, terminated instance.
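
For reference, the top-level pass-through output I used for that check is roughly this (a sketch; the output name is just illustrative):

  # main.tf (root module), alongside the module blocks
  output "aws_instance_id" {
    value = "${module.base.aws_instance_id}"
  }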

I'm guessing that this is a similar chicken-and-egg problem to #2430. I was already aware that all providers have to be initialised before any resources can be created, which is why I implemented this particular module structure in the first place. What's new to me here is that providers also have to be initialised before any data sources can be refreshed, which would seem to make it a very bad idea for providers to depend on data sources.

Terraform Version

Terraform v0.10.8

Debug Output

The debug output and state file both contain sensitive data, so I won't post them online. However I can see that both the old and new IDs and IPs are mentioned there. It does look like a DescribeInstances action is done on the new ID, and the new IP is returned. It's just not making its way through to the Docker provider.

Workaround

I think I have two options here: one easy, one robust. The easiest path forward is to hard-code the new IP address into the Docker provider for a single terraform apply. That gets me out of the rut I'm in and I can move on, but the problem will come back whenever the instance changes again. A more robust solution would be to pass the IP address in directly, rather than looking it up from the data source, which is annoying but perhaps necessary in this case.
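
The robust option would look roughly like this (a sketch, assuming the base module also exposes the IP; the variable and output names are illustrative):

  # base/main.tf: expose the IP as well as the ID
  output "aws_instance_public_ip" {
    value = "${aws_instance.instance.public_ip}"
  }

  # main.tf: pass it straight through to the apps module
  module "apps" {
    source                 = "./apps"
    aws_instance_id        = "${module.base.aws_instance_id}"
    aws_instance_public_ip = "${module.base.aws_instance_public_ip}"
  }

  # apps/main.tf: the provider no longer depends on the data source
  variable "aws_instance_public_ip" {
    type = "string"
  }

  provider "docker" {
    host = "tcp://${var.aws_instance_public_ip}:2376"
  }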

camjackson added a commit to rabblerouser/infra that referenced this issue Nov 4, 2017

@apparentlymart (Contributor)

Hi @camjackson! Sorry for this confusing behavior.

As some context for what's going on here, a normal terraform apply run has a few different phases:

  • Validate
  • Refresh
  • Plan
  • Apply

Data sources are dealt with during the "refresh" step where possible (that is, when their configuration doesn't depend on something that is <computed>), and this is also the step where Terraform reads out information about any existing resources in order to detect drift.

The intended behavior is that the usual graph machinery in Terraform will ensure that the data source is refreshed before the provider is instantiated, since the provider refers to a result from the data source. Therefore the behavior you've seen here seems like a bug. It may be the case that the docker provider is being optimistically configured during the "validate" phase, since its configuration seems already available, and then it's being re-used for the refresh.

To determine if that theory holds water it'd be necessary to see the sequence of operations in the debug log, but I understand that you don't want to share it in its entirety. If you'd be willing, it'd be useful to see the log lines of the following type, in their relative order within your log:

  • Logs containing [INFO] terraform: building graph:, which indicate the start of each of the phases (you will probably also see the "input" phase, which does nothing in most cases, but it'd be useful to see how that fits in too)
  • Logs containing Entering eval tree:, which indicate when a particular graph node is being visited
  • Logs containing Exiting eval tree:, which indicate the end of a graph node being visited (these may be interleaved due to the parallel graph walk)
  • Logs containing plugin: starting plugin:, which indicate that a plugin process is being launched (this is, however, separate from configuring the provider)
  • Logs containing eval: *terraform.EvalInitProvider, which indicate that a plugin is being initialized (unfortunately, which one is inferable only from context, by looking at the most recent Entering eval tree)
  • Logs containing eval: *terraform.EvalInterpolateProvider, which indicate that the configuration for a provider is being evaluated
  • Logs containing eval: *terraform.EvalConfigProvider, which indicate the configuration actually being passed to a provider
  • Logs containing eval: *terraform.EvalReadDataDiff, which indicate preparation to read a data resource
  • Logs containing root: eval: *terraform.EvalReadDataApply, which indicate actually reading a data source previously prepared

(Some of these are [TRACE] level, so TF_LOG=trace will be necessary to see all of them.)

I know that's a lot 😖. I put together this grep invocation to catch the above, and it seemed to do what I expected for a log I made while trying to repro:

grep -P '(terraform: building graph|Entering eval tree|Exiting eval tree|plugin: starting plugin:|terraform.EvalInitProvider|terraform.EvalInterpolateProvider|terraform.EvalConfigProvider|terraform.EvalReadData)'

This result will include the names of modules and resources as part of their resource addresses, but should otherwise just be Terraform-specific details that are hopefully not concerning to share.

@apparentlymart (Contributor)

Hi @camjackson,

In the meantime I did some design work around a potential solution to this over in #17034. There is still some more work to do there to finalize that design, but rather than having two issues that cover the same problem, I'm going to close this one out just to consolidate. We're not working on #17034 directly at this time due to attention being elsewhere, but we'll post updates there when we have them.

Thanks for reporting this, and sorry for the long silence.


ghost commented Mar 31, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

ghost locked and limited conversation to collaborators Mar 31, 2020