Data-driven Terraform Configuration #4705
This represents a data source configuration.
This allows the config loader to read "data" blocks from the config and turn them into DataSource objects. This just reads the data from the config file. It doesn't validate the data nor do anything useful with it.
Hi @apparentlymart, this is looking good so far! I'm working on the Terraform / Vault integration which could really take advantage of Data Sources for several of its main data types. So I'm checking in on what your near-term plans are for this branch. If you don't expect to be picking it up soon, perhaps we can discuss me picking up where you left off? Let me know! 😀
@phinze, @jen20: I checked off all the items on my list, so this is now feature-complete according to my original plan. I have done some ad-hoc manual testing to exercise the various different combinations of computed/non-computed configs, dependent resources, dependent providers, etc. Unfortunately the one situation that still doesn't seem to work is the very case that this feature was intended to solve:

    data "null_data_source" "test" {
      inputs = {
        aws_region = "us-west-2"
      }
    }

    provider "aws" {
      region = "${data.null_data_source.test.outputs.aws_region}"
    }

    resource "aws_instance" "foo" {
      instance_type = "t1.foo"
      ami           = "ami-abc123"
    }

In the above configuration, the
This error seems to be occurring during the "input" walk; running with
I'm not sure this is really resolvable... the only viable path I can see would be to skip asking for inputs on a provider that has any computed configuration, but even that doesn't seem like it'd work since it is the interpolation itself that is failing. If you have any other ideas I'd love to hear them! 😀
This is where I'm working on the implementation of the proposal from #4169.
(This PR supersedes #4961, and has been rebased onto `dev-0.7` rather than `master` so it can build on the reworked plugin bits and type system changes.)

Since this change spans multiple Terraform layers, the sections that follow summarize the changes in each layer, in the hope of making this changeset easier to review. The PR is broken into a sequence of commits which, as far as possible, change only one layer at a time so that each change can be understood in isolation.
Configuration (`config` package)

In the `config` layer, data sources are introduced by expanding the existing `Resource` concept with a new field `Mode`, which represents which operations/lifecycle this resource follows:

- `ManagedResourceMode`: previously the only mode; Terraform creates and "owns" this resource, updating its configuration and eventually destroying it.
- `DataResourceMode`: Terraform only reads from this resource.

In the configuration language, `resource` blocks map to `ManagedResourceMode` resources and `data` blocks map to `DataResourceMode` resources. `data` blocks don't permit `provisioner` or `lifecycle` sub-blocks because these concepts do not make sense for a resource that only has a `Refresh` action. Internally, data resources always have an empty `Provisioners` slice and a zero-value `ResourceLifecycle` instance.

A similar extension has been made to `ResourceVariable`, which can now represent both the existing `TYPE.NAME.ATTR` variables and the new `data.TYPE.NAME.ATTR` variables, again using a `Mode` field as the discriminator.

Since traditional resources and data resources are both kinds of resources, they both appear in the `Resources` slice within the configuration struct. The `Resource.Id()` implementation keeps them distinct by adding a `data.` prefix to data resource ids, which is a convention that will continue through to the core layer.

- `ResourceMode` enumeration and `Mode` attribute on `config.Resource`
- `data` blocks from configuration files
- `data.TYPE.NAME.ATTR` variables and `Mode` attribute on `config.ResourceVariable`
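
The mode discriminator and the `data.` id prefix described in this section can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Resource` struct here is a pared-down, hypothetical stand-in for `config.Resource`, and only the names `ResourceMode`, `ManagedResourceMode`, `DataResourceMode`, `Mode` and `Id()` come from the description above.

```go
package main

import "fmt"

// ResourceMode distinguishes managed resources from data resources.
// The constant names follow the PR description; everything else is illustrative.
type ResourceMode int

const (
	ManagedResourceMode ResourceMode = iota
	DataResourceMode
)

// Resource is a hypothetical, pared-down stand-in for config.Resource.
type Resource struct {
	Mode ResourceMode
	Type string
	Name string
}

// Id returns TYPE.NAME for managed resources and data.TYPE.NAME for data
// resources, so both kinds can share one slice without their ids colliding.
func (r *Resource) Id() string {
	switch r.Mode {
	case DataResourceMode:
		return fmt.Sprintf("data.%s.%s", r.Type, r.Name)
	default:
		return fmt.Sprintf("%s.%s", r.Type, r.Name)
	}
}

func main() {
	resources := []*Resource{
		{Mode: ManagedResourceMode, Type: "aws_instance", Name: "foo"},
		{Mode: DataResourceMode, Type: "null_data_source", Name: "test"},
	}
	for _, r := range resources {
		fmt.Println(r.Id())
	}
}
```

Keeping both kinds in one slice with a mode field means existing code that walks all resources keeps working, while the prefixed id gives later layers an unambiguous key.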
Core changes
Within core is where we find the biggest divergence of codepaths for managed vs. data resources, since data resources have a simpler lifecycle.
The `ResourceProvider` interface has a new method `DataSources`, which is analogous to `Resources`. The Validate phase is consistent between the two, except that the provider abstraction distinguishes between `ValidateResource` and `ValidateDataSource`, both of which are supported by `EvalValidate` depending on mode.

The remainder of the workflow is completely distinct and handled by two different codepaths, switching on the resource mode inside `terraform/transform_resource.go`.

Even though ultimately data resources support only a "read" operation, the standard plan/apply model is supported by splitting a read into two steps in the `ResourceProvider` interface:

- `ReadDataDiff`: takes the config and returns a diff as if the data resource were being "created", allowing core to know about the data source's computed attributes without actually reading any data.
- `ReadDataApply`: takes the diff, uses it to obtain the configuration attributes, actually loads the data and returns a state.

The important special behavior for data resources is that during the "refresh" walk they will check to see if their config contains computed values, and if it doesn't then the diff/apply steps are run immediately, rather than waiting until the real plan and apply phases. This ensures that non-computed data source attributes can be safely used inside provider configurations, bypassing the chicken-and-egg problems that are caused by computed provider arguments.
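
The two-step read and the refresh-time shortcut might be pictured like this. It is only a sketch under stated assumptions: `Diff`, `State`, the `computed` sentinel, and the `query`/`result` attribute names are invented for illustration, and the lowercase functions merely mirror the roles of `ReadDataDiff` and `ReadDataApply` rather than reproducing their real signatures.

```go
package main

import "fmt"

// computed is a hypothetical sentinel for a config value that is not yet known.
const computed = "<computed>"

// Diff and State are simplified stand-ins for Terraform's real types.
type Diff struct{ Attrs map[string]string }
type State struct{ Attrs map[string]string }

// readDataDiff plays the role of ReadDataDiff: it plans the read as if the
// data resource were being created, without loading any data.
func readDataDiff(config map[string]string) *Diff {
	attrs := make(map[string]string, len(config)+1)
	for k, v := range config {
		attrs[k] = v
	}
	attrs["result"] = computed // only known once the read has happened
	return &Diff{Attrs: attrs}
}

// readDataApply plays the role of ReadDataApply: it recovers the config from
// the diff, loads the data, and returns a brand-new state.
func readDataApply(d *Diff) *State {
	attrs := make(map[string]string, len(d.Attrs))
	for k, v := range d.Attrs {
		attrs[k] = v
	}
	attrs["result"] = "loaded:" + d.Attrs["query"]
	return &State{Attrs: attrs}
}

// hasComputed reports whether any config value is still unknown.
func hasComputed(config map[string]string) bool {
	for _, v := range config {
		if v == computed {
			return true
		}
	}
	return false
}

// refresh runs diff and apply immediately when the config is fully known.
// It returns nil to signal that the read is deferred to plan/apply.
func refresh(config map[string]string) *State {
	if hasComputed(config) {
		return nil
	}
	return readDataApply(readDataDiff(config))
}

func main() {
	known := refresh(map[string]string{"query": "us-west-2"})
	fmt.Println(known.Attrs["result"])

	deferred := refresh(map[string]string{"query": computed})
	fmt.Println(deferred == nil)
}
```

Note that `refresh` never receives a previous state, which matches the intended model that every read produces an entirely new instance.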
A significant difference compared to managed resources is that a data source "read" does not get access to any previous state; we always create an entirely new instance on each refresh. The intended user-facing mental model for data resources is that they are not stateful at all, and we persist them in the on-disk state file only so that `-refresh=false` can act as expected without breaking the rest of the workflow.

- `ResourceProvider` interface changes
- `EvalValidate` calls appropriate provider validate method based on resource mode.
- `ResourceStateKey` understands how to deal with "orphan" data resources in the state.
- `graphNodeExpandedResource` branches in `EvalTree` to support the different lifecycle for data resources.
- `graphNodeOrphanResource` branches in `EvalTree` to support the different lifecycle for data resources.
- `terraform destroy` (or applying a `plan -destroy`)

`helper/schema` support for data sources

In the `helper/schema` layer, the new map of supported data sources is kept separate from the existing map of supported resources. Data sources use the familiar `schema.Resource` type but with only a `Read` implementation required and `Create`, `Update`, and `Delete` functions forbidden.

The `Read` implementation works in essentially the same way as it does for managed resources, getting access to its configuration attributes via `d.Get(...)` and setting computed attributes with `d.Set(...)`. The only notable differences are that `d.Get(...)` won't return values of computed attributes set on previous runs, and calling `d.SetId(...)` is optional.

To help us migrate existing "logical resources" to instead be data sources, a helper is provided to wrap a data source implementation and shim it to work as a resource implementation. In this case, the `Read` implementation must call `d.SetId(...)` in order to meet the expectations of a managed resource implementation.

- `DataSourcesMap` within `helper.Provider`
- `DataSources`, `ValidateData`, `ReadDataDiff` and `ReadDataApply`
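
To make the `helper/schema` rules concrete, here is a self-contained sketch of the "Read only" constraint and of a resource shim. The types below are deliberately tiny stand-ins and are not the real `helper/schema` API; `validateDataSource` and `dataSourceAsResource` are hypothetical names chosen for this example, and the shim's behavior is an assumption about how such a wrapper could work.

```go
package main

import (
	"errors"
	"fmt"
)

// ResourceData is a minimal stand-in for the "d" passed to Read.
type ResourceData struct {
	id    string
	attrs map[string]string
}

func (d *ResourceData) Get(k string) string { return d.attrs[k] }
func (d *ResourceData) Set(k, v string)     { d.attrs[k] = v }
func (d *ResourceData) SetId(id string)     { d.id = id }
func (d *ResourceData) Id() string          { return d.id }

// Func is the shape shared by all lifecycle callbacks in this sketch.
type Func func(d *ResourceData) error

// Resource is a hypothetical stand-in for schema.Resource.
type Resource struct {
	Create, Read, Update, Delete Func
}

// validateDataSource enforces the rule from the text: Read is required,
// while Create, Update and Delete are forbidden.
func validateDataSource(r *Resource) error {
	if r.Read == nil {
		return errors.New("data source must implement Read")
	}
	if r.Create != nil || r.Update != nil || r.Delete != nil {
		return errors.New("data source must not implement Create, Update or Delete")
	}
	return nil
}

// dataSourceAsResource wraps a data source so it can be registered as a
// managed resource: creating it simply reads, deleting it only clears the id.
func dataSourceAsResource(ds *Resource) *Resource {
	return &Resource{
		Create: ds.Read,
		Read:   ds.Read,
		Delete: func(d *ResourceData) error {
			d.SetId("")
			return nil
		},
	}
}

func main() {
	ds := &Resource{Read: func(d *ResourceData) error {
		d.Set("greeting", "hello "+d.Get("name"))
		d.SetId("static") // optional for a data source, needed once shimmed
		return nil
	}}
	if err := validateDataSource(ds); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	d := &ResourceData{attrs: map[string]string{"name": "world"}}
	if err := dataSourceAsResource(ds).Create(d); err != nil {
		fmt.Println("create failed:", err)
		return
	}
	fmt.Println(d.Id(), d.Get("greeting"))
}
```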
`provider/terraform`: example remote state data source

As an example to show things working end-to-end, the `terraform_remote_state` resource is transformed into a data source, and the backward-compatibility shim is used to maintain the now-deprecated resource.

- `terraform_remote_state` data source

Targeting Data Resources

`ResourceAddress` is extended with a `ResourceMode` to handle the distinct managed and data resource namespaces. `data.TYPE.NAME` can be used to target data resources, for consistency with how data resources are referenced elsewhere.

- `ResourceAddress` support for `data.TYPE.NAME` syntax and `ResourceMode`
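
A rough idea of how an address parser could recognize the `data.` prefix is sketched below. It is an assumption-laden simplification: the `parseAddress` function and this trimmed-down `ResourceAddress` struct are hypothetical, module paths and indexes are ignored, and the resource name `vpc` is made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// ResourceMode mirrors the enumeration named in the PR description.
type ResourceMode int

const (
	ManagedResourceMode ResourceMode = iota
	DataResourceMode
)

// ResourceAddress is a simplified, hypothetical stand-in for the real type.
type ResourceAddress struct {
	Mode ResourceMode
	Type string
	Name string
}

// parseAddress accepts TYPE.NAME or data.TYPE.NAME and nothing else.
func parseAddress(s string) (*ResourceAddress, error) {
	parts := strings.Split(s, ".")
	mode := ManagedResourceMode
	if len(parts) == 3 && parts[0] == "data" {
		mode = DataResourceMode
		parts = parts[1:]
	}
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return nil, fmt.Errorf("invalid resource address %q", s)
	}
	return &ResourceAddress{Mode: mode, Type: parts[0], Name: parts[1]}, nil
}

func main() {
	for _, s := range []string{"aws_instance.foo", "data.terraform_remote_state.vpc"} {
		addr, err := parseAddress(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(addr.Mode == DataResourceMode, addr.Type, addr.Name)
	}
}
```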
UI Changes
When data resource reads appear in plan output, we show them using a distinct presentation to make it clear that no real infrastructure will be altered by this operation:
Since a data resource read is internally just a "create" diff for the resource, this is just some sleight of hand in the UI layer to present it differently.
A "read" diff will appear only if the read operation cannot be completed during the "refresh" phase due to computed configuration.
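
The "sleight of hand" could amount to something like the following. This is purely illustrative: `ChangeType`, `planLine`, and the textual markers are placeholders invented for this sketch and are not necessarily what Terraform prints or how its UI code is organized.

```go
package main

import "fmt"

// ResourceMode mirrors the enumeration named in the PR description.
type ResourceMode int

const (
	ManagedResourceMode ResourceMode = iota
	DataResourceMode
)

// ChangeType is a hypothetical classification of a single diff entry.
type ChangeType int

const (
	Create ChangeType = iota
	Update
	Destroy
)

// planLine renders one plan entry. A "create" diff on a data resource is
// presented as a read; the markers are placeholders chosen for this sketch.
func planLine(mode ResourceMode, change ChangeType, id string) string {
	if mode == DataResourceMode && change == Create {
		return "(read) " + id
	}
	switch change {
	case Create:
		return "(create) " + id
	case Update:
		return "(update) " + id
	default:
		return "(destroy) " + id
	}
}

func main() {
	fmt.Println(planLine(DataResourceMode, Create, "data.null_data_source.test"))
	fmt.Println(planLine(ManagedResourceMode, Create, "aws_instance.foo"))
}
```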
Other stuff
- `terraform taint` ("tainting" is not meaningful for data resources because they are not created/destroyed.)