Allow "count" for non-null check on resource attributes that are always present #26755

Closed
sgrimm opened this issue Oct 29, 2020 · 32 comments
Labels
enhancement unknown-values Issues related to Terraform's treatment of unknown values

@sgrimm

sgrimm commented Oct 29, 2020

Current Terraform Version

Terraform v0.13.4
+ provider registry.terraform.io/hashicorp/aws v3.10.0
+ provider registry.terraform.io/hashicorp/local v1.4.0
+ provider registry.terraform.io/hashicorp/template v2.2.0

Use-cases

I want a module to conditionally create a particular AWS Route53 entry if a zone ID is passed in. I'm using count and checking whether the variable is null. I'd like Terraform to be able to create everything, including the zone, in one plan/apply so a CI system can do the actual plan application.

variable "public_zone_id" {
  type    = string
  default = null
}

resource "aws_route53_record" "public_cname" {
  count = var.public_zone_id != null ? 1 : 0

  zone_id = var.public_zone_id
  name    = var.hostname
  type    = "CNAME"
  ttl     = 1800
  records = [aws_instance.this.public_dns]
}

And then in the calling project, I pass in the zone ID, like so:

module "foo" {
  source = "my_module"

  hostname       = "example"
  public_zone_id = module.vpc.public_zone_id
}

But because the count depends on a value from a resource that isn't created yet, terraform plan gives me

Error: Invalid count argument

  on ../modules/ec2/main.tf line 40, in resource "aws_route53_record" "public_cname":
  40:   count = var.public_zone_id != null ? 1 : 0

The "count" value depends on resource attributes that cannot be determined
until apply, so Terraform cannot predict how many instances will be created.
To work around this, use the -target argument to first apply only the
resources that the count depends on.

Attempted Solutions

In the calling project, I have tried various tricks to make it impossible for the variable to ever be null regardless of whether or not the original resource exists, all with the same result as above.

public_zone_id = coalesce(module.vpc.public_zone_id, "dummy")

public_zone_id = trimsuffix("${module.vpc.public_zone_id}!", "!")

public_zone_id = module.vpc.public_zone_id != null ? module.vpc.public_zone_id : "dummy"

Proposal

Although the specific value of the zone ID is unknown here, it seems like Terraform ought to be able to tell that there will definitely be some value that will definitely not be null, and thus that the count should be 1, even if the actual value can't be filled in until later. There are a lot of attributes on a lot of resources that are required to be present if the resource creation succeeds, and since Terraform defines null as the absence of a value, checking for != null should only require the presence of a value.

So the proposal is that in the specific case of count = some_resource.some_attribute != null ? X : Y where the resource will be created in the same Terraform invocation, Terraform evaluate the expression based on whether or not the attribute is guaranteed to be present, rather than based on the actual attribute value. Obviously the "guaranteed" part is important; for optional attributes the current behavior would be unavoidable and expected. But for attributes that are always exported and can never possibly be null, it should be possible to evaluate that expression statically at plan time.

References

@sgrimm sgrimm added enhancement new new issue not yet triaged labels Oct 29, 2020
@apparentlymart
Contributor

Hi @sgrimm! Thanks for sending in this enhancement request.

I think the root problem here is that the Terraform schema model doesn't have any explicit sense of an attribute that is guaranteed to be non-null, and so Terraform conservatively assumes that any attribute could potentially end up being null. To fix this would require giving providers some way to either declare that a particular attribute is never null or to say dynamically in the plan "unknown value but definitely not null". The first seems significantly easier to implement than the second, but neither is trivial because they both involve changes to the provider wire protocol, and thus we must coordinate with the provider SDK team and the provider teams to make the change.

In the meantime, I think you could make this work by having the calling module handle the decision about whether it's null. You didn't show how module.vpc decides whether or not to return a zone id, but let's say for example that it's an enable_public_dns_zone variable on your root module, something like this:

variable "enable_public_dns_zone" {
  type = bool
}

module "vpc" {
  source = "./modules/vpc"

  enabled = var.enable_public_dns_zone
}

module "foo" {
  source = "./modules/foo"

  hostname       = "example"
  public_zone_id = var.enable_public_dns_zone ? module.vpc.public_zone_id : null
}

The key here is that Terraform does know the value for var.enable_public_dns_zone, so we can rely on that to take the module.vpc.public_zone_id value out of the decision entirely when var.enable_public_dns_zone is false, and thus it will be known in the case where it's null.

Another potential answer is to design your modules to deal in lists rather than single values, and then you can use an empty list to represent it not being active and a single-element list to represent it being active. This is consistent with Terraform's usual model of disabling things by having zero of them, and is advantageous because it separates the information about whether it's enabled (the length of the list) from the final value (the value of the first element of the list, which might be unknown). Terraform's language is primarily oriented around repetition of objects based on lists, and so modules designed around that principle tend to gel better with other language features.
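
A minimal sketch of the list-based design described above. The variable name public_zone_ids and the module paths are hypothetical, chosen only for illustration; the point is that a list written with [...] has a known length at plan time even when its element is still unknown.

```hcl
# Child module (hypothetical path ./modules/foo):
# accept zero or one zone IDs as a list instead of a nullable string.
variable "public_zone_ids" {
  type    = list(string)
  default = []
}

variable "hostname" {
  type = string
}

resource "aws_route53_record" "public_cname" {
  # The list length is known during planning even if the element is not.
  count = length(var.public_zone_ids)

  zone_id = var.public_zone_ids[count.index]
  name    = var.hostname
  type    = "CNAME"
  ttl     = 1800
  # aws_instance.this is assumed to be declared elsewhere in the module,
  # as in the original report.
  records = [aws_instance.this.public_dns]
}

# Calling module: wrap the (possibly unknown) ID in a one-element list.
module "foo" {
  source = "./modules/foo"

  hostname        = "example"
  public_zone_ids = [module.vpc.public_zone_id]
}
```

To disable the record, the caller passes an empty list (or omits the argument) rather than null.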

@pkolyvas pkolyvas removed the new new issue not yet triaged label Oct 30, 2020
@lievertz

It appears that count depending directly on a variable is fine, but if count depends on a variable through a function such as templatefile, it hits this same issue. Perhaps I am wrong, but I think templatefile will always give a non-null result (or raise an error)? I wanted to bring up this situation because there is no provider involved.

The specific situation where I hit this is a module that creates an IAM policy if a policy document is provided. I could use both a var.json_iam_doc and a var.create_policy, but the boolean is implied by the presence of the doc, so the interface is (arguably) better without the explicit var.create_policy. Instead I just have count = var.json_iam_doc == null ? 0 : 1. This has been working, but I recently called this module from another module which provides the json_iam_doc variable as the result of a templatefile call which itself accepts a string var -- that's where I hit this issue.

I confirmed this behavior on terraform v0.14.3 and v0.14.4.

To summarize -- it seems like the enhancement here also applies to function results which could arguably be known to be non-null. Since provider dependency was raised as an implementation barrier, it seemed relevant to raise that there are a number of relevant conditions that do not rely on the provider, but are functions within terraform itself.

I hope this is useful input -- thanks for all of your work!
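
For readers trying to picture the situation described above, a rough sketch might look like the following. The module path, the template file name, and the aws_s3_bucket.example reference are all hypothetical stand-ins; only the count expression and the use of templatefile come from the comment.

```hcl
# Child module (hypothetical path ./modules/policy).
variable "json_iam_doc" {
  type    = string
  default = null
}

resource "aws_iam_policy" "this" {
  # Creation is implied by the presence of the document.
  count = var.json_iam_doc == null ? 0 : 1

  name   = "example"
  policy = var.json_iam_doc
}

# Calling module: the document is rendered from a template whose input
# is not known until apply, so the rendered result is unknown during
# planning, and with it the count expression above.
module "policy" {
  source = "./modules/policy"

  json_iam_doc = templatefile("${path.module}/policy.json.tftpl", {
    bucket_arn = aws_s3_bucket.example.arn # hypothetical unknown input
  })
}
```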

@Cajga

Cajga commented Mar 9, 2021

I ran into the same problem. I use Terraform 0.14.7. The simplest workaround that I found for "using a freshly created resource ID in count" is the following. (@apparentlymart may I ask you to verify it, just to make sure :))

Put this into the foo module:

locals {
  # note: putting [*] behind a single value will create a list (if it is null then it will be an empty list)
  some_id_list = var.some_id[*]
}

resource "something_which_depends_on_the_fresh_input_id" "this" {
  count = length(local.some_id_list)
  # ...
}

variable "some_id" {
  type        = string
  default     = null
}

Then you call the module with the fresh id:

resource "aws_eip" "test" {
}

module "foo" {
  source = "../../"

  some_id = aws_eip.test.id
}

This will run with a single terraform apply.

@Nuru

Nuru commented Nov 18, 2021

@Cajga I tried your solution, and it generally appears to work, but it is dangerous. When var.unknown is a string that is unknown at plan time, such as aws_vpc.vpc.id:

  count = var.unknown == null ? 0 : 1        # fails with "count" value depends on resource attributes that cannot be determined until apply
  count = length(var.unknown[*]) > 0 ? 1 : 0 # succeeds

However, it appears to me this works due to a bug or bad trade-off in Terraform. It looks to me like when var.unknown is unknown, length(var.unknown[*]) always evaluates to 1. This causes weird behavior when var.unknown ultimately evaluates to null, such as requiring terraform apply to be run twice in order to apply all the changes. See #29973 for more details.

@Cajga

Cajga commented Nov 18, 2021

@Nuru, Thanks for checking it and opening a related bug report. I must say that I stopped using this as I also had the feeling that this behavior is not correct/intended. I am curious about the outcome for this.

@Cajga

Cajga commented Dec 15, 2021

Just to have it here as well: terraform 1.1.0 update breaks my "recommended" solution: #29973 (comment)

@RyanS-C

RyanS-C commented Mar 3, 2022

I have this same issue on:

PS > terraform.exe --version
Terraform v1.1.5
on windows_amd64
+ provider registry.terraform.io/betr-io/mssql v0.2.4
+ provider registry.terraform.io/cloudflare/cloudflare v3.5.0
+ provider registry.terraform.io/hashicorp/azurerm v2.71.0
+ provider registry.terraform.io/hashicorp/http v2.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0

Your version of Terraform is out of date! The latest version
is 1.1.6. You can update by downloading from https://www.terraform.io/downloads.html

I spent ages trying to resolve this and I can't. Having this functionality would be a huge help!

@apparentlymart apparentlymart added the unknown-values Issues related to Terraform's treatment of unknown values label Feb 7, 2023
@johnjelinek

johnjelinek commented Feb 10, 2023

@apparentlymart regarding your example, how would you then make sure that resources don't get created when the string values of your object are empty?

I think the intent of the OP is not just to bypass the error, but also to get the desired result of making sure resources only get provisioned when all the properties of the object are not "". If you changed this line:

count = var.dns_record != null ? 1 : 0
to
count = var.dns_record != null && var.dns_record.hostname != "" ? 1 : 0

you would get the error in the OP, yes?

@apparentlymart
Contributor

The point of my suggestion is that only the nullness of the containing object is used to make the decision, and then the containing object is always known to be either null or not.

I expect that if the object isn't null then the attributes inside must always have valid values and are not used as part of the decision.

@johnjelinek

That is fine for non-null, but how should decisions be made based on content?

@apparentlymart
Contributor

If the content is something that will be decided only during the apply phase then the answer is that you just shouldn't do that. The point of this alternative approach is to avoid the need to make decisions based on values that won't be known until the apply step, by wrapping them in a value that will be known during the plan step.

@haytham-salhi

haytham-salhi commented Apr 1, 2023

Hi @apparentlymart and guys, is there a way to check if a variable is unknown but not null to avoid the error: The "count" value depends on resource attributes that cannot be determined until apply,...?

@theherk
Contributor

theherk commented Apr 5, 2023

Yes @haytham-salhi. @apparentlymart shows an example in this comment above. Use a complex type and count on it being null or not. Then you can access the resource attribute contained within, without the count depending on its value.

@haytham-salhi

Thanks @theherk. Yes, I noticed that solution. The issue with making a wrapper object is that you need to modify the variable definition, which goes against the open-closed principle (OCP). Moreover, I do believe it would be a good idea to provide a built-in function to check whether a variable is unknown but not null at plan time. Another workaround (the one I went with) is to introduce a new boolean flag indicating that the variable is set, and then use that flag in the count argument.
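
A minimal sketch of the boolean-flag workaround mentioned above, applied to the Route53 example from the original report. The flag name create_public_record is hypothetical; the only requirement is that the caller sets it from something known at plan time.

```hcl
# Child module: the existing variable keeps its definition unchanged.
variable "public_zone_id" {
  type    = string
  default = null
}

variable "hostname" {
  type = string
}

# Hypothetical flag: tells the module that public_zone_id is set,
# without the module having to inspect the (possibly unknown) value.
variable "create_public_record" {
  type    = bool
  default = false
}

resource "aws_route53_record" "public_cname" {
  # Depends only on the flag, which is known during planning.
  count = var.create_public_record ? 1 : 0

  zone_id = var.public_zone_id
  name    = var.hostname
  type    = "CNAME"
  ttl     = 1800
  # aws_instance.this is assumed to be declared elsewhere in the module.
  records = [aws_instance.this.public_dns]
}

# Calling module.
module "foo" {
  source = "./modules/foo"

  hostname             = "example"
  public_zone_id       = module.vpc.public_zone_id
  create_public_record = true
}
```

The cost is that the flag and the value can disagree, for example the flag set to true while the zone ID ends up null, so the caller has to keep the two consistent.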

@theherk
Contributor

theherk commented Apr 7, 2023

There are other workarounds. You can use a for expression over a list with enumeration, thus keying on the index rather than the value. I'm not disagreeing at all, just throwing out some methods.
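
One way to read the enumeration idea above is the following sketch. The variable name zone_ids is hypothetical, and it assumes the list itself has a known length at plan time, with only its elements possibly unknown.

```hcl
variable "zone_ids" {
  type    = list(string)
  default = []
}

variable "hostname" {
  type = string
}

resource "aws_route53_record" "public_cname" {
  # for_each needs known keys. Here the keys are list positions
  # ("0", "1", ...), so only the map values may be unknown.
  for_each = { for i, id in var.zone_ids : tostring(i) => id }

  zone_id = each.value
  name    = var.hostname
  type    = "CNAME"
  ttl     = 1800
  # aws_instance.this is assumed to be declared elsewhere in the module.
  records = [aws_instance.this.public_dns]
}
```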

This is discussed a bit in this community discussion, where @apparentlymart does a great job explaining why the solution isn't straightforward. In it I actually say:

It is, in my view, the most frustrating shortcoming in terraform after years of use.

@david-wb

david-wb commented Aug 16, 2023

I encountered the exact same issue in terraform v1.3.2. Changing the variable from a string to an object stops the Invalid count argument error. This bug is very confusing.

In case it's helpful, here is my solution:

Change the input variable to an object:

variable "x" { // Results in "invalid count argument" error
  type = string
  default = null
}

->

variable "x" { // Works
  type = object({
    val = string,
  })
  default = null
}

Then the count based on a local variable will work:

locals {
  foo = var.x != null
}

resource "foo" "bar" {
   count = local.foo ? 1 : 0
   ...
}
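
For completeness, a sketch of what the calling side might look like with the object-typed variable. The aws_eip resource and the module path are borrowed from earlier examples in this thread purely for illustration.

```hcl
resource "aws_eip" "test" {
}

module "foo" {
  source = "./modules/foo"

  # The object itself is known to be non-null at plan time,
  # even though the val attribute inside it is still unknown.
  x = {
    val = aws_eip.test.id
  }
}
```

Inside the module, the wrapped value is then read as var.x.val in resource arguments, while only var.x != null feeds the count decision.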

@apparentlymart
Contributor

In the forthcoming Terraform 1.6 there is a new concept at the language level which I hope we will use to improve this in future versions of Terraform.

This new concept is the ability to track certain extra details about unknown values that constrain what final values the unknown could possibly be a placeholder for. And for this issue in particular one of these extra details is particularly useful: the language can track if it knows that a particular unknown value is definitely not null, in which case foo != null can return true instead of unknown when foo is unknown.

This will not have immediate benefit because the rule for whether a resource attribute might be null would need to be decided by the provider that the resource type belongs to, rather than by the core language runtime, and so the full benefit of this new mechanism won't be apparent until this concept is also somehow integrated into the provider plugin protocol and the libraries that provider developers use to implement that protocol.

However, in the short term this means that there will be some additional ways to influence Terraform's treatment of unknown values via explicit workarounds, for those who find the idea of wrapping a primitive value into an object objectionable. For example, since the definition of the coalesce function is that it fails with an error if all of its arguments are null, Terraform should be able to infer that its result cannot possibly be null and so coalesce(anything) would, if "anything" were an unknown value, return an unknown value that is known not to be null. The same would be true for any other function or operation that by definition cannot produce null, assuming that we've already implemented the extra logic to propagate that information. (It probably won't have 100% coverage for the initial release.)

I don't consider this issue resolved until this functionality is fully implemented at least to the provider development libraries so that provider developers can start to make use of it and thus you wouldn't need any extra weird "hints" in the configuration anymore. I'm sharing the above only to give an update that there has been some initial work towards this but since it's a cross-cutting concern we will need to iterate in multiple steps rather than complete this all in one round.

@jason-johnson

jason-johnson commented Aug 29, 2023

I don’t think it should be a null check (i.e. “var.x == null”). As others have mentioned above, the value could end up being null after the apply.

In the case where we really must know if the final value is null or not, then the recommendation of doing apply with a specified target is correct. However, the case I am usually dealing with is simply “did the user of this module specify something?”. In that case, a new function “isUnknown” would actually be the best solution IMO. The providers already have access to this functionality.

For example we could do:

count = isUnknown(var.x) || length(var.x) > 0 ? 1 : 0 # can only be unknown if user specified

EDIT: an “exists” function, as mentioned above, that simply tells us if something was set or not would also work.

@apparentlymart
Contributor

apparentlymart commented Aug 29, 2023

If var.x were an unknown list that turns out to be an empty list then is_unknown(var.x) || length(var.x) > 0 would return true during plan and then false during apply. It isn't acceptable for a known value to change between plan and apply.

The possible lengths of a collection is another detail that Terraform v1.6 can track for unknown values, and so if in future any providers or functions become capable of promising "this unknown list definitely has a length of at least one" then length(var.x) > 0 would return true during planning without any need to explicitly program with the unknown values.

@jason-johnson

The point is: we sometimes only want to know if the user set a value. Today, you can if you write a provider but not if you write a module.

@apparentlymart
Contributor

apparentlymart commented Sep 5, 2023

As far as I know, providers have exactly the same "problem" today: if they receive an unknown value then they have no way to know whether the final known value could possibly be null. In many cases, as in the main Terraform language, providers handle that by being optimistic and just deferring any error handling until the apply phase. In some specific cases, just as with count and for_each in the Terraform language, providers choose to fail because they are missing some crucial information about how to proceed.

The new possibility of tracking a constrained range for an unknown value that I described above is the beginning of the solution to both of these situations: the Terraform language itself will be able to return true from something != null if we know that something cannot possibly be null, and providers will be able to use the information that an unknown value is definitely not null to make decisions earlier if that would give useful information sooner, and avoid some situations where unknown values cause errors.

It is true that a provider can "program with unknowns" in a way that the Terraform language doesn't allow, but a provider attempting to make assumptions beyond what the Terraform language guarantees is likely to fail to uphold Terraform's evaluation rules, and thus cause an error message from Terraform Core saying that the provider is buggy. Providers have more access to implementation details of the language, but they must still follow the language's evaluation rules in order for their outcomes to be considered valid.

As I mentioned above, work on this issue is in progress. The first step comes in Terraform v1.6, in the form of being able to track additional information about unknown values. The remaining steps belong to the Terraform plugin framework and SDK (to provide the means for provider developers to inspect refined unknown values and to refine the unknown values they are returning) and to the individual provider codebases (to make use of those new framework/SDK features). That other work will follow once v1.6 is released and we've got some experience with the new capabilities in the main Terraform language, since it'll be more practical to quickly fix any bugs and quirks only in Terraform Core rather than having to negotiate fixes across both Terraform Core and the provider ecosystem all at the same time.

@apparentlymart
Contributor

apparentlymart commented Oct 19, 2023

Terraform v1.6 now contains the language-level building block that would be required to fully solve this.

One way you can use it today is to declare that your module's input variables are non-nullable:

variable "example" {
  type     = string
  nullable = false
}

Terraform itself now knows that this declaration means that var.example cannot be null -- if it were then that would be reported as an error inside the calling module block -- and so var.example != null would return true even if var.example's final value isn't known yet.

The same effect emerges if you use any built-in function that Terraform knows cannot possibly return null, although relying on that would of course be more a workaround than a solution since it is depending on a side-effect of a function rather than the function's documented purpose. For example, lower returns an error if given a null value as its argument, so if you know your string is case-insensitive or guaranteed to be all lowercase anyway then you could pass it through lower and then Terraform will infer that the result of that function cannot possibly be null.

output "instance_id" {
  # Terraform knows that the value of this cannot possibly
  # be null even if aws_instance.example.id isn't known yet,
  # because lower would fail if given a null value.
  value = lower(aws_instance.example.id)
}

(If you decide to use this temporary workaround, please do so sparingly and leave comments nearby for future maintainers of your configuration to see. This is very implicit behavior and so likely to be non-obvious to future maintainers.)

The next step here is to teach providers themselves to be able to indicate via the provider protocol when an unknown value is guaranteed not to be null. That's going to take some more design work since it changes part of the external API of Terraform (the provider protocol) rather than just its implementation. Part of that effort is to get more experience using the inference mechanisms built into the language itself first, so that we can feel confident that these new mechanisms are a good enough fit for the problem before making the design be frozen by its inclusion in a public API.

@rubenvw-ngdata

@apparentlymart I would expect that this also works with a validation on the input variable, e.g.

variable "example" {
  default     = null
  validation {
    condition     = length(var.example) > 0
    error_message = "The variable example should not be null when provided"
  }
}

But I am still getting the same error when I pass an output of another module into it :(

@apparentlymart
Contributor

length(null) is invalid, because null represents the absence of a value and an absent value does not have a length.

You can make that work by testing the length of the string only if it isn't null:

variable "example" {
  type        = string
  default     = null
  validation {
    condition     = var.example != null ? length(var.example) > 0 : true
    error_message = "The variable example should not be empty if provided."
  }
}

The above guarantees that the value of var.example elsewhere in the module will be either null or a string containing at least one character. However, Terraform cannot automatically infer that an unknown value in this variable cannot be null because validation rules only prevent progress if they are not met; they do not modify the given value in any way.

This example is not related to what I shared in my previous comment because this variable can be null, and so your validation rule must describe how to handle that situation. If you were to change the default to "" and then set nullable = false you would be guaranteed a value that cannot be null, but that means you would no longer be able to tell the difference between the variable being totally unset and it being set to the empty string.
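
The alternative described in the last paragraph could be sketched as below. This only illustrates the declaration and its trade-off; it is not a recommendation.

```hcl
variable "example" {
  type = string

  # With nullable = false, the value seen inside the module is never
  # null: an omitted or explicitly null argument falls back to the
  # default. "Unset" and "set to the empty string" therefore become
  # indistinguishable.
  default  = ""
  nullable = false
}
```

Note that a check such as var.example != "" would generally still be unknown during planning when the caller passes an unknown string, so this declaration helps with null checks, not with decisions based on the string's content.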

mrclrchtr added a commit to hcloud-talos/terraform-hcloud-talos that referenced this issue Mar 21, 2024
@apparentlymart
Contributor

At this point I think all of the work in Terraform Core for this is as complete as it can be, and so the remaining work is to update the plugin framework and SDK to allow providers to report that attributes are definitely not null, and for the providers themselves to make use of those new features.

Building on the workaround I shared earlier -- using a function whose result definitely cannot be null to allow Terraform to infer "definitely not null" even though the provider doesn't announce it -- I subsequently built a provider that more explicitly represents that workaround, called apparentlymart/assume.

My earlier workaround of using the lower function could therefore be replaced with the following, if you use my "assume" provider by including it in your module's required_providers block:

output "instance_id" {
  value = provider::assume::notnull(aws_instance.example.id)
}
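
The required_providers entry mentioned above might look like the following sketch. The local name assume has to match the middle segment of provider::assume::notnull. No version constraint is shown because none is given here, and the required_version line reflects that provider-defined functions need Terraform v1.8 or later.

```hcl
terraform {
  # Provider-defined functions (the provider::...:: syntax) are only
  # available in Terraform v1.8 and later.
  required_version = ">= 1.8.0"

  required_providers {
    assume = {
      source = "apparentlymart/assume"
    }
  }
}
```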

In the long run I hope it will cease to be necessary for module authors to hint this explicitly, with the providers themselves annotating their results with equivalent refinements. If they do start adding such annotations then using my notnull helper function will no longer be needed, but would also be harmless because it would just confirm the refinement that was already present on the input value.

The Plugin Framework already has hashicorp/terraform-plugin-framework#869 to represent the work that needs to complete before providers could participate in this, so I'm going to close this issue in favor of that one since it's a better representation of what work remains to fully complete this, and because there's no further work planned in this repository and so this issue would likely get forgotten and just remain open forever if I don't close it today.

If you're interested in the more automatic solution where providers can include their own annotations of values that cannot possibly be null then I suggest adding your vote to that other issue in the Plugin Framework repository, or if you've found this after that has already been resolved then consider opening provider feature requests for specific situations where you think such annotations would be useful. Thanks!

Contributor

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 22, 2024