Cannot create stable aws_elasticsearch_domain resources with auto_tune_options specified #22239

bevanbennett · 2021-12-15T20:00:49Z

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

$ terraform -v
Terraform v1.0.11
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.69.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/external v2.1.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.7.1
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/template v2.2.0
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Affected Resource(s)

aws_elasticsearch_domain

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

  auto_tune_options {
    desired_state       = "ENABLED"
    rollback_on_disable = "NO_ROLLBACK"

    maintenance_schedule {
      cron_expression_for_recurrence = "cron(0 0 ? * 1 *)"
      start_at                       = timeadd(timestamp(),"1h")

      duration {
        unit  = "HOURS"
        value = 2
      }
    }
  }

  auto_tune_options {
    desired_state       = "ENABLED"
    rollback_on_disable = "NO_ROLLBACK"

    maintenance_schedule {
      cron_expression_for_recurrence = "cron(0 0 ? * 1 *)"
      start_at                       = "2021-12-15T18:00:00Z"

      duration {
        unit  = "HOURS"
        value = 2
      }
    }
  }

Debug Output

Panic Output

Expected Behavior

After adding auto_tune_options, we should be able to update existing clusters without causing eternal planning diffs.

Actual Behavior

With no start_at specified, plan errors out with
The argument "start_at" is required, but no definition was found.

With a hardcoded start_at, plan succeeds but apply will eventually fail when start_at is in the past
Error: ValidationException: The StartAt time you provided occurs in the past. Specify a time in the future.

With a computed start_at, every plan flags this as a change and wants to update the resource

If we ignore_changes for auto_tune_options[0].maintenance_schedule, we can never add a maintenance schedule where one did not exist previously, or update other elements of the maintenance schedule.

If we try to reference the start_at element directly, we cannot because:

│ Block type "maintenance_schedule" is represented by a set of objects, and set elements do not
│ have addressable keys. To find elements matching specific criteria, use a "for" expression with
│ an "if" clause.

Going through other, similar ignore_changes issues, it's suggested that changing this from a Set to a ListType could address the issue by allowing us to address individual maintenance_schedule blocks inside auto_tune_options and filter out start_at. That said, given the requirement for, and behaviour of start_at in this context, I'd expect this resource to include a built-in ignore_changes for it so that this resource is actually useable in a non-hardcoded manner. Currently it looks like our only option is to pass an explicit start_at based on the current date to every invocation of the module, which is awkward and ugly.

Steps to Reproduce

terraform apply

Important Factoids

Trying to use this resource in a module that's applied to many environments.

References

#0000

The text was updated successfully, but these errors were encountered:

ewbankkit · 2021-12-17T13:20:54Z

@bevanbennett Thanks for raising this issue.
As the timestamp() function returns a new value every time it is called, the change on every terraform plan is to be expected.
If you need a stable time value, take a look at the time provider.

bevanbennett · 2021-12-17T18:01:20Z

Thanks Kit! I didn't know about the time provider. Maybe we should mention it as a useful example in the docs for aws_elasticsearch_domain and specifically auto_tune_options.

bostrowski13 · 2022-01-20T16:17:12Z

So is start_at even relevant? I see it specified in the API in the docs, but nowhere in the console is this asked for when creating a cluster. why not just bulid this into the provider for +1min or something.

jsharper · 2022-07-11T05:50:04Z

@ewbankkit time_offset looks promising to use as a source for start_at but I'm struggling to figure out how to automatically trigger the time_offset to replace and recalculate its timestamp whenever the aws_elasticsearch_domain has a change to apply. It seems like something like lifecycle.replace_triggered_by might work, but it creates a reference loop ("Cycle" error). Am I missing a solution here, or were you suggesting it only as a workaround, with the expectation that one would manually change something in the time_offset's triggers map whenever making a change to the aws_elasticsearch_domain (and hoping that no drift occurs, necessitating a change to be applied without the terraform code changing)?

It seems like there may need to be a different model needed here for internally calculating the start_at time to pass to the AWS API, like perhaps the terraform attribute should be a time offset rather than an absolute time.

passbt · 2022-09-23T18:28:13Z

Has anyone come up with a good workaround? I also get a cycle error when trying to use the time provider. I've put a snippet of my code below for reference along with the error.

Error: Cycle: module.es.time_offset.auto_tune_start, module.es.aws_elasticsearch_domain.domain

resource "aws_elasticsearch_domain" "domain" {
  domain_name           = local.domain_name
  elasticsearch_version = var.elasticsearch_version

  advanced_options = var.advanced_options
  access_policies  = data.aws_iam_policy_document.es_policy.json
... 
  dynamic "auto_tune_options" {
    for_each = var.auto_tune_enabled ? [true] : []
    content {
      desired_state       = "ENABLED"
      rollback_on_disable = var.auto_tune_rollback_settings
      maintenance_schedule {
        start_at = var.auto_tune_starting_time == null ? time_offset.auto_tune_start.rfc3339 : var.auto_tune_starting_time
        duration {
          value = var.auto_tune_duration
          unit  = "HOURS"
        }
        cron_expression_for_recurrence = var.auto_tune_cron_schedule
      }
    }
  }
...
}

resource "time_offset" "auto_tune_start" {
  triggers = {
    start_time = aws_elasticsearch_domain.domain.auto_tune_options
  }

  offset_minutes = 60
}

ravron · 2022-10-20T01:27:05Z

Spent a bit of time wrestling with this, and wanted to jot down my thoughts.

First, I was intrigued by @bostrowski13's comment:

I see it specified in the API in the docs, but nowhere in the console is this asked for when creating a cluster. why not just bulid this into the provider for +1min or something.

I tried running UpdateDomainConfig from the command line and omitting StartAt, but that's a no-go:

An error occurred (ValidationException) when calling the UpdateDomainConfig operation: You must supply the StartAt and the Duration parameters. For more information, please refer to: https://docs.aws.amazon.com/cli/latest/reference/es/update-elasticsearch-domain-config.html

But it's true — the AWS console doesn't ask for that parameter at all. I examined the API request the console sends for you, and it's just tricking you: even though it doesn't ask you for the parameter, it does send it. Specifically, it sends the timestamp for now plus 24 hours.

So, given that, I have a proposal for a fix.

We'll add a new schema parameter to the maintenance_schedule, something like start_offset (open to suggestions on a better name). The value would be a duration string, say 24h, so it would stay stable when you don't touch it. In expandAutoTuneMaintenanceSchedules, we'd prefer to calculate StartAt using Time.now() plus start_offset, but fall back to the current behavior of using start_at as-is if start_offset isn't present. The schema should validate that Time.now() + start_offset is in the future, and it's probably a good idea to require it to be at least "a bit" into the future (ten minutes? thirty?). This is because by the time the UpdateDomainConfig operation actually runs, a StartAt calculated with a small start_offset may already be in the past, and the API call will fail. We'd also deprecate start_at, given how unhelpful it currently is, and make start_at and start_offset mutually exclusive.

I thought about trying to set a DefaultFunc for start_at that defaults to now plus 24 hours, as the AWS API does, but I don't think that would help — the default would cause a change on every plan just like using timestamp() does. Even if it didn't, that would only work if you didn't want to configure StartAt yourself; if you did, you'd be stuck in the same place we are now.

As I've never contributed to Terraform or this provider, I'd appreciate feedback on this approach. I'm willing to give it a shot if it seems reasonable.

dm3ch · 2023-01-12T18:08:55Z

StartAt seems to be really annoying was there any progress getting rid of it?

rishabhkumar92 · 2023-02-01T18:46:37Z

@ravron is there any update on the fix yet?

jeroenhabets · 2023-02-10T10:08:15Z

@ewbankkit the moving timestamp you mentioned is not the issue, as I hard coded my start_at parameter and still get continuous changed terraform plans. Others have tried to work with the time_offset to no avail, see up thread

Do you see anyway of improving the drift detection? As @ravron dived a bit deeper into it and it seems the API behaves a bit unhelpful here.

For now I have had to resort to lifecycle ignore_changes = [ auto_tune_options ] until this is fixed.

jeroenhabets · 2023-02-20T10:08:55Z

Hi Dirk (@YakDriver),
I noticed you've been the sole contributor to this opensearch module so far... Perhaps you can chime in how to best fix this "start_at" issue?
Thanks in advance! Jeroen

jeroenhabets · 2023-03-08T18:19:01Z

FYI (also for any user stumbling upon this ticket):

babbottscott posted a partial workaround using a create trigger in the HashiCorp community referencing this issue. (That's probably better than my 100% hardcoding of start_at + ignore changes)

resource "time_offset" "start_at" {
  triggers = { create = local.name_tag }
  offset_hours = 4
}

In May 2022 @justinretzolk tried ignoring only start_at to no avail, see How to ignore maintenance schedule start time on aws_elasticsearch_domain #23669 (comment) (mentioned by @tyrken)

jeroenhabets · 2023-03-30T09:24:09Z

Looks like Auto-Tune maintenance windows are (to be) superseded by 10h "off-peak windows" since Feb 2023

Note
Off-peak windows were introduced on February 16, 2023. All domains created before this date have the off-peak window disabled by default. You must manually enable and configure the off-peak window for these domains. All domains created after this date will have the off-peak window enabled by default. You can't disable the off-peak window for a domain after it's enabled.

AWS: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/off-peak.html incl. Migrating from Auto-Tune maintenance windows section.
There is already a Terraform AWS provider RFC: #30104 🚀

github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/elasticsearch Issues and PRs that pertain to the elasticsearch service. labels Dec 15, 2021

justinretzolk added enhancement Requests to existing resources that expand the functionality or scope. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 16, 2021

tyrken mentioned this issue Feb 27, 2023

How to ignore maintenance schedule start time on aws_elasticsearch_domain #23669

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cannot create stable aws_elasticsearch_domain resources with auto_tune_options specified #22239

Cannot create stable aws_elasticsearch_domain resources with auto_tune_options specified #22239

bevanbennett commented Dec 15, 2021

ewbankkit commented Dec 17, 2021

bevanbennett commented Dec 17, 2021

bostrowski13 commented Jan 20, 2022

jsharper commented Jul 11, 2022

passbt commented Sep 23, 2022

ravron commented Oct 20, 2022

dm3ch commented Jan 12, 2023

rishabhkumar92 commented Feb 1, 2023

jeroenhabets commented Feb 10, 2023

jeroenhabets commented Feb 20, 2023

jeroenhabets commented Mar 8, 2023

jeroenhabets commented Mar 30, 2023

Cannot create stable aws_elasticsearch_domain resources with auto_tune_options specified #22239

Cannot create stable aws_elasticsearch_domain resources with auto_tune_options specified #22239

Comments

bevanbennett commented Dec 15, 2021

Community Note

Terraform CLI and Terraform AWS Provider Version

Affected Resource(s)

Terraform Configuration Files

Debug Output

Panic Output

Expected Behavior

Actual Behavior

Steps to Reproduce

Important Factoids

References

ewbankkit commented Dec 17, 2021

bevanbennett commented Dec 17, 2021

bostrowski13 commented Jan 20, 2022

jsharper commented Jul 11, 2022

passbt commented Sep 23, 2022

ravron commented Oct 20, 2022

dm3ch commented Jan 12, 2023

rishabhkumar92 commented Feb 1, 2023

jeroenhabets commented Feb 10, 2023

jeroenhabets commented Feb 20, 2023

jeroenhabets commented Mar 8, 2023

jeroenhabets commented Mar 30, 2023