Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot create stable aws_elasticsearch_domain resources with auto_tune_options specified #22239

Open
bevanbennett opened this issue Dec 15, 2021 · 12 comments
Labels
enhancement Requests to existing resources that expand the functionality or scope. service/elasticsearch Issues and PRs that pertain to the elasticsearch service.

Comments

@bevanbennett
Copy link

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

$ terraform -v
Terraform v1.0.11
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v3.69.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/external v2.1.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.7.1
+ provider registry.terraform.io/hashicorp/local v2.1.0
+ provider registry.terraform.io/hashicorp/null v3.1.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
+ provider registry.terraform.io/hashicorp/template v2.2.0
+ provider registry.terraform.io/terraform-aws-modules/http v2.4.1

Affected Resource(s)

  • aws_elasticsearch_domain

Terraform Configuration Files

Please include all Terraform configurations required to reproduce the bug. Bug reports without a functional reproduction may be closed without investigation.

  auto_tune_options {
    desired_state       = "ENABLED"
    rollback_on_disable = "NO_ROLLBACK"

    maintenance_schedule {
      cron_expression_for_recurrence = "cron(0 0 ? * 1 *)"
      start_at                       = timeadd(timestamp(),"1h")

      duration {
        unit  = "HOURS"
        value = 2
      }
    }
  }
  auto_tune_options {
    desired_state       = "ENABLED"
    rollback_on_disable = "NO_ROLLBACK"

    maintenance_schedule {
      cron_expression_for_recurrence = "cron(0 0 ? * 1 *)"
      start_at                       = "2021-12-15T18:00:00Z"

      duration {
        unit  = "HOURS"
        value = 2
      }
    }
  }

Debug Output

Panic Output

Expected Behavior

After adding auto_tune_options, we should be able to update existing clusters without causing eternal planning diffs.

Actual Behavior

With no start_at specified, plan errors out with
The argument "start_at" is required, but no definition was found.

With a hardcoded start_at, plan succeeds but apply will eventually fail when start_at is in the past
Error: ValidationException: The StartAt time you provided occurs in the past. Specify a time in the future.

With a computed start_at, every plan flags this as a change and wants to update the resource

If we ignore_changes for auto_tune_options[0].maintenance_schedule, we can never add a maintenance schedule where one did not exist previously, or update other elements of the maintenance schedule.

If we try to reference the start_at element directly, we cannot because:

│ Block type "maintenance_schedule" is represented by a set of objects, and set elements do not
│ have addressable keys. To find elements matching specific criteria, use a "for" expression with
│ an "if" clause.

Going through other, similar ignore_changes issues, it's suggested that changing this from a Set to a ListType could address the issue by allowing us to address individual maintenance_schedule blocks inside auto_tune_options and filter out start_at. That said, given the requirement for, and behaviour of start_at in this context, I'd expect this resource to include a built-in ignore_changes for it so that this resource is actually useable in a non-hardcoded manner. Currently it looks like our only option is to pass an explicit start_at based on the current date to every invocation of the module, which is awkward and ugly.

Steps to Reproduce

  1. terraform apply

Important Factoids

Trying to use this resource in a module that's applied to many environments.

References

  • #0000
@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/elasticsearch Issues and PRs that pertain to the elasticsearch service. labels Dec 15, 2021
@justinretzolk justinretzolk added enhancement Requests to existing resources that expand the functionality or scope. and removed needs-triage Waiting for first response or review from a maintainer. labels Dec 16, 2021
@ewbankkit
Copy link
Contributor

@bevanbennett Thanks for raising this issue.
As the timestamp() function returns a new value every time it is called, the change on every terraform plan is to be expected.
If you need a stable time value, take a look at the time provider.

@bevanbennett
Copy link
Author

Thanks Kit! I didn't know about the time provider. Maybe we should mention it as a useful example in the docs for aws_elasticsearch_domain and specifically auto_tune_options.

@bostrowski13
Copy link

So is start_at even relevant? I see it specified in the API in the docs, but nowhere in the console is this asked for when creating a cluster. why not just bulid this into the provider for +1min or something.

@jsharper
Copy link

@ewbankkit time_offset looks promising to use as a source for start_at but I'm struggling to figure out how to automatically trigger the time_offset to replace and recalculate its timestamp whenever the aws_elasticsearch_domain has a change to apply. It seems like something like lifecycle.replace_triggered_by might work, but it creates a reference loop ("Cycle" error). Am I missing a solution here, or were you suggesting it only as a workaround, with the expectation that one would manually change something in the time_offset's triggers map whenever making a change to the aws_elasticsearch_domain (and hoping that no drift occurs, necessitating a change to be applied without the terraform code changing)?

It seems like there may need to be a different model needed here for internally calculating the start_at time to pass to the AWS API, like perhaps the terraform attribute should be a time offset rather than an absolute time.

@passbt
Copy link

passbt commented Sep 23, 2022

Has anyone come up with a good workaround? I also get a cycle error when trying to use the time provider. I've put a snippet of my code below for reference along with the error.

Error: Cycle: module.es.time_offset.auto_tune_start, module.es.aws_elasticsearch_domain.domain

resource "aws_elasticsearch_domain" "domain" {
  domain_name           = local.domain_name
  elasticsearch_version = var.elasticsearch_version

  advanced_options = var.advanced_options
  access_policies  = data.aws_iam_policy_document.es_policy.json
... 
  dynamic "auto_tune_options" {
    for_each = var.auto_tune_enabled ? [true] : []
    content {
      desired_state       = "ENABLED"
      rollback_on_disable = var.auto_tune_rollback_settings
      maintenance_schedule {
        start_at = var.auto_tune_starting_time == null ? time_offset.auto_tune_start.rfc3339 : var.auto_tune_starting_time
        duration {
          value = var.auto_tune_duration
          unit  = "HOURS"
        }
        cron_expression_for_recurrence = var.auto_tune_cron_schedule
      }
    }
  }
...
}

resource "time_offset" "auto_tune_start" {
  triggers = {
    start_time = aws_elasticsearch_domain.domain.auto_tune_options
  }

  offset_minutes = 60
}

@ravron
Copy link
Contributor

ravron commented Oct 20, 2022

Spent a bit of time wrestling with this, and wanted to jot down my thoughts.

First, I was intrigued by @bostrowski13's comment:

I see it specified in the API in the docs, but nowhere in the console is this asked for when creating a cluster. why not just bulid this into the provider for +1min or something.

I tried running UpdateDomainConfig from the command line and omitting StartAt, but that's a no-go:

An error occurred (ValidationException) when calling the UpdateDomainConfig operation: You must supply the StartAt and the Duration parameters. For more information, please refer to: https://docs.aws.amazon.com/cli/latest/reference/es/update-elasticsearch-domain-config.html

But it's true — the AWS console doesn't ask for that parameter at all. I examined the API request the console sends for you, and it's just tricking you: even though it doesn't ask you for the parameter, it does send it. Specifically, it sends the timestamp for now plus 24 hours.

So, given that, I have a proposal for a fix.

We'll add a new schema parameter to the maintenance_schedule, something like start_offset (open to suggestions on a better name). The value would be a duration string, say 24h, so it would stay stable when you don't touch it. In expandAutoTuneMaintenanceSchedules, we'd prefer to calculate StartAt using Time.now() plus start_offset, but fall back to the current behavior of using start_at as-is if start_offset isn't present. The schema should validate that Time.now() + start_offset is in the future, and it's probably a good idea to require it to be at least "a bit" into the future (ten minutes? thirty?). This is because by the time the UpdateDomainConfig operation actually runs, a StartAt calculated with a small start_offset may already be in the past, and the API call will fail. We'd also deprecate start_at, given how unhelpful it currently is, and make start_at and start_offset mutually exclusive.

I thought about trying to set a DefaultFunc for start_at that defaults to now plus 24 hours, as the AWS API does, but I don't think that would help — the default would cause a change on every plan just like using timestamp() does. Even if it didn't, that would only work if you didn't want to configure StartAt yourself; if you did, you'd be stuck in the same place we are now.

As I've never contributed to Terraform or this provider, I'd appreciate feedback on this approach. I'm willing to give it a shot if it seems reasonable.

@dm3ch
Copy link

dm3ch commented Jan 12, 2023

StartAt seems to be really annoying was there any progress getting rid of it?

@rishabhkumar92
Copy link

@ravron is there any update on the fix yet?

@jeroenhabets
Copy link

@ewbankkit the moving timestamp you mentioned is not the issue, as I hard coded my start_at parameter and still get continuous changed terraform plans. Others have tried to work with the time_offset to no avail, see up thread

Do you see anyway of improving the drift detection? As @ravron dived a bit deeper into it and it seems the API behaves a bit unhelpful here.

For now I have had to resort to lifecycle ignore_changes = [ auto_tune_options ] until this is fixed.

@jeroenhabets
Copy link

Hi Dirk (@YakDriver),
I noticed you've been the sole contributor to this opensearch module so far... Perhaps you can chime in how to best fix this "start_at" issue?
Thanks in advance! Jeroen

@jeroenhabets
Copy link

FYI (also for any user stumbling upon this ticket):

resource "time_offset" "start_at" {
  triggers = { create = local.name_tag }
  offset_hours = 4
}

@jeroenhabets
Copy link

Looks like Auto-Tune maintenance windows are (to be) superseded by 10h "off-peak windows" since Feb 2023

Note
Off-peak windows were introduced on February 16, 2023. All domains created before this date have the off-peak window disabled by default. You must manually enable and configure the off-peak window for these domains. All domains created after this date will have the off-peak window enabled by default. You can't disable the off-peak window for a domain after it's enabled.

AWS: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/off-peak.html incl. Migrating from Auto-Tune maintenance windows section.
There is already a Terraform AWS provider RFC: #30104 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Requests to existing resources that expand the functionality or scope. service/elasticsearch Issues and PRs that pertain to the elasticsearch service.
Projects
None yet
Development

No branches or pull requests

10 participants