aws_elasticache_replication_group setting multi-az to false when automatic_failover_enabled is true #17376

Closed
Kaydub00 opened this issue Feb 1, 2021 · 7 comments
Labels
service/elasticache Issues and PRs that pertain to the elasticache service.

Comments

@Kaydub00

Kaydub00 commented Feb 1, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v0.13.6

  • provider registry.terraform.io/hashicorp/archive v2.0.0
  • provider registry.terraform.io/hashicorp/aws v3.26.0
  • provider registry.terraform.io/hashicorp/local v2.0.0
  • provider registry.terraform.io/hashicorp/null v3.0.0
  • provider registry.terraform.io/hashicorp/random v3.0.1
  • provider registry.terraform.io/hashicorp/template v2.2.0

Affected Resource(s)

  • aws_elasticache_replication_group

Terraform Configuration Files

resource "aws_elasticache_replication_group" "redis" {
  count = var.enable_redis ? 1 : 0

  replication_group_id          = local.redis_name
  engine                        = "redis"
  engine_version                = var.redis_engine_version
  replication_group_description = "Terraform-managed Redis cluster for ${var.app}"
  number_cache_clusters         = var.redis_cache_cluster_count
  node_type                     = var.redis_cache_cluster_type
  automatic_failover_enabled    = var.redis_automatic_failover
  ...
  ...

with var.redis_automatic_failover = true

Expected Behavior

After upgrading to tf13 I expected no changes when running terraform plan/apply.

Is it now necessary to also specify multi-az?

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/elasticache_replication_group#automatic_failover_enabled

https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/elasticache_replication_group#multi_az_enabled

Is multi_az_enabled redundant, or is it new? Is its default of false new, or does it override the old default that applied when automatic_failover_enabled was set to true?

Actual Behavior

Terraform wants to change multi-az to false.

Automatic failover is still enabled. The primary is in one AZ and the replica is in another, but multi-AZ is now being set to false.

Steps to Reproduce

  1. Create a cluster with Terraform 0.12 and automatic failover enabled (according to the documentation, this turns multi-AZ on)
  2. Upgrade to Terraform v0.13.6 and the providers listed above
  3. Run terraform plan against the ElastiCache Redis cluster
~ resource "aws_elasticache_replication_group" "redis" {
        ...
        automatic_failover_enabled    = true
        ...
      ~ multi_az_enabled              = true -> false
        ...
}
@ghost ghost added the service/elasticache Issues and PRs that pertain to the elasticache service. label Feb 1, 2021
@github-actions github-actions bot added the needs-triage Waiting for first response or review from a maintainer. label Feb 1, 2021
@ewbankkit
Contributor

@chiefy

chiefy commented Feb 1, 2021

Seems related:

Error: if automatic_failover_enabled is true, number_cache_clusters must be greater than 1

We don't set number_cache_clusters but rather have a cluster_mode section defined. This is happening w/ 3.26.0 and TF 0.12.29

@iancward
Contributor

I've run into the Error: if automatic_failover_enabled is true, number_cache_clusters must be greater than 1 issue as well, with aws provider >= 3.26.0. I switched to 3.25.0 and AWS is quite happy to take what I've given to it. This makes me think the validation terraform is trying to do is invalid.

I've opened #17605 for this issue.
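The downgrade described above can be made explicit by pinning the provider. This is only a sketch, assuming Terraform 0.13 or later (the `source` syntax differs in older releases); the exact version to pin is whatever works for your configuration:

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Pin to the last release before the new validation appeared.
      # Note that 3.25.0 does not know the multi_az_enabled argument.
      version = "= 3.25.0"
    }
  }
}
```

After changing the constraint, `terraform init -upgrade` reselects the provider version.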

@DoctorPolski

I have a deeper issue. I maintain a number of different AWS environments, some of which (large) utilise multi AZ and some of which (small) don't.

The environments which do NOT utilise multi AZ also use clusters, but the interpolated num_of_clusters argument resolves to a single cluster. Cluster mode is required for other dependent code.

Therefore, there is no version of the AWS provider that will satisfy both deployments:

  1. The "small" environments need <= 3.25.0 to allow the interpolated num_of_clusters argument to be 1 when in clustered mode. But 3.25.0 does not recognize the multi_az_enabled argument required by the "large" environments in the shared aws_elasticache_replication_group resource code.
  2. The "large" environments need >= 3.26.0 in order to support the multi_az_enabled argument that is now required since it appears to now default to false. But with 3.26.0 in place the "small" envs cannot plan when interpolated num_of_clusters = 1.

Summary: Terraform's handling of clustering and multi AZ config for Redis has been very poorly implemented and is no longer congruent with AWS.

@rdelcampog
Contributor

I'm dealing with the issue described by @DoctorPolski :(

@gdavison
Contributor

Hello everyone,

This change in v3.26.0 is due to a change in the AWS API. Previously, setting automatic_failover_enabled implicitly enabled multi-AZ on the replication group. AWS made a change to separate the automatic failover and multi-AZ settings, which is why we added the multi_az_enabled parameter in v3.26.0 so that Terraform can properly support multi-AZ on replication groups.

The reason Terraform is proposing changing multi_az_enabled from true to false is that it is currently true as deployed in AWS, but is not set in your Terraform configuration, which is interpreted as false.
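One way to clear the `true -> false` diff is to declare the setting explicitly. A minimal sketch, assuming provider v3.26.0 or later; the resource name, ID, description, and node type are placeholders and are not taken from the reporter's configuration:

```hcl
resource "aws_elasticache_replication_group" "example" {
  replication_group_id          = "example-redis" # placeholder
  replication_group_description = "Example Redis replication group"
  engine                        = "redis"
  node_type                     = "cache.t3.micro" # placeholder
  number_cache_clusters         = 2                # primary plus one replica

  automatic_failover_enabled = true

  # No longer implied by automatic_failover_enabled. When omitted,
  # the provider interprets it as false, which produces the diff.
  multi_az_enabled = true
}
```

With both arguments set, a plan against a group that already has multi-AZ enabled should show no change for `multi_az_enabled`.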

@DoctorPolski, v3.26.0 should work fine with your "small" environments. You can configure a cluster with no replicas, using

resource "aws_elasticache_replication_group" "example" {
  ...
  cluster_mode {
    num_node_groups         = 1
    replicas_per_node_group = 0
  }
}

If that doesn't work, please open a new issue showing the Terraform configuration that you're using.
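For a module shared between "small" and "large" environments, multi-AZ could be derived from the replica count instead of being hard-coded. This is a sketch only: the variable name `replicas_per_node_group`, the local `has_replicas`, and all literal values are hypothetical, and it assumes provider v3.26.0 or later:

```hcl
variable "replicas_per_node_group" {
  description = "Hypothetical input: 0 for small environments, 1 or more for large ones"
  type        = number
  default     = 0
}

locals {
  # Multi-AZ only makes sense when there is a replica to fail over to.
  has_replicas = var.replicas_per_node_group > 0
}

resource "aws_elasticache_replication_group" "example" {
  replication_group_id          = "example-redis" # placeholder
  replication_group_description = "Example Redis replication group"
  engine                        = "redis"
  engine_version                = "5.0.6"                       # placeholder
  node_type                     = "cache.t3.micro"              # placeholder
  parameter_group_name          = "default.redis5.0.cluster.on" # cluster-enabled group

  # The provider documentation states this must be enabled for
  # cluster-mode-enabled replication groups.
  automatic_failover_enabled = true

  # true for large environments, false for small ones.
  multi_az_enabled = local.has_replicas

  cluster_mode {
    num_node_groups         = 1
    replicas_per_node_group = var.replicas_per_node_group
  }
}
```

Small environments keep the default of 0 replicas, while large ones pass a positive count and get multi-AZ from the same resource block.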

I'm going to close this issue, since the handling of multi_az_enabled is working as needed due to the changes in the AWS API.

@ghost

ghost commented Apr 26, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Apr 26, 2021
7 participants