Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ElastiCache Redis versioning #23734

Merged
merged 17 commits into from
Apr 23, 2022

Conversation

bschaatsbergen
Copy link
Member

@bschaatsbergen bschaatsbergen commented Mar 17, 2022

Community Note

  • Please vote on this pull request by adding a 👍 reaction to the original pull request comment to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for pull request followers and do not help prioritize the request

Closes #22385

Output from acceptance testing:

$ make testacc TESTARGS='-run=TestAccElastiCacheCluster_EngineVersion_redis\|TestAccElastiCacheReplicationGroup_EngineVersion_update' PKG=elasticache ACCTEST_PARALLELISM=2
==> Checking that code complies with gofmt requirements...
TF_ACC=1 go test ./internal/service/elasticache/... -v -count 1 -parallel 2  -run=TestAccElastiCacheCluster_EngineVersion_redis\|TestAccElastiCacheReplicationGroup_EngineVersion_update -timeout 180m
=== RUN   TestAccElastiCacheCluster_EngineVersion_redis
=== PAUSE TestAccElastiCacheCluster_EngineVersion_redis
=== RUN   TestAccElastiCacheReplicationGroup_EngineVersion_update
=== PAUSE TestAccElastiCacheReplicationGroup_EngineVersion_update
=== CONT  TestAccElastiCacheCluster_EngineVersion_redis
=== CONT  TestAccElastiCacheReplicationGroup_EngineVersion_update
--- PASS: TestAccElastiCacheCluster_EngineVersion_redis (1825.41s)
--- PASS: TestAccElastiCacheReplicationGroup_EngineVersion_update (3559.08s)
PASS
ok      github.com/hashicorp/terraform-provider-aws/internal/service/elasticache        3559.142s
...

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. size/XS Managed by automation to categorize the size of a PR. service/elasticache Issues and PRs that pertain to the elasticache service. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure. labels Mar 17, 2022
@bschaatsbergen bschaatsbergen marked this pull request as ready for review March 17, 2022 16:49
@github-actions github-actions bot added size/S Managed by automation to categorize the size of a PR. documentation Introduces or discusses updates to documentation. and removed size/XS Managed by automation to categorize the size of a PR. labels Mar 17, 2022
@bschaatsbergen bschaatsbergen changed the title Fix ElastiCache Redis versioning Fix ElastiCache Redis versioning (post V6 versions) Mar 17, 2022
@bschaatsbergen bschaatsbergen changed the title Fix ElastiCache Redis versioning (post V6 versions) Fix ElastiCache Redis versioning Mar 17, 2022
@justinretzolk justinretzolk removed the needs-triage Waiting for first response or review from a maintainer. label Mar 17, 2022
@bschaatsbergen bschaatsbergen marked this pull request as draft March 17, 2022 17:57
@bschaatsbergen bschaatsbergen marked this pull request as ready for review March 17, 2022 21:28
@github-actions github-actions bot added size/M Managed by automation to categorize the size of a PR. and removed size/S Managed by automation to categorize the size of a PR. labels Mar 17, 2022
@bschaatsbergen bschaatsbergen marked this pull request as draft March 17, 2022 23:50
@bschaatsbergen
Copy link
Member Author

bschaatsbergen commented Mar 18, 2022

@ewbankkit , @gdavison quickly checking with you guys.

In the ModifyCacheCluster, specifying EngineVersion as 6.x is just ignored by the looks of it. It seems to only be possible in the CreateCacheCluster call.

We previously didn't face any issues as no-one went from 6.0 to 6.2 yet. What do you guys suggest, not support .x anymore and let users just specify a major.minor for >= Redis 6 or ?

I suppose not supporting 6.x anymore and supplying a 6.0 (major.minor) is cleaner compared to introducing logic to make this work properly?

(event trail when upgrading to 6.x from 6.0)
image

@bschaatsbergen bschaatsbergen marked this pull request as ready for review March 18, 2022 21:36
@bschaatsbergen
Copy link
Member Author

I've removed the support of 6.x and it now only accepts . for Redis 6 >= versions. Mainly to avoid the inconsistency between being able to supply a 6.x in the CreateCacheCluster, but not in the ModifyCacheCluster API call.

Please let me know what you think.

@ewbankkit
Copy link
Contributor

I'll let @gdavison take a look at this as he has much experience with ElastiCache.

@manelpb
Copy link

manelpb commented Mar 23, 2022

Any updates on this?

@gdavison gdavison self-assigned this Mar 28, 2022
@gdavison
Copy link
Contributor

Thanks for creating this PR, @bschaatsbergen. We'll need to continue supporting the 6.x format, because the ElastiCache API still supports it for the purpose of using the latest Redis 6 version. In addition, removing it would be a breaking change, which would require a new major version of the AWS Provider.

Otherwise, the PR is looking good.

@gdavison gdavison added the waiting-response Maintainers are waiting on response from community or contributor. label Mar 29, 2022
@bschaatsbergen
Copy link
Member Author

Clear, @gdavison what do you recommend on how to proceed? As it looks like that the 6.x is completely ignored when it's passed down to the UpdateCluster call (e.g. updating from 6.0 to 6.x)

@github-actions github-actions bot removed the waiting-response Maintainers are waiting on response from community or contributor. label Mar 29, 2022
@gdavison
Copy link
Contributor

I think 6.x is a creation-time value, so "updating" to 6.x won't actually make a change. I'm currently working on a PR to make auto_minor_version_upgrade work, since it was previously ignored, so that can help.

With explicit version numbers, we'll have to:

  • ignore minor version downgrades to engine_version when auto_minor_version_upgrade is true
  • correctly handle changes from engine_version 6.x to an explicit version, assuming engine_version_actual is 6.2.*:
    • the engine_version is changed to 6.2, no change is needed
    • the engine_version is changed to 6.0, this will be a downgrade

@gdavison gdavison force-pushed the fix-redit-versioning branch from 6ebbd3e to ca61710 Compare April 21, 2022 04:48
@github-actions github-actions bot added size/XL Managed by automation to categorize the size of a PR. and removed size/M Managed by automation to categorize the size of a PR. labels Apr 21, 2022
Copy link
Contributor

@gdavison gdavison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR, @bschaatsbergen. Looks good! 🚀

--- PASS: TestAccElastiCacheReplicationGroup_Validation_noNodeType (58.87s)
--- PASS: TestAccElastiCacheClusterDataSource_Data_basic (503.98s)
--- PASS: TestAccElastiCacheCluster_NumCacheNodes_increaseWithPreferredAvailabilityZones (805.28s)
--- PASS: TestAccElastiCacheReplicationGroup_finalSnapshot (875.91s)
--- PASS: TestAccElastiCacheReplicationGroup_autoMinorVersionUpgrade (895.24s)
--- PASS: TestAccElastiCacheReplicationGroup_tags (955.49s)
--- PASS: TestAccElastiCacheReplicationGroup_dataTiering (1035.39s)
--- PASS: TestAccElastiCacheReplicationGroup_updateMaintenanceWindow (1160.22s)
--- PASS: TestAccElastiCacheCluster_ReplicationGroupID_multipleReplica (1173.02s)
--- PASS: TestAccElastiCacheReplicationGroup_tagWithOtherModification (1244.32s)
--- PASS: TestAccElastiCacheReplicationGroup_Validation_globalReplicationGroupIdAndNodeType (1442.98s)
--- PASS: TestAccElastiCacheReplicationGroup_clusteringAndCacheNodesCausesError (2.13s)
--- PASS: TestAccElastiCacheReplicationGroup_Engine_Redis_LogDeliveryConfigurations_ClusterMode_Enabled (1451.24s)
--- PASS: TestAccElastiCacheReplicationGroup_Engine_Redis_LogDeliveryConfigurations_ClusterMode_Disabled (1454.04s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClustersMemberClusterDisappears_addMemberCluster (1672.43s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClustersMemberClusterDisappearsRemoveMemberCluster_atTargetSize (1732.78s)
--- PASS: TestAccElastiCacheReplicationGroup_GlobalReplicationGroupIDClusterModeValidation_numNodeGroupsOnSecondary (1749.87s)
--- PASS: TestAccElastiCacheReplicationGroup_useCMKKMSKeyID (867.75s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClustersMemberClusterDisappears_noChange (1433.66s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClustersMemberClusterDisappearsRemoveMemberCluster_scaleDown (2022.44s)
--- PASS: TestAccElastiCacheReplicationGroup_enableSnapshotting (819.08s)
--- PASS: TestAccElastiCacheReplicationGroup_ValidationMultiAz_noAutomaticFailover (2.84s)
--- PASS: TestAccElastiCacheReplicationGroup_enableAuthTokenTransitEncryption (995.50s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterMode_singleNode (795.05s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClustersFailover_autoFailoverDisabled (1347.65s)
--- PASS: TestAccElastiCacheReplicationGroup_enableAtRestEncryption (1189.06s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClustersFailover_autoFailoverEnabled (1593.58s)
--- PASS: TestAccElastiCacheCluster_NumCacheNodes_redis (2.64s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterMode_nonClusteredParameterGroup (751.01s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClusters_multiAZEnabled (1855.91s)
--- PASS: TestAccElastiCacheReplicationGroup_NumberCacheClusters_basic (1869.13s)
--- PASS: TestAccElastiCacheReplicationGroup_GlobalReplicationGroupID_basic (2839.17s)
--- PASS: TestAccElastiCacheReplicationGroup_disappears (644.73s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterMode_basic (977.88s)
--- PASS: TestAccElastiCacheReplicationGroup_updateDescription (857.99s)
--- PASS: TestAccElastiCacheReplicationGroup_GlobalReplicationGroupID_disappears (3042.92s)
--- PASS: TestAccElastiCacheReplicationGroup_multiAzInVPC (874.00s)
--- PASS: TestAccElastiCacheReplicationGroup_deprecatedAvailabilityZones_multiAzInVPC (1116.75s)
--- PASS: TestAccElastiCacheCluster_ReplicationGroupID_availabilityZone (1383.63s)
--- PASS: TestAccElastiCacheReplicationGroup_uppercase (670.60s)
--- PASS: TestAccElastiCacheCluster_ReplicationGroupID_singleReplica (1535.64s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterMode_updateReplicasPerNodeGroup (1856.21s)
--- PASS: TestAccElastiCacheReplicationGroup_GlobalReplicationGroupIDClusterMode_basic (3613.86s)
--- PASS: TestAccElastiCacheReplicationGroupDataSource_nonExistent (2.55s)
--- PASS: TestAccElastiCacheReplicationGroup_multiAzNotInVPC_repeated (911.75s)
--- PASS: TestAccElastiCacheReplicationGroup_basic_v5 (710.60s)
--- PASS: TestAccElastiCacheReplicationGroup_multiAzNotInVPC (840.55s)
--- PASS: TestAccElastiCacheCluster_NodeTypeResize_redis (1103.58s)
--- PASS: TestAccElastiCacheCluster_NodeTypeResize_memcached (905.76s)
--- PASS: TestAccElastiCacheReplicationGroup_GlobalReplicationGroupID_full (3810.56s)
--- PASS: TestAccElastiCacheReplicationGroup_depecatedAvailabilityZones_vpc (701.66s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterModeUpdateNumNodeGroups_scaleUp (2267.26s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterModeUpdateNumNodeGroupsAndReplicasPerNodeGroup_scaleUp (2563.03s)
--- PASS: TestAccElastiCacheCluster_AZMode_redis (527.07s)
--- PASS: TestAccElastiCacheReplicationGroup_basic (698.74s)
--- PASS: TestAccElastiCacheReplicationGroup_vpc (701.88s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterModeUpdateNumNodeGroupsAndReplicasPerNodeGroup_scaleDown (2774.50s)
--- PASS: TestAccElastiCacheCluster_AZMode_memcached (527.24s)
--- PASS: TestAccElastiCacheCluster_multiAZInVPC (498.05s)
--- PASS: TestAccElastiCacheReplicationGroup_ClusterModeUpdateNumNodeGroups_scaleDown (2592.11s)
--- PASS: TestAccElastiCacheCluster_Memcached_finalSnapshot (2.65s)
--- PASS: TestAccElastiCacheCluster_EngineVersion_memcached (982.77s)
--- PASS: TestAccElastiCacheReplicationGroup_deprecatedAvailabilityZones_multiAzNotInVPC (1700.19s)
--- PASS: TestAccElastiCacheReplicationGroupDataSource_Engine_Redis_LogDeliveryConfigurations (805.11s)
--- PASS: TestAccElastiCacheReplicationGroupDataSource_multiAZ (739.30s)
--- PASS: TestAccElastiCacheCluster_vpc (468.97s)
--- PASS: TestAccElastiCacheReplicationGroup_updateAuthToken (882.90s)
--- PASS: TestAccElastiCacheReplicationGroupDataSource_clusterMode (868.24s)
--- SKIP: TestAccElastiCacheCluster_SecurityGroup_ec2Classic (0.89s)
--- PASS: TestAccElastiCacheReplicationGroup_updateParameterGroup (960.78s)
--- PASS: TestAccElastiCacheCluster_tagWithOtherModification (571.48s)
--- PASS: TestAccElastiCacheCluster_tags (462.45s)
--- PASS: TestAccElastiCacheCluster_Redis_autoMinorVersionUpgrade (524.33s)
--- PASS: TestAccElastiCacheReplicationGroupDataSource_basic (830.99s)
--- PASS: TestAccElastiCacheCluster_Engine_None (2.39s)
--- PASS: TestAccElastiCacheCluster_Redis_finalSnapshot (689.39s)
--- PASS: TestAccElastiCacheCluster_ParameterGroupName_default (454.77s)
--- PASS: TestAccElastiCacheCluster_snapshotsWithUpdates (499.30s)
--- PASS: TestAccElastiCacheReplicationGroup_updateUserGroups (986.10s)
--- PASS: TestAccElastiCacheCluster_Engine_redis_v5 (474.33s)
--- PASS: TestAccElastiCacheCluster_port (454.71s)
--- PASS: TestAccElastiCacheCluster_Engine_memcached (445.07s)
--- PASS: TestAccElastiCacheGlobalReplicationGroup_disappears (849.55s)
--- PASS: TestAccElastiCacheCluster_PortRedis_default (505.53s)
--- PASS: TestAccElastiCacheCluster_Engine_Redis_LogDeliveryConfigurations (939.81s)
--- PASS: TestAccElastiCacheGlobalReplicationGroup_clusterMode (1215.09s)
--- PASS: TestAccElastiCacheCluster_EngineVersion_redis (2321.67s)
--- PASS: TestAccElastiCacheCluster_Engine_redis (455.36s)
--- PASS: TestAccElastiCacheCluster_NumCacheNodes_increase (675.10s)
--- PASS: TestAccElastiCacheGlobalReplicationGroup_basic (972.98s)
--- PASS: TestAccElastiCacheGlobalReplicationGroup_description (1122.36s)
--- PASS: TestAccElastiCacheCluster_NumCacheNodes_decrease (802.66s)
--- PASS: TestAccElastiCacheClusterDataSource_Engine_Redis_LogDeliveryConfigurations (734.72s)
--- PASS: TestAccElastiCacheReplicationGroup_updateNodeSize (1979.62s)
--- PASS: TestAccElastiCacheReplicationGroup_EngineVersion_update (4738.97s)
--- PASS: TestAccElastiCacheGlobalReplicationGroup_multipleSecondaries (3195.48s)
--- PASS: TestAccElastiCacheGlobalReplicationGroup_ReplaceSecondary_differentRegion (3384.08s)

@gdavison gdavison merged commit 3370515 into hashicorp:main Apr 23, 2022
@github-actions github-actions bot added this to the v4.12.0 milestone Apr 23, 2022
@bschaatsbergen bschaatsbergen deleted the fix-redit-versioning branch April 23, 2022 07:22
@github-actions
Copy link

This functionality has been released in v4.12.0 of the Terraform AWS Provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading.

For further feature requests or bug reports with this functionality, please create a new GitHub issue following the template. Thank you!

@manelpb
Copy link

manelpb commented May 2, 2022

@bschaatsbergen hey Bruno, much appreciated your contribution to the bug fix. I'm wondering if we could also add the same fix to the version 3.x of the provider? We really wanted to be able to use this feature, but upgrading all our modules to the version 4.x with a breaking change is a huge pain :(

@bschaatsbergen
Copy link
Member Author

Hey @manelpb, that's something I have to pull through the core team first. I can imagine it might not be their priority for now.

cc @gdavison

@jalaziz
Copy link
Contributor

jalaziz commented May 8, 2022

I just tried changing to explicit 6.0 and 6.2 values from 6.x and terraform is marking both situations as a full cluster replacement. Is that expected?

@bschaatsbergen
Copy link
Member Author

bschaatsbergen commented May 8, 2022

I would have to dive into this again to answer your question, perhaps @gdavison knows from the top of his head?

I believe it was something along the lines like:

  • Going from 6.x to to 6.2 or 6.0 is a downgrade, which triggers a replacement.
  • Going from 6.0 or 6.2 to 6.x should be a update in place.

@jalaziz
Copy link
Contributor

jalaziz commented May 9, 2022

@jalaziz that behaviour comes from the ElastiCache API itself.

From the top of my head:

  • Going from 6.x to to 6.2 or 6.0 is a downgrade, which triggers a replacement.
  • Going from 6.0 or 6.2 to 6.x should be a update in place.

@bschaatsbergen but engine_version_actual are 6.0.5 and 6.2.5, shouldn't Terraform recognize that the versions match and simply update state? AWS doesn't seem to even recognize 6.x after creation and, at least via console, shows 6.0 or 6.2 respectively.

The 6.x downgrade logic doesn't make much sense because then you could never upgrade major versions via terraform. For example, going from 6.x with engine_version_actual at 6.0.5 to 6.2 triggers a replacement. However, this is certainly possible via AWS API and console without replacement.

@Vassilis0
Copy link

I tried to explicitly set the engine_version to 6.2 (since it's optional we didn't set it initially), but it triggered a replacement.

~ engine_version         = "6.x" -> "6.2" # forces replacement
~ engine_version_actual  = "6.2.5" -> (known after apply)

Could this issue be solved by using terraform import ?

@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 19, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Introduces or discusses updates to documentation. service/elasticache Issues and PRs that pertain to the elasticache service. size/XL Managed by automation to categorize the size of a PR. tests PRs: expanded test coverage. Issues: expanded coverage, enhancements to test infrastructure.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AWS has changed Elasticache Redis versioning again
8 participants