[Bug]: After switching an RDS instance from gp2 to gp3, following apply will fail if IOPS is not specified #28271

Closed
etiennechabert opened this issue Dec 9, 2022 · 19 comments · Fixed by #28361
Labels
bug Addresses a defect in current functionality. service/rds Issues and PRs that pertain to the rds service.

@etiennechabert

etiennechabert commented Dec 9, 2022

Terraform Core Version

1.2.3

AWS Provider Version

4.46.0

Affected Resource(s)

  • aws_db_instance

Expected Behavior

If specifying iops is mandatory when choosing gp3:

  • You should not be able to apply the change without specifying it; a validator is missing

If IOPS is NOT mandatory when choosing gp3:

  • You should not be blocked during further apply, and your plan should not describe a change about iops

Actual Behavior

The first apply was successful, allowing me to switch an existing RDS instance from gp2 to gp3. But now my pipeline is blocked by the following plan and error:

  ~ resource "aws_db_instance" "this" {
        id                                    = "a_db"
      ~ iops                                  = 3000 -> 0
        name                                  = "a_db"

....

│ Error: updating RDS DB Instance (a_db): operation error RDS: ModifyDBInstance, https response error StatusCode: 400, RequestID: 61792e08-c93b-4834-a35c-eb029b29ba30, api error InvalidParameterCombination: You can't specify IOPS or storage throughput for engine postgres and a storage size less than 400.

I can work around the error by adding the optional iops parameter to my resource with a value of 3000, which removes the change from the plan and unblocks my pipeline:

  resource "aws_db_instance" "this" {
      id                                    = "a_db"
      iops                                  = 3000
      name                                  = "a_db"
        
...

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Relevant Error/Panic Output Snippet

No response

Terraform Configuration Files

resource "aws_db_instance" "default" {
  ...
  allocated_storage  = 20
  storage_type       = "gp2"
}

After the first apply (you have an RDS instance on gp2), change storage_type and apply twice. After the 1st apply you are on gp3, but the 2nd (and every following) apply will fail:

resource "aws_db_instance" "default" {
  ...
  allocated_storage  = 20
  storage_type       = "gp3"
}

Steps to Reproduce

  • Create an RDS instance (postgres might be important), with a gp2 volume of 20 GB
  • After first apply, switch it to gp3 without specifying the IOPS parameter
  • Try to apply again, you should now see an unexpected change:
    • 3000 → 0 IOPS
    • Your apply will fail with the following error:
    • api error InvalidParameterCombination: You can't specify IOPS or storage throughput for engine postgres and a storage size less than 400

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

@etiennechabert etiennechabert added bug Addresses a defect in current functionality. needs-triage Waiting for first response or review from a maintainer. labels Dec 9, 2022
@github-actions

github-actions bot commented Dec 9, 2022

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added the service/rds Issues and PRs that pertain to the rds service. label Dec 9, 2022
@ewbankkit ewbankkit removed the needs-triage Waiting for first response or review from a maintainer. label Dec 12, 2022
@ewbankkit
Contributor

@etiennechabert Thanks for raising this issue.
In your Terraform configuration do you have an explicit iops = 0 or are you omitting that attribute completely?

@etiennechabert
Author

@ewbankkit during the first apply, the one that succeeded initially, switching my instance from gp2 to gp3, my terraform configuration was not specifying anything for IOPS... but I realize now that the module we are using is setting iops to a default value of 0:

First apply

  • In my plan during this apply, there were no changes regarding iops described
    • That's probably because at this point my database is still on gp2, therefore the remote IOPS is equal to 0
  • The apply is successful, my database is now using a gp3 volume

Following applies

During the following applies, the ones that failed, my Terraform configuration was still not defining iops. But the default value in terraform-aws-modules/rds being 0, it is now probably clashing with the remote state of my database and causing the change:

  • My plan is now showing the following change iops = 3000 -> 0
  • The apply is failing with the error: You can't specify IOPS or storage throughput for engine postgres and a storage size less than 400.
    • In my opinion, this error is only loosely related to the actual issue.
  • My pipeline is blocked
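
A minimal sketch of how such a diff can arise, assuming a wrapper module that forwards an iops variable defaulting to 0 straight to aws_db_instance. The variable block and the resource values are illustrative, not copied from the module:

```hcl
# Hypothetical reduction of a wrapper module. Only the aws_db_instance
# argument names are real; everything else is illustrative.
variable "iops" {
  type    = number
  default = 0 # the default this comment attributes to the module in use
}

resource "aws_db_instance" "this" {
  identifier          = "a-db"
  engine              = "postgres"
  instance_class      = "db.t3.micro"
  allocated_storage   = 20
  storage_type        = "gp3"
  username            = "foo"
  password            = "foobarbaz"
  skip_final_snapshot = true

  # Once the volume is gp3, RDS reports 3000 IOPS for it, so a configured
  # value of 0 appears in the plan as `iops = 3000 -> 0`.
  iops = var.iops
}
```

While the instance is still on gp2 the remote value is 0 as well, which would explain why the first plan showed no iops change.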

Workaround

I am now forced to set iops = 3000 in my Terraform configuration to avoid the invalid change, unblocking my pipeline.

Proposal to avoid this

Should the provider do a max(3000, var.iops) when storage_type == 'gp3'?

My understanding is that a gp3 volume cannot have less than 3000 IOPS according to the documentation, so that would probably avoid this issue altogether.
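
Expressed at the configuration level rather than inside the provider, the proposed clamp might look like the sketch below. The variable and local names are hypothetical, and this only illustrates the idea. The error message quoted in this issue suggests RDS rejects any explicit IOPS value below 400 GB of storage, so a clamp alone may not be enough.

```hcl
# Hypothetical inputs, named after the aws_db_instance arguments they feed.
variable "storage_type" {
  type    = string
  default = "gp3"
}

variable "iops" {
  type    = number
  default = 0
}

locals {
  # Proposed clamp: never request fewer than 3000 IOPS on gp3.
  # Note that max() fails on null, so var.iops must be a number here.
  effective_iops = var.storage_type == "gp3" ? max(3000, var.iops) : var.iops
}

output "effective_iops" {
  value = local.effective_iops
}
```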

Open question

With that being said, I am no longer sure whether this problem should be reported to https://github.com/terraform-aws-modules/terraform-aws-rds or is actually about the AWS Provider.

@etiennechabert
Author

etiennechabert commented Dec 12, 2022

Actually, after reading the documentation of the module we are using one more time, and more specifically the part about storage_type, I think I can answer this question myself: https://github.com/terraform-aws-modules/terraform-aws-rds/blob/master/modules/db_instance/variables.tf#L24-L28

If you specify 'io1' or 'gp3' , you must also include a value for the 'iops' parameter...

Seems like by switching from gp2 to gp3, I was now using the module the wrong way by not specifying iops... and that my workaround is actually what is needed as documented.

Happy to close this issue if you say so.

@ewbankkit
Contributor

@etiennechabert Thanks for the response.
Let's leave the issue open a while and see if anyone else from the community has input.

@TomWizen

We are facing the same issue. When trying to configure a migration from gp2 to gp3 with the iops parameter, it fails with the above error. The only possible way to do it right now is to split it into two Terraform runs, which is not ideal.

@etiennechabert
Author

@TomWizen can you please share if you are using this module on top of the AWS Provider: https://github.com/terraform-aws-modules/terraform-aws-rds

@etiennechabert
Author

etiennechabert commented Dec 13, 2022

I started over again all my tests, this time directly using aws_db_instance.

In this message are my results for the creation tests; in the next message I will put my transition test results:

  • Create a mariadb with gp3 🟢
  • Create a mariadb with gp3 while setting IOPS to 3000 🟥
  • Create a mariadb with gp3 while setting IOPS to 3000 and storage_throughput to 125 🟥

Note: same results with postgres

Create a mariadb with gp3: 🟢

resource "aws_db_instance" "default_gp3" {
  identifier           = "default-gp3"
  allocated_storage    = 20
  db_name              = "direct_gp3"
  engine               = "mariadb"
  engine_version       = "10.3"
  instance_class       = "db.t3.micro"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mariadb10.3"
  skip_final_snapshot  = true
  apply_immediately    = true
  storage_type         = "gp3"
}

Just works

Create a mariadb with gp3 while setting IOPS to 3000 🟥

resource "aws_db_instance" "default_gp3" {
  identifier           = "default-gp3"
  allocated_storage    = 20
  db_name              = "direct_gp3"
  engine               = "mariadb"
  engine_version       = "10.3"
  instance_class       = "db.t3.micro"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mariadb10.3"
  skip_final_snapshot  = true
  apply_immediately    = true
  storage_type         = "gp3"
  iops                 = 3000
}

Error:

InvalidParameterCombination: You can't specify IOPS or storage throughput for engine mariadb and a storage size less than 400

Create a mariadb with gp3 while setting IOPS to 3000 and storage_throughput to 125 🟥

resource "aws_db_instance" "default_gp3" {
  identifier           = "default-gp3"
  allocated_storage    = 20
  db_name              = "direct_gp3"
  engine               = "mariadb"
  engine_version       = "10.3"
  instance_class       = "db.t3.micro"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mariadb10.3"
  skip_final_snapshot  = true
  apply_immediately    = true
  storage_type         = "gp3"
  iops                 = 3000
  storage_throughput   = 125
}

Error:

InvalidParameterCombination: You can't specify IOPS or storage throughput for engine mariadb and a storage size less than 400

I find this test interesting because, in the end, this is exactly the state of the database successfully created in my first test (default_gp3), but you cannot create a DB with exactly these parameters 🤔

Note

I know that the creation is not what was initially the topic of this issue, but I still find it interesting and probably connected to this issue

@etiennechabert
Author

etiennechabert commented Dec 13, 2022

Now about transitions:

  • Create a mariadb with gp2 and transition it to gp3 🟢
  • Create a mariadb with gp2 and transition it to gp3 and then change instance_class 🟢
  • Create a mariadb with gp2 and transition it to gp3 with IOPS to 3000 🟥

Create a mariadb with gp2 and transition it to gp3 🟢

Start with

resource "aws_db_instance" "default_gp2_to_gp3" {
  identifier           = "default-gp2-to-gp3"
  allocated_storage    = 20
  db_name              = "gp2_to_gp3"
  engine               = "mariadb"
  engine_version       = "10.3"
  instance_class       = "db.t3.micro"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mariadb10.3"
  skip_final_snapshot  = true
  apply_immediately    = true
  storage_type         = "gp2"
}

Then change

  storage_type         = "gp3"

Works

Create a mariadb with gp2 and transition it to gp3 and then change instance_class 🟢

Start with

resource "aws_db_instance" "default_gp2_to_gp3" {
  identifier           = "default-gp2-to-gp3"
  allocated_storage    = 20
  db_name              = "gp2_to_gp3"
  engine               = "mariadb"
  engine_version       = "10.3"
  instance_class       = "db.t3.micro"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mariadb10.3"
  skip_final_snapshot  = true
  apply_immediately    = true
  storage_type         = "gp2"
}

Then change

  storage_type         = "gp3"

Then change

  instance_class       = "db.t3.small"

Works. With this test I wanted to make sure that it was still possible to modify this DB while not providing iops/storage_throughput

Create a MariaDB with gp2 and transition it to gp3 with IOPS to 3000 🟥

Start with

resource "aws_db_instance" "default_gp2_to_gp3" {
  identifier           = "default-gp2-to-gp3"
  allocated_storage    = 20
  db_name              = "gp2_to_gp3"
  engine               = "mariadb"
  engine_version       = "10.3"
  instance_class       = "db.t3.micro"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = "default.mariadb10.3"
  skip_final_snapshot  = true
  apply_immediately    = true
  storage_type         = "gp2"
}

Then change

  storage_type         = "gp3"
  iops                 = 3000

Error:

InvalidParameterCombination: You can't specify IOPS or storage throughput for engine mariadb and a storage size less than 400

Note: same result with

  storage_type         = "gp3"
  iops                 = 3000
  storage_throughput   = 125

@TomWizen

> @TomWizen can you please share if you are using this module on top of the AWS Provider: https://github.com/terraform-aws-modules/terraform-aws-rds

Yes

@etiennechabert
Author

etiennechabert commented Dec 13, 2022

As a conclusion, my feeling is that there is a bug with the AWS Provider module:

  • It's possible to switch to gp3
  • You cannot create a database using gp3 while specifying iops/storage_throughput to the default values if your volume is smaller than 400GB
  • You cannot modify your volume to use storage_type = gp3 in combination with the parameters iops/storage_throughput if your volume is smaller than 400GB

This bug is turning out to be quite blocking for the users of https://github.com/terraform-aws-modules/terraform-aws-rds, as explained by @TomWizen, since you cannot reach a valid configuration without 2x PRs. This is because of the default values used by the module for the variables iops/storage_throughput

@LDVSOFT

LDVSOFT commented Dec 13, 2022

I've hit the same with the AWS RDS module because it has an explicit variable "iops" { default = 0 }. Specifying iops = null explicitly produces clean plans.
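
A minimal sketch of that workaround, assuming the module is consumed from the Terraform Registry as terraform-aws-modules/rds/aws. The identifier and sizes are borrowed from the tests earlier in this issue, and the remaining module inputs are left out:

```hcl
module "db" {
  source = "terraform-aws-modules/rds/aws"

  identifier        = "default-gp2-to-gp3"
  allocated_storage = 20
  storage_type      = "gp3"

  # An explicit null replaces the module's default of 0, so iops stays
  # unset on the underlying aws_db_instance and RDS picks its own value.
  iops = null

  # ... remaining inputs (engine, instance_class, credentials, ...) omitted
}
```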

@etiennechabert
Author

etiennechabert commented Dec 13, 2022

Agreed that this is a good workaround for the people using AWS RDS Terraform module ☝️

I just tested it and it's allowing:

  • A smooth transition from gp2 → gp3
  • A clean plan without any changes during the following applies

Good finding @LDVSOFT 👏

@etiennechabert
Author

etiennechabert commented Dec 14, 2022

Regarding the AWS RDS Terraform module, version 5.2.1 fixes the default value of IOPS:

https://github.com/terraform-aws-modules/terraform-aws-rds/releases/tag/v5.2.1

@ewbankkit
Contributor

@etiennechabert Thanks for the very thorough write-up of your failing scenarios 👏.
I think the most maintainable answer here is to add to the aws_db_instance resource documentation noting that iops and storage_throughput cannot be specified with certain combinations of engine and allocated_storage (and to link to relevant AWS documentation) so that practitioners are warned that they may have to remove these attributes' values from Terraform code. Trying to add further logic to the provider to deal with configured values that should be ignored (rather than computed values returned from the RDS API) will further complicate an already large resource.
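
In practice, "removing these attributes' values" could be done conditionally in configuration. The following is a sketch only: the 400 threshold is taken from the error messages quoted in this issue for postgres and mariadb, other engines may behave differently, and the variable and local names are hypothetical.

```hcl
variable "allocated_storage" {
  type    = number
  default = 20
}

variable "iops" {
  type    = number
  default = null
}

variable "storage_throughput" {
  type    = number
  default = null
}

locals {
  # Assumption: threshold copied from the InvalidParameterCombination
  # message in this issue, not from a complete per-engine rule set.
  perf_settings_allowed = var.allocated_storage >= 400
}

resource "aws_db_instance" "example" {
  identifier          = "example-gp3"
  engine              = "mariadb"
  instance_class      = "db.t3.micro"
  allocated_storage   = var.allocated_storage
  storage_type        = "gp3"
  username            = "foo"
  password            = "foobarbaz"
  skip_final_snapshot = true

  # Leave both attributes unset below the threshold so RDS applies its
  # own baseline instead of rejecting the request.
  iops               = local.perf_settings_allowed ? var.iops : null
  storage_throughput = local.perf_settings_allowed ? var.storage_throughput : null
}
```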

@cdl-danielchapman

@ewbankkit

Just pasting my message from another post:

We have an issue where, if we build an RDS instance at 100GiB specifying gp3 storage, the module dynamically sets IOPS to 3000 and throughput to 125MiB/s as expected.

However, if we then change allocated_storage to 400GiB or more and apply, the apply fails with the error below:

Error: updating RDS DB Instance (pg-prod-team-gp3): operation error RDS: ModifyDBInstance, https response error StatusCode: 400, RequestID: 970841cc-a412-4147-820c-700cb17e27fd, api error InvalidParameterCombination: Invalid iops value for engine name postgres and storage type gp3: 3000

If we build an RDS instance from a 100GiB snapshot but specify 400GiB in the terraform code at initial build it does the modification as expected and increases the storage/IOPS/throughput

But if we modify it after the instance has been built, it errors. Ideally, it would dynamically look up the values for IOPS/throughput: if they've been set in the code, use those, but if they're null or not set, use the defaults, i.e. 3K IOPS and 125 MiB/s below 400GiB, or 12K IOPS and 500 MiB/s at 400GiB and above.
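
The requested lookup could be approximated in configuration along these lines. This is a sketch under stated assumptions: the 12000 and 500 defaults are the figures quoted in this comment rather than values verified against AWS documentation, and the variable and local names are hypothetical.

```hcl
variable "allocated_storage" {
  type    = number
  default = 400
}

variable "iops" {
  type    = number
  default = null
}

variable "storage_throughput" {
  type    = number
  default = null
}

locals {
  large_volume = var.allocated_storage >= 400

  # At 400 GiB and above: use the configured value if present, otherwise
  # the default quoted in this comment. Below 400 GiB: stay null so RDS
  # applies its own baseline.
  gp3_iops               = local.large_volume ? coalesce(var.iops, 12000) : null
  gp3_storage_throughput = local.large_volume ? coalesce(var.storage_throughput, 500) : null
}

output "gp3_settings" {
  value = {
    iops               = local.gp3_iops
    storage_throughput = local.gp3_storage_throughput
  }
}
```

The two locals would then be passed to the iops and storage_throughput arguments of aws_db_instance.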

Do you know if this has been reported elsewhere?

@ewbankkit
Contributor

@cdl-danielchapman Please open a new GitHub Issue. Thanks.

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 15, 2023
5 participants