
Disk downsize triggers recreate #8597

Closed
przsab opened this issue Mar 3, 2021 · 11 comments

przsab commented Mar 3, 2021

The Terraform provider recreates google_compute_disk when attempting to downsize it, whereas the regular API throws the expected error.

When somebody misses a 0 in the new disk size, all data is gone.

How API works:

$ gcloud compute disks describe disk01 --format="csv(sizeGb)"
60

$ gcloud compute disks resize disk01 --size=50g
ERROR: (gcloud.compute.disks.resize) Could not fetch resource:
 - Invalid value for field 'sizeGb': '50'. New disk size '50' GiB must be larger than existing size '60' GiB.

How Terraform works:

  # module.gcp-oracle.google_compute_disk.disks["disk01"] must be replaced
-/+ resource "google_compute_disk" "disks" {
      ...
      ~ size                      = 60 -> 50 # forces replacement
      ...
    }
@edwardmedia edwardmedia self-assigned this Mar 3, 2021
@edwardmedia edwardmedia added the bug label Mar 3, 2021
Contributor

edwardmedia commented Mar 3, 2021

@przsab Can you share the Terraform config you are using and the steps to reproduce the error?

Author

przsab commented Mar 4, 2021

Absolutely. The same results are reproduced on v0.12.29, v0.13.4, and v0.14.7 using the latest plugin, 3.58.0.

$ nl disk.tf
     1  resource "google_compute_disk" "disk01" {
     2      name = "disk01"
     3      size = 10
     4  }

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "google" (hashicorp/google) 3.58.0...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.google: version = "~> 3.58"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

$ terraform version
Terraform v0.12.29
Terraform v0.13.4 <- same result
Terraform v0.14.7 <- same result
+ provider.google v3.58.0

Your version of Terraform is out of date! The latest version
is 0.14.7. You can update by downloading from https://www.terraform.io/downloads.html


$ terraform apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # google_compute_disk.disk01 will be created
  + resource "google_compute_disk" "disk01" {
      + creation_timestamp        = (known after apply)
      + id                        = (known after apply)
      + label_fingerprint         = (known after apply)
      + last_attach_timestamp     = (known after apply)
      + last_detach_timestamp     = (known after apply)
      + name                      = "disk01"
      + physical_block_size_bytes = (known after apply)
      + project                   = (known after apply)
      + self_link                 = (known after apply)
      + size                      = 10
      + source_image_id           = (known after apply)
      + source_snapshot_id        = (known after apply)
      + type                      = "pd-standard"
      + users                     = (known after apply)
      + zone                      = (known after apply)
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_compute_disk.disk01: Creating...
google_compute_disk.disk01: Creation complete after 7s [id=projects/some-project-1234/zones/us-central1-a/disks/disk01]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

The disk is then downsized from 10 to 5:

$ nl disk.tf
     1  resource "google_compute_disk" "disk01" {
     2      name = "disk01"
     3      size = 5
     4  }

$ terraform apply
google_compute_disk.disk01: Refreshing state... [id=projects/some-project-1234/zones/us-central1-a/disks/disk01]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # google_compute_disk.disk01 must be replaced
-/+ resource "google_compute_disk" "disk01" {
      ~ creation_timestamp        = "2021-03-03T22:58:34.534-08:00" -> (known after apply)
      ~ id                        = "projects/some-project-1234/zones/us-central1-a/disks/disk01" -> (known after apply)
      ~ label_fingerprint         = "42WmSpB8rSM=" -> (known after apply)
      - labels                    = {} -> null
      + last_attach_timestamp     = (known after apply)
      + last_detach_timestamp     = (known after apply)
        name                      = "disk01"
      ~ physical_block_size_bytes = 4096 -> (known after apply)
      ~ project                   = "some-project-1234" -> (known after apply)
      ~ self_link                 = "https://www.googleapis.com/compute/v1/projects/some-project-1234/zones/us-central1-a/disks/disk01" -> (known after apply)
      ~ size                      = 10 -> 5 # forces replacement
      + source_image_id           = (known after apply)
      + source_snapshot_id        = (known after apply)
        type                      = "pd-standard"
      ~ users                     = [] -> (known after apply)
      ~ zone                      = "us-central1-a" -> (known after apply)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

google_compute_disk.disk01: Destroying... [id=projects/some-project-1234/zones/us-central1-a/disks/disk01]
google_compute_disk.disk01: Destruction complete after 6s
google_compute_disk.disk01: Creating...
google_compute_disk.disk01: Creation complete after 5s [id=projects/some-project-1234/zones/us-central1-a/disks/disk01]

Apply complete! Resources: 1 added, 0 changed, 1 destroyed.

The same attempt with gcloud yields the expected error:

$ gcloud compute disks describe disk01 --format="csv(sizeGb)"
size_gb
5

$ gcloud compute disks resize disk01 --size=4

This command increases disk size. This change is not reversible.
For more information, see:
https://cloud.google.com/sdk/gcloud/reference/compute/disks/resize

Do you want to continue (Y/n)?  y

ERROR: (gcloud.compute.disks.resize) Could not fetch resource:
 - Invalid value for field 'sizeGb': '4'. New disk size '4' GiB must be larger than existing size '5' GiB.

@ghost ghost removed waiting-response labels Mar 4, 2021
Author

przsab commented Mar 4, 2021

And from the competition (the AWS provider):

Terraform will perform the following actions:

  # aws_ebs_volume.disk1 will be updated in-place
  ~ resource "aws_ebs_volume" "disk1" {
        id                   = "vol-0995144fb9d83d909"
      ~ size                 = 10 -> 5
        tags                 = {}
        # (7 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

aws_ebs_volume.disk1: Modifying... [id=vol-0995144fb9d83d909]

Error: InvalidParameterValue: New size cannot be smaller than existing size
        status code: 400, request id: 12afc30f-c50f-4839-94d6-c5715514d561

  on disk.tf line 1, in resource "aws_ebs_volume" "disk1":
   1: resource "aws_ebs_volume" "disk1" {

Contributor

edwardmedia commented Mar 4, 2021

@przsab below is the plan I got in my test, and it seems fine (note: it does not say recreate on the disk). I also see the resize call in the log, without the disk being torn down. There is also a specific test for this in the provider.

POST /compute/v1/projects/myproject/zones/us-central1-a/disks/tf-test-96dybihpfv/resize?alt=json HTTP/1.1
Host: compute.googleapis.com
{
 "sizeGb": 100
}
  # google_compute_disk.disk01 will be updated in-place
  ~ resource "google_compute_disk" "disk01" {
        creation_timestamp        = "2021-03-04T08:42:04.659-08:00"
        id                        = "projects/myproject/zones/us-central1-a/disks/issue8597"
        label_fingerprint         = "42WmSpB8rSM="
        labels                    = {}
        name                      = "issue8597"
        physical_block_size_bytes = 4096
        project                   = "myproject"
        self_link                 = "https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-a/disks/issue8597"
      ~ size                      = 5 -> 10
        type                      = "pd-standard"
        users                     = []
        zone                      = "us-central1-a"
    }

Contributor

edwardmedia commented Mar 4, 2021

@przsab downsizing a disk is different, and it does trigger recreation. This works as intended, since GCP only allows upsizing through the API. Below is what I got. If you want to prevent the destroy, you may add lifecycle.prevent_destroy to the config.

  # google_compute_disk.disk01 must be replaced
-/+ resource "google_compute_disk" "disk01" {
      ~ creation_timestamp        = "2021-03-04T10:18:30.628-08:00" -> (known after apply)
      ~ id                        = "projects/myproject/zones/us-central1-a/disks/issue8597" -> (known after apply)
      ~ label_fingerprint         = "42WmSpB8rSM=" -> (known after apply)
      - labels                    = {} -> null
      + last_attach_timestamp     = (known after apply)
      + last_detach_timestamp     = (known after apply)
        name                      = "issue8597"
      ~ physical_block_size_bytes = 4096 -> (known after apply)
      ~ project                   = "myproject" -> (known after apply)
      ~ self_link                 = "https://www.googleapis.com/compute/v1/projects/myproject/zones/us-central1-a/disks/issue8597" -> (known after apply)
      ~ size                      = 10 -> 5 # forces replacement
      + source_image_id           = (known after apply)
      + source_snapshot_id        = (known after apply)
        type                      = "pd-standard"
      ~ users                     = [] -> (known after apply)
      ~ zone                      = "us-central1-a" -> (known after apply)
    }
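For reference, the prevent_destroy suggestion looks like this when applied to the repro config (an illustrative sketch, not part of the original comment):

```hcl
resource "google_compute_disk" "disk01" {
  name = "disk01"
  size = 10

  lifecycle {
    # Makes terraform error out on any plan that would destroy
    # (and therefore replace) this disk.
    prevent_destroy = true
  }
}
```

With this set, a downsize still plans a replacement, but the plan fails with an error instead of destroying the disk.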

Contributor

edwardmedia commented Mar 5, 2021

@przsab I have updated the resource's documentation to reflect this behavior. It will show up here after the next version of the provider is released.

Author

przsab commented Mar 5, 2021

This is not the correct behavior: downsizing a disk should cause an error, not replace the disk with a smaller, empty one.

prevent_destroy is a static feature and can't be parameterized, so it breaks any attempt to automate via pipelines (no, we can't consider sed or awk a solution to this).

Maybe disk replacement is useful for hello-world examples where you serve a welcome page from an nginx container, but for anything serious Terraform should follow the API behavior and error out, just like the AWS provider does.

@ghost ghost removed the waiting-response label Mar 5, 2021
Collaborator

rileykarson commented Mar 5, 2021

Hey @przsab! Terraform is a higher-level tool than something like gcloud, and has different behaviour here. Instead of managing requests against resources, it operates against the state of the resources, with the requests it makes being an implementation detail. So instead of declaring the operation you want it to perform, like "resize my disk to XX GB", and having it return whether there was an error, you specify a desired end state for your resource and Terraform brings it there (admittedly, to the best of its abilities; it is by no means a perfect abstraction), returning an error only if it was unable to.

So, in contrast with gcloud or the API, when specifying a disk's size you're stating an end state, e.g. "my disk should have size XX GB". From there, for the google_compute_disk resource, Terraform is able to recognise that an underlying operation (resizing down) is impossible, and it suggests recreating the disk. At the plan stage, a user is able to see that resizing the disk down will recreate it, and can choose not to proceed. Additionally, as @edwardmedia raised, lifecycle.prevent_destroy can prevent it from being destroyed unless the element is removed from config. It's frustrating that it's not configurable through interpolation, I'll admit.

Our team definitely feels that destroying resources like these should be harder than it is, too. See hashicorp/terraform#24658.

Author

przsab commented Mar 8, 2021

Is the suggestion here to modify CD pipelines to scan terraform plan output for disk destroy operations in order to prevent the recreate? Could you provide a set of regular expressions for each Terraform release to implement such protection (POSIX or PCRE, 0.12 to 0.15)?
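(An aside, not a suggestion from the original thread: a pipeline doesn't strictly need regexes for this. Since Terraform 0.12, `terraform show -json <planfile>` emits a machine-readable plan whose `resource_changes` entries list the planned actions, so a gate can reject any plan containing a delete action. A minimal sketch in Python, run against an inline sample of that JSON:)

```python
import json

def find_destroys(plan_json: str) -> list[str]:
    """Return addresses of resources the plan would destroy or replace."""
    plan = json.loads(plan_json)
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        # A replacement shows up as ["delete", "create"] or
        # ["create", "delete"]; a plain destroy as ["delete"].
        if "delete" in rc["change"]["actions"]
    ]

# Inline sample shaped like `terraform show -json tfplan` output.
sample = json.dumps({
    "resource_changes": [
        {"address": "google_compute_disk.disk01",
         "change": {"actions": ["delete", "create"]}},
        {"address": "google_compute_disk.disk02",
         "change": {"actions": ["update"]}},
    ]
})

print(find_destroys(sample))  # prints ['google_compute_disk.disk01']
```

A CI job could run this over the real plan file and fail the build when the returned list is non-empty.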

I understand you are fully aware of the unaddressed issues below:
hashicorp/terraform#22544
hashicorp/terraform#17599

Is there anything else besides backwards compatibility stopping you from implementing this change? What if the solution relied on a new lifecycle rule, e.g. prevent_downsize = true? That way you could retain compatibility and I wouldn't lose my data.
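(Sketched as config, the proposal above would look something like this; prevent_downsize is hypothetical syntax from this comment and does not exist in Terraform:)

```hcl
resource "google_compute_disk" "disk01" {
  name = "disk01"
  size = 10

  lifecycle {
    # Hypothetical rule: fail the plan with an error, instead of
    # replacing the disk, when the new size is smaller than the
    # current one.
    prevent_downsize = true
  }
}
```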

@ghost ghost removed waiting-response labels Mar 8, 2021
@edwardmedia
Contributor

@przsab as we have explained, this behavior is by design. It is not a bug. You may file an enhancement request instead so it will be triaged. I am closing this issue now.


ghost commented Apr 8, 2021

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 hashibot-feedback@hashicorp.com. Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Apr 8, 2021