[Provider: google-cloud] deleting an attached disk should not be possible #2078
Comments
There are multiple ways to attach disks to instances, so I'm not sure exactly what state you are in right now. @jeff-knurek, can you upload a more complete configuration and debug logs so that we can help investigate?
here's an example of the disk provisioning with Terraform:
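(As a rough illustration only, with placeholder name, type, and zone rather than the reporter's original configuration, a standalone disk resource of this kind looks like the following.)

resource "google_compute_disk" "disk_test" {
  # Placeholder values for illustration only; not the original configuration.
  name = "disk-test"
  type = "pd-standard"
  size = 50
  zone = "europe-west1-b"
}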
I'd say the steps are pretty straightforward and don't need logs, but I'll detail out the steps anyhow:
1. Provision the disk with Terraform.
2. Put the disk in use, for example by mounting it from a Kubernetes workload (manifest below).
3. Comment out the disk from the Terraform config and run apply.
That's it. Step 3 should error out, but instead it deletes the disk with no error. The setup that has the disk in use is unaware of the disk removal, because this isn't expected behavior.

Example k8s manifest to put the disk in use:

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: my-test
  namespace: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-test
  template:
    metadata:
      labels:
        app: my-test
    spec:
      containers:
      - name: bbox
        image: busybox
        command: ['sh', '-c', 'echo The app is running! && sleep 3600']
        ports:
        - containerPort: 80
          name: http
        volumeMounts:
        - name: disk-test
          mountPath: /tmp/test
        resources:
          requests:
            memory: "50Mi"
            cpu: "50m"
      volumes:
      - name: disk-test
        persistentVolumeClaim:
          claimName: pv-test
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-test
  namespace: test
spec:
  storageClassName: ""
  capacity:
    storage: 50G
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: disk-test
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pv-test
  namespace: test
spec:
  storageClassName: ""
  volumeName: pv-test
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50G
It looks like you have Terraform managing the creation of the disk, but you are using a different tool to manage the attachment of the disk to an instance. In this case Terraform has no knowledge that the disk was supposed to be attached to an instance, so when you remove the disk from your config, if Terraform were the authoritative source of your configuration, it could be considered correct behavior to detach and delete the disk rather than fail. We are currently intentionally detaching disks from an instance during the delete process. So I'm not sure I would call this a bug, but given that the REST API fails to delete a disk that is in use, it's a bit of a gray area IMO. Finally, it would be a breaking change to do what you are suggesting, so I'll bring it up with the team and see where we fall.
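(For illustration only, and not the reporter's setup, since Kubernetes does the attaching there: one way to make Terraform aware of an attachment is to manage it explicitly with the provider's google_compute_attached_disk resource, so destroying the disk forces a detach through the dependency graph first. The names below are placeholders, and "disk_test" refers to a disk resource like the one sketched earlier.)

resource "google_compute_instance" "vm" {
  # Placeholder instance purely for illustration.
  name         = "test-vm"
  machine_type = "n1-standard-1"
  zone         = "europe-west1-b"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"
    }
  }

  network_interface {
    network = "default"
  }

  # When attachments are managed via google_compute_attached_disk, the provider
  # docs suggest ignoring attached_disk changes on the instance itself.
  lifecycle {
    ignore_changes = [attached_disk]
  }
}

resource "google_compute_attached_disk" "attachment" {
  disk     = google_compute_disk.disk_test.self_link
  instance = google_compute_instance.vm.self_link
}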
Also, if you want to let us know more details around why you want this behavior to happen, we can think about whether there are different or better solutions. In the meantime, I propose a few possible solutions:
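(Purely as an illustrative sketch, and not necessarily one of the solutions referred to above: one guard that exists in Terraform today is the prevent_destroy lifecycle flag. Note it only protects the disk while its resource block remains in the configuration, e.g. against terraform destroy or forced replacement; it does not help once the block is commented out, as in the steps above.)

resource "google_compute_disk" "disk_test" {
  # Placeholder values for illustration.
  name = "disk-test"
  type = "pd-standard"
  size = 50
  zone = "europe-west1-b"

  lifecycle {
    # Any plan that would destroy this resource (while this block is still in
    # the config) fails with an error instead of applying.
    prevent_destroy = true
  }
}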
That makes sense, and I can see how this falls into a gray area. From our perspective, it's not possible to use Terraform for all our resource management. The google and kubernetes providers have come a long way in the past year, but functionality is still lacking in places. And since disk allocation is managed by the Kubernetes API specifically, I'm not clear on how Terraform would be able to know about the state of an attached disk in this use case. We try to use Terraform as a replacement for the gcloud CLI tool in all places possible, and so this is why we expect Terraform to respect the API failure here.
I think there's an argument for letting Kubernetes manage the disk attachments. That's a really gray area, but there's a certain level of pragmatism I think we have to apply in that case. However, I'm not 100% clear on how it's relevant. I think the more important question is: why are you removing the disks from your config when they're in use? What does the error enable you to do that you can't do today?
@paddycarver I've been thinking about this a bit more and right now if a disk is attached to a To make sure we aren't orphaning disks we still need to delete them when attached with
b/262494546
I've looked into this issue and found that the request is about attached disks: disks which are attached shouldn't be deleted, and an error should be thrown instead, similar to the error thrown via the gcloud CLI in the above comment.
Now, in this case, if I comment out the disk, an error will be thrown stating
Terraform detaching disks for users made a lot of sense when the disk resource was initially implemented, but fine-grained resources like the attachment resource may make this less necessary now. We could consider guarding the behaviour behind a deletion policy field. Not sure if that works for all use cases, though; we'd want to investigate this substantially before changing anything.
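(A purely hypothetical sketch of what such a guard could look like on the disk resource; the argument name below is invented for illustration and is not an existing google_compute_disk argument.)

resource "google_compute_disk" "disk_test" {
  name = "disk-test"
  type = "pd-standard"
  size = 50
  zone = "europe-west1-b"

  # Hypothetical argument, shown commented out: with a value like this, a
  # destroy would fail (or skip deletion) while the disk is still attached,
  # instead of silently detaching and deleting it.
  # deletion_policy = "FAIL_IF_ATTACHED"
}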
Note that if the behavior were guarded behind a deletion policy, with the current behavior as the default, that would not be considered a breaking change. For more information:
Closing the issue with the response from the service team: The API does not allow deletion of a disk if it is attached to an instance. There is no force-delete option. The disk has to be detached first. It is recommended to manage all resources using Terraform.
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
This issue was originally opened by @jeff-knurek as hashicorp/terraform#18898. It was migrated here as a result of the provider split. The original body of the issue is below.

Affected Resource(s)
There seem to have been enough issues in the past regarding attached disks and the inability to properly manage desired state:
hashicorp/terraform#7792
hashicorp/terraform#12760
hashicorp/terraform#12398
But if I run:
gcloud compute disks delete DISKNAME
I get the following error from Google:
However, if I comment out the disk configuration in terraform and run apply:

Terraform will perform the following actions:
- module.-.module.-.google_compute_disk.disk
....
Apply complete! Resources: 0 added, 0 changed, 1 destroyed.
This then leaves everything that was attached to it in a strange state of not being able to connect.
My expected behaviour is to see the same error message that the gcloud command outputs.

b/305377667