PowerVS start and possibly stop actions failing when resource already in the desired state #4400

Closed
christopher-horn opened this issue Mar 13, 2023 · 3 comments · Fixed by #4403
Assignees
Labels
service/Power Systems Issues related to Power Systems

Comments

@christopher-horn

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform IBM Provider Version

Terraform v1.2.9
on linux_amd64

  • provider registry.terraform.io/community-terraform-providers/ignition v2.1.3
  • provider registry.terraform.io/hashicorp/null v3.2.1
  • provider registry.terraform.io/hashicorp/random v3.4.3
  • provider registry.terraform.io/ibm-cloud/ibm v1.49.0

Affected Resource(s)

  • ibm_pi_instance_start
  • ibm_pi_instance_stop

We have not seen this issue with ibm_pi_instance_stop. We mention it only because a similar problem may exist there, since the start case is not handled properly; when fixing the start handling it would be wise to check stop as well.

Terraform Configuration Files

We are using the PowerVS OpenShift pattern here for 4.12 deployments:

https://github.com/ocp-power-automation/ocp4-upi-powervs

Expected Behavior

If Terraform fails because the PowerVS service broker times out while performing a start (and possibly stop) action, and the apply is then resumed, Terraform should see that the VM is already in the desired state and not fail.

Or is this actually a PowerVS API issue and they should not treat it as an error and pass it back to Terraform? I suppose there is an argument to be made for that.
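To make the expected behavior concrete, here is a minimal sketch of a pre-check that skips the action when the instance already has the status the action would produce. It is not the provider's code: `targetStatus` and `needsAction` are hypothetical names, the "ACTIVE" status comes from the error message in this issue, and "SHUTOFF" as the stopped status and the "soft-reboot" action name are assumptions.

```go
package main

import "fmt"

// targetStatus maps an action to the instance status that means the action
// is already satisfied. "ACTIVE" is taken from the error in this issue;
// "SHUTOFF" is an assumed name for the stopped status.
var targetStatus = map[string]string{
	"start":              "ACTIVE",
	"stop":               "SHUTOFF",
	"immediate-shutdown": "SHUTOFF",
}

// needsAction reports whether the action still has to be sent to the API.
// Actions without a single target status (reboot, reset) are always sent.
func needsAction(action, currentStatus string) bool {
	want, ok := targetStatus[action]
	if !ok {
		return true
	}
	return currentStatus != want
}

func main() {
	fmt.Println(needsAction("start", "ACTIVE"))       // already running, skip
	fmt.Println(needsAction("start", "SHUTOFF"))      // must be started
	fmt.Println(needsAction("soft-reboot", "ACTIVE")) // assumed action name, always sent
}
```

With a check like this, a resumed apply would read the current status first and treat "already ACTIVE" as success instead of posting the start action again.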

Actual Behavior

Error: failed to perform Action on PVM Instance tz-206686-bootstrap :[POST /pcloud/v1/cloud-instances/{cloud_instance_id}/pvm-instances/{pvm_instance_id}/action][400] pcloudPvminstancesActionPostBadRequest  &{Code:0 Description:the Power virtual server instance 'tz-206686-bootstrap' is already in an ACTIVE state and cannot be started Error:bad request Message:}
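If the check has to happen after the API call instead, one option is to recognize this specific 400 response and treat it as success. The sketch below only matches the wording quoted in the error above; `alreadyInState` is a hypothetical helper, and matching on message text is fragile, so this is an illustration rather than a recommended fix.

```go
package main

import (
	"fmt"
	"strings"
)

// alreadyInState reports whether an API error description says the instance
// is already in the given status. It relies on the phrasing seen in this
// issue ("is already in an ACTIVE state"), which the API may change.
func alreadyInState(description, status string) bool {
	return strings.Contains(description, "is already in") &&
		strings.Contains(description, status+" state")
}

func main() {
	desc := "the Power virtual server instance 'tz-206686-bootstrap' " +
		"is already in an ACTIVE state and cannot be started"
	fmt.Println(alreadyInState(desc, "ACTIVE"))
	fmt.Println(alreadyInState(desc, "SHUTOFF"))
	fmt.Println(alreadyInState("bad request", "ACTIVE"))
}
```

Checking the instance status before sending the action avoids depending on the message format at all, which is probably the more robust route.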

Steps to Reproduce

Not easily reproducible, as you need Terraform to time out or fail at a specific point, so that it thinks the VM is still off when it has actually finished powering on. The error that led to me reporting this:

module.install.ibm_pi_instance_action.bootstrap_start[0]: Still creating... [14m40s elapsed]
module.install.ibm_pi_instance_action.bootstrap_start[0]: Still creating... [14m50s elapsed]

Error: context deadline exceeded

  with module.install.ibm_pi_instance_action.bootstrap_start[0],
  on modules/5_install/install.tf line 391, in resource "ibm_pi_instance_action" "bootstrap_start":
 391: resource "ibm_pi_instance_action" "bootstrap_start" {

We have seen this a few times recently.

@github-actions github-actions bot added the service/Power Systems Issues related to Power Systems label Mar 13, 2023
@christopher-horn
Author

christopher-horn commented Mar 13, 2023

Some more information on this. I updated the pattern to raise the timeout parameters on the ibm_pi_instance_action stanzas from the default 15 minutes to 30 minutes, thinking that would give it enough time. It now fails as follows:

module.install.ibm_pi_instance_action.bootstrap_start[0]: Still creating... [14m51s elapsed]
module.install.ibm_pi_instance_action.bootstrap_start[0]: Still creating... [15m1s elapsed]

Error: timeout while waiting for state to become 'ACTIVE, ERROR, ' (last state: 'PENDING', timeout: 15m0s)

  with module.install.ibm_pi_instance_action.bootstrap_start[0],
  on modules/5_install/install.tf line 391, in resource "ibm_pi_instance_action" "bootstrap_start":
 391: resource "ibm_pi_instance_action" "bootstrap_start" {

Every single provision we have attempted in lon06 today has failed now multiple times because of this.
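Note that the message above still reports `timeout: 15m0s` even though the stanza was raised to 30 minutes, so one plausible reading is that the wait was bounded by a different value than the configured one. The sketch below is not the provider's implementation; it is a generic polling loop, with hypothetical names `waitForStatus` and `demo`, showing that the deadline passed into the wait is the only thing that bounds it.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitForStatus polls getStatus until it returns one of the target statuses
// or the context deadline passes. The caller's timeout must be the one used
// to build ctx, otherwise raising it in configuration has no effect.
func waitForStatus(ctx context.Context, getStatus func() string,
	interval time.Duration, targets ...string) (string, error) {
	for {
		last := getStatus()
		for _, t := range targets {
			if last == t {
				return last, nil
			}
		}
		select {
		case <-ctx.Done():
			return last, fmt.Errorf("timeout waiting for %v (last state: %q): %w",
				targets, last, ctx.Err())
		case <-time.After(interval):
		}
	}
}

// demo runs the wait against a fake status source: PENDING twice, then ACTIVE.
func demo() (string, error) {
	states := []string{"PENDING", "PENDING", "ACTIVE"}
	i := 0
	get := func() string {
		s := states[i]
		if i < len(states)-1 {
			i++
		}
		return s
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return waitForStatus(ctx, get, 10*time.Millisecond, "ACTIVE", "ERROR")
}

func main() {
	fmt.Println(demo())
}
```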

@yussufsh
Collaborator

@christopher-horn Sure, we can handle this in the Terraform code, but yes, please check with the PowerVS team whether this can be implemented at the backend as well.
Also, I am thinking we should handle only a few actions, i.e. start and stop (plus immediate-shutdown), and not the reboot/reset actions, which would add more complexity.

@christopher-horn
Author

Thanks Yussuf! I agree it is just the simple actions that are a concern.
