This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

Specifying multiple VM extensions in apimodel causes occasional deployment failures #2392

Closed
marosset opened this issue Dec 4, 2019 · 1 comment
Labels
bug Something isn't working

marosset commented Dec 4, 2019

Describe the bug
Specifying multiple VM extensions in an apimodel can cause random deployment failures.

We are seeing this frequently in the sig-windows k8s test passes with the following apimodel/deployment template:
https://github.com/kubernetes-sigs/windows-testing/blob/master/job-templates/kubernetes_release_1_17.json

The deployments fail with the following error:
W1203 14:18:21.338] 2019/12/03 14:18:21 main.go:319: Something went wrong: starting e2e cluster: error creating cluster: cannot deploy: cannot get the create deployment future response: Code="DeploymentFailed" Message="At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details." Details=[{"code":"Conflict","message":"{\r\n "status": "Failed",\r\n "error": {\r\n "code": "ResourceDeploymentFailure",\r\n "message": "The resource operation completed with terminal provisioning state 'Failed'.",\r\n "details": [\r\n {\r\n "code": "DeploymentFailed",\r\n "message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/DeployOperations for usage details.",\r\n "details": [\r\n {\r\n "code": "Conflict",\r\n "message": "{\r\n \"error\": {\r\n \"code\": \"Conflict\",\r\n \"message\": \"The request failed due to conflict with a concurrent request. \"\r\n }\r\n}"\r\n }\r\n ]\r\n }\r\n ]\r\n }\r\n}"}]

Example:
https://prow.k8s.io/view/gcs/kubernetes-jenkins/logs/ci-kubernetes-e2e-aks-engine-azure-1-17-windows/1201856259427405824

Expected behavior
Deployments do not fail

AKS Engine version
latest

Kubernetes version
any

Additional context

marosset added the bug label on Dec 4, 2019

marosset commented Dec 4, 2019

It looks like ARM is not happy with three different extensions (the billing extension, master_extension, and the gmsa-coredns extension) all having dependsOn relationships with the cse-master-* extension, and is throwing conflicts when processing the ARM template.

I tried creating a VM with multiple extensions through the Azure portal and noticed that the first extension had a dependsOn relationship to the VM, and each additional extension had a dependsOn relationship to the previous extension.
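
To make the portal's pattern concrete, here is a minimal Python sketch that builds ARM-style extension resource fragments in which each extension depends on the one before it, so the extensions are processed serially rather than concurrently. The function name `chain_extensions`, the VM name, and the extension names are illustrative placeholders, not what AKS Engine actually generates. The fragments omit required fields such as `apiVersion`, `location`, and `properties`, and whether this ordering resolves the conflict is still unconfirmed.

```python
import json

VM_TYPE = "Microsoft.Compute/virtualMachines"


def chain_extensions(vm_name, extension_names):
    """Build extension resource fragments where each depends on the previous one.

    The first extension depends on the VM itself; every later extension
    depends on the extension immediately before it in the list.
    """
    resources = []
    previous = f"{VM_TYPE}/{vm_name}"
    for ext in extension_names:
        resources.append(
            {
                "type": f"{VM_TYPE}/extensions",
                "name": f"{vm_name}/{ext}",
                "dependsOn": [previous],
            }
        )
        # The next extension in the list will wait on this one.
        previous = f"{VM_TYPE}/{vm_name}/extensions/{ext}"
    return resources


if __name__ == "__main__":
    # Hypothetical VM and extension names, for illustration only.
    chained = chain_extensions(
        "k8s-master-0", ["cse-master-0", "billing", "gmsa-coredns"]
    )
    print(json.dumps(chained, indent=2))
```

With this layout no two extensions share the same dependency, which is the difference from the current template, where several extensions all wait on cse-master-* and become eligible to run at the same time.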

I'm trying to get guidance from the ARM team and/or compute RP team to verify this is the cause.
