Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Optimized deployment resolution runtime to reduce unnecessary wait time #2461

Merged
merged 53 commits into from
Jul 22, 2024

Conversation

AlexanderSehr
Copy link
Contributor

@AlexanderSehr AlexanderSehr commented Jun 16, 2024

Description

  • Changed the way the removal logic tries to resolve deployment names to their actual resources
  • Up until now the logic was as follows
    • One deployment name after another (up to 3 as per 3 retries) where passed into the script
    • The script would then, for each deploymentname respectively search for the deployment
    • If it didn't find it, it would retry after one minute up to 40 times
  • This meant, for 3 failed deployments that fail in a way that they don't create a deployment (e.g. in case the template failed already the initial validation, it would wait 3 x 40 minutes = 120 minutes
  • This, updated approach works as follows
    • All 3 deployment names are passed into the script
    • For each round of trying, it searches for all 3 deployment names
    • If at least one isn't found, it will try again, as before after one minute to a maximum of 40 times
    • Each found name will be skript by the retries
  • This way, the script in its entirety only works 40 minutes in total and as such will make sure it will definitely wait a maximum of 40 minutes for each deployment to show up

Note: To test a failing deployment, I used a long-running deployment script which I would maniupulate in Azure during deployment, and also delete one of the retries before the removal would start. This is why in the log you can see that the removal logic finds attempts 2 & 3 but not 1, which is what was intended.

// In dependencies.bicep of WAF-aligned test
resource deploymentScript 'Microsoft.Resources/deploymentScripts@2023-08-01' = {
 name: 'testScript'
 location: location
 identity: {
   type: 'UserAssigned'
   userAssignedIdentities: {
     '${managedIdentity.id}': {}
   }
 }
 kind: 'AzurePowerShell'
 properties: {
   azPowerShellVersion: '9.7'
   retentionInterval: 'P1D'
   scriptContent: 'Start-Sleep 180'
 }
}

Pipeline Reference

Pipeline
avm.res.network.virtual-network (failed deployment test)
avm.res.key-vault.vault (successful deployment test)

Test run with a faulty module (VNET): https://github.com/Azure/bicep-registry-modules/actions/runs/9613612546
Test run with a working module (KeyVault): https://github.com/Azure/bicep-registry-modules/actions/runs/9536144913

Type of Change

  • Update to CI Environment or utilities (Non-module affecting changes)
  • Azure Verified Module updates:
    • Bugfix containing backwards-compatible bug fixes, and I have NOT bumped the MAJOR or MINOR version in version.json:
      • Someone has opened a bug report issue, and I have included "Closes #{bug_report_issue_number}" in the PR description.
      • The bug was found by the module author, and no one has opened an issue to report it yet.
    • Feature update backwards compatible feature updates, and I have bumped the MINOR version in version.json.
    • Breaking changes and I have bumped the MAJOR version in version.json.
    • Update to documentation

Example log: Successful deployments run

  VERBOSE: Invoke task with
  VERBOSE: {
    "SubscriptionId": "***",
    "ManagementGroupId": "***",
    "TemplateFilePath": "/home/runner/work/bicep-registry-modules/bicep-registry-modules/avm/res/key-vault/vault/tests/e2e/waf-aligned/main.test.bicep",
    "DeploymentNames": "a-r-kv-v-waf-aligned-t1-20240616T1206450259Z"
  }
  
  VERBOSE: Handling resource removal with deployment names
  VERBOSE: - a-r-kv-v-waf-aligned-t1-20240616T1206450259Z
  VERBOSE: Total number of deployment target resources after fetching deployments [18]
  VERBOSE: Total number of deployment target resources after pre-filtering (duplicates) & ordering items [18]
  VERBOSE: Total number of deployment target resources after formatting items [18]
  VERBOSE: Total number of deployments after filtering all dependency resources [18]
  VERBOSE: Total number of deployments after final ordering of resources [18]
  VERBOSE: - Remove [/subscriptions/***/resourceGroups/dep-***-keyvault.vaults-kvvwaf-rg/providers/Microsoft.KeyVault/vaults/***kvvwaf002/providers/Microsoft.Authorization/locks/myCustomLockName]

Example log: Failed deployments run

  VERBOSE: Invoke task with
  VERBOSE: {
    "ManagementGroupId": "***",
    "SubscriptionId": "***",
    "DeploymentNames": [
      "a-r-n-vn-waf-aligned-t1-20240621T1206370696Z",
      "a-r-n-vn-waf-aligned-t2-20240621T1206433769Z",
      "a-r-n-vn-waf-aligned-t3-20240621T1206013450Z"
    ],
    "TemplateFilePath": "/home/runner/work/bicep-registry-modules/bicep-registry-modules/avm/res/network/virtual-network/tests/e2e/waf-aligned/main.test.bicep"
  }
  
  VERBOSE: Handling resource removal with deployment names
  VERBOSE: - a-r-n-vn-waf-aligned-t1-20240621T1206370696Z
  VERBOSE: - a-r-n-vn-waf-aligned-t2-20240621T1206433769Z
  VERBOSE: - a-r-n-vn-waf-aligned-t3-20240621T1206013450Z
  VERBOSE: Found & resolved deployment [a-r-n-vn-waf-aligned-t2-20240621T1206433769Z]
  VERBOSE: Found & resolved deployment [a-r-n-vn-waf-aligned-t3-20240621T1206013450Z]
  VERBOSE: No deployment found by name(s) [@{Name=a-r-n-vn-waf-aligned-t1-20240621T1206370696Z}] in scope [subscription]. Retrying in [60] seconds [1/40]
  VERBOSE: No deployment found by name(s) [@{Name=a-r-n-vn-waf-aligned-t1-20240621T1206370696Z}] in scope [subscription]. Retrying in [60] seconds [2/40]

@AlexanderSehr AlexanderSehr added Type: AVM 🅰️ ✌️ Ⓜ️ This is an AVM related issue Needs: Core Team 🧞 This item needs the AVM Core Team to review it Type: CI 🚀 This issue is related to the AVM CI labels Jun 16, 2024
@AlexanderSehr AlexanderSehr self-assigned this Jun 16, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs: Triage 🔍 Maintainers need to triage still label Jun 16, 2024
@AlexanderSehr AlexanderSehr removed the Needs: Triage 🔍 Maintainers need to triage still label Jun 21, 2024
@AlexanderSehr AlexanderSehr marked this pull request as ready for review June 21, 2024 13:36
@AlexanderSehr AlexanderSehr requested a review from a team as a code owner June 21, 2024 13:36
@AlexanderSehr AlexanderSehr enabled auto-merge (squash) June 21, 2024 13:36
@AlexanderSehr AlexanderSehr merged commit b2e88e7 into main Jul 22, 2024
6 of 7 checks passed
@AlexanderSehr AlexanderSehr deleted the users/alsehr/deploymentWaitOptimization branch July 22, 2024 08:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs: Core Team 🧞 This item needs the AVM Core Team to review it Type: AVM 🅰️ ✌️ Ⓜ️ This is an AVM related issue Type: CI 🚀 This issue is related to the AVM CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants