🪲 Policy definitions should be specified only at or above the policy set definition's scope. #179

Closed
autocloudarc opened this issue Mar 9, 2022 · 9 comments

autocloudarc commented Mar 9, 2022

Describe the bug

The deployment fails with an error that lists a set of policy definitions as invalid.

To Reproduce

Steps to reproduce the behaviour:

  1. Run the pipeline with the following env: variable values:
env:
  ManagementGroupPrefix: "alz07"
  TopLevelManagementGroupDisplayName: "Azure Landing Zones"
  Location: "centralus"
  LoggingSubId: "redacted"
  LoggingResourceGroupName: "alz-logging-rgp-01"
  HubNetworkSubId: "redacted"
  HubNetworkResourceGroupName: "alz-network-hub-rgp-01"
  RoleAssignmentManagementGroupId: "alz07"
  SpokeNetworkSubId: "redacted"
  SpokeNetworkResourceGroupName: "spoke-networking-rgp-01"
  runNumber: ${{ github.run_number }}

Expected behaviour

Pipeline runs without error and deploys the platform and landing zones.

Screenshots 📷

image

image

Correlation ID

A correlation ID really helps us investigate your issue further. Please provide one if possible. Details on how to find a correlation ID can be found here: Correlation ID and support

3c8058dc-ca85-4731-92f3-fbc76f96d8f3

Additional context

{
    "status": "Failed",
    "error": {
        "code": "InvalidCreatePolicySetDefinitionRequest",
        "message": "The policy set definition 'Deploy-Diagnostics-LogAnalytics' request is invalid. Policy definitions should be specified only at or above the policy set definition's scope. The following policy definitions are invalid: 'Deploy-Diagnostics-ACI,Deploy-Diagnostics-ACR,Deploy-Diagnostics-AnalysisService,Deploy-Diagnostics-ApiForFHIR,Deploy-Diagnostics-APIMgmt,Deploy-Diagnostics-ApplicationGateway,Deploy-Diagnostics-WebServerFarm,Deploy-Diagnostics-Website,Deploy-Diagnostics-AA,Deploy-Diagnostics-CDNEndpoints,Deploy-Diagnostics-CognitiveServices,Deploy-Diagnostics-CosmosDB,Deploy-Diagnostics-Databricks,Deploy-Diagnostics-DataExplorerCluster,Deploy-Diagnostics-DataFactory,Deploy-Diagnostics-DLAnalytics,Deploy-Diagnostics-EventGridSub,Deploy-Diagnostics-EventGridTopic,Deploy-Diagnostics-EventGridSystemTopic,Deploy-Diagnostics-ExpressRoute,Deploy-Diagnostics-Firewall,Deploy-Diagnostics-FrontDoor,Deploy-Diagnostics-Function,Deploy-Diagnostics-HDInsight,Deploy-Diagnostics-iotHub,Deploy-Diagnostics-LoadBalancer,Deploy-Diagnostics-LogicAppsISE,Deploy-Diagnostics-MariaDB,Deploy-Diagnostics-MediaService,Deploy-Diagnostics-MlWorkspace,Deploy-Diagnostics-MySQL,Deploy-Diagnostics-NIC,Deploy-Diagnostics-NetworkSecurityGroups,Deploy-Diagnostics-PostgreSQL,Deploy-Diagnostics-PowerBIEmbedded,Deploy-Diagnostics-RedisCache,Deploy-Diagnostics-Relay,Deploy-Diagnostics-SignalR,Deploy-Diagnostics-SQLElasticPools,Deploy-Diagnostics-SQLMI,Deploy-Diagnostics-TimeSeriesInsights,Deploy-Diagnostics-TrafficManager,Deploy-Diagnostics-VM,Deploy-Diagnostics-VirtualNetwork,Deploy-Diagnostics-VMSS,Deploy-Diagnostics-VNetGW,Deploy-Diagnostics-WVDAppGroup,Deploy-Diagnostics-WVDHostPools,Deploy-Diagnostics-WVDWorkspace'."
    }
}
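
For anyone triaging a similar failure, the list of rejected definitions can be pulled out of the error payload programmatically rather than read by eye. The following is a minimal sketch, assuming the payload has the same shape as the one above; the function name `invalid_policy_definitions` is hypothetical, and the sample message is a shortened copy of the real one with only three names.

```python
import json
import re
from typing import List


def invalid_policy_definitions(error_json: str) -> List[str]:
    """Return the policy definition names listed as invalid in the error payload."""
    message = json.loads(error_json)["error"]["message"]
    # The names appear as one comma-separated, single-quoted list after this phrase.
    match = re.search(r"The following policy definitions are invalid: '([^']*)'", message)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]


# Shortened sample payload (assumption: same structure as the real error above).
sample = json.dumps({
    "status": "Failed",
    "error": {
        "code": "InvalidCreatePolicySetDefinitionRequest",
        "message": (
            "The policy set definition 'Deploy-Diagnostics-LogAnalytics' request is invalid. "
            "Policy definitions should be specified only at or above the policy set "
            "definition's scope. The following policy definitions are invalid: "
            "'Deploy-Diagnostics-ACI,Deploy-Diagnostics-ACR,Deploy-Diagnostics-VM'."
        ),
    },
})

names = invalid_policy_definitions(sample)
print(len(names), names)
```

When every definition referenced by the policy set shows up in this list, as it does in the full error above, the problem is more likely to be the scope being targeted than any individual definition.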
@ghost added the "Needs: Triage 🔍" label on Mar 9, 2022
@jtracey93
Collaborator

@autocloudarc This looks like a race condition that we sometimes see across all deployment experiences.

Did you customise the policy definitions module at all?

The fix/workaround

Give it 10 minutes and run again and it should be fine.

This happens when ARM has not finished replicating in the regions you are deploying to. The ARM nodes processing the request haven't caught up yet, so they think the policy definitions don't exist, even though they do.

Running again fixes this because the gap between runs gives replication time to catch up.
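
The wait-and-rerun workaround can be sketched as a small retry helper. This is only an illustration under assumptions: `deploy` is a hypothetical callable that triggers the deployment and returns True on success, and the 600-second default simply mirrors the "10 minutes" suggested above. The `sleep` argument is injectable so the demo does not actually wait.

```python
import time
from typing import Callable, List


def run_with_retry(
    deploy: Callable[[], bool],
    attempts: int = 3,
    wait_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call deploy() up to `attempts` times, pausing between failed attempts."""
    for attempt in range(1, attempts + 1):
        if deploy():
            return True
        if attempt < attempts:
            # Give replication time to catch up before trying again.
            sleep(wait_seconds)
    return False


# Demo with a fake deployment that succeeds on the third call.
calls = {"count": 0}


def flaky_deploy() -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


waits: List[float] = []
succeeded = run_with_retry(flaky_deploy, sleep=waits.append)
print(succeeded, waits)
```

If the same error survives several spaced-out attempts, replication lag is unlikely to be the cause, and the inputs to the deployment are worth re-checking.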

Linking this to a master issue we are tracking and working with engineering teams on to resolve. Azure/Enterprise-Scale#902

Thanks and let us know if this works or doesn't. 👍

@jtracey93 self-assigned this on Mar 9, 2022
@jtracey93 added the "Area: Policy" label and removed the "Needs: Triage 🔍" label on Mar 9, 2022
@autocloudarc
Author

Thanks @jtracey93. I waited 2 hours between runs, then ran again 2 minutes after that, and unfortunately hit the same issue both times. Here is a screenshot of the latest run:

image

@ghost added the "Needs: Attention 👋" label and removed the "Needs: Author Feedback" label on Mar 10, 2022
@autocloudarc
Author

Also, @jtracey93: no, I didn't customize any of the policy definition modules.

@autocloudarc
Author

Here is the current Deploy Custom Policy Definitions step from the deploy-alz.yml file:

      - name: Deploy Custom Policy Definitions
        id: create_policy_defs
        uses: azure/arm-deploy@v1
        with:
          scope: managementgroup
          managementGroupId: ${{ env.ManagementGroupPrefix }}
          region: ${{ env.Location }}
          template: infra-as-code/bicep/modules/policy/definitions/custom-policy-definitions.bicep
          parameters: infra-as-code/bicep/modules/policy/definitions/custom-policy-definitions.parameters.example.json
          deploymentName: create_policy_defs-${{ env.runNumber }}
          failOnStdErr: false

@ejhenry
Contributor

ejhenry commented Mar 10, 2022

@autocloudarc Are you using the default value for parTargetManagementGroupID in the custom-policy-definitions.parameters.example.json file?

@jtracey93 added the "Needs: Author Feedback" label and removed the "Needs: Attention 👋" label on Mar 10, 2022
@jtracey93
Collaborator

@autocloudarc Are you using the default value for parTargetManagementGroupID in the custom-policy-definitions.parameters.example.json file?

I think @ejhenry may have found the issue here. Make sure the management group ID you pass in the parameters is correct, because the module uses it to look up the intermediate root management group for the policy sets.

So if you are not using 'alz', you need to update the parameters.
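
A quick way to confirm this before running the pipeline is to read the value out of the parameters file and compare it with the prefix the pipeline uses. The sketch below assumes the standard ARM parameters file layout (`parameters` -> name -> `value`) and uses the parameter name as spelled in this thread, which may differ in casing from the actual file; `read_parameter` is a hypothetical helper and the sample content is illustrative.

```python
import json
from typing import Optional


def read_parameter(parameters_json: str, name: str) -> Optional[str]:
    """Return the value of one parameter from an ARM parameters file, or None if absent."""
    parameters = json.loads(parameters_json).get("parameters", {})
    return parameters.get(name, {}).get("value")


# Illustrative file content with the default value left in place.
sample = json.dumps({
    "contentVersion": "1.0.0.0",
    "parameters": {
        "parTargetManagementGroupID": {"value": "alz"},
    },
})

expected_prefix = "alz07"  # the ManagementGroupPrefix used by the pipeline in this issue
configured = read_parameter(sample, "parTargetManagementGroupID")

if configured != expected_prefix:
    print(f"Mismatch: parameters file targets '{configured}', pipeline uses '{expected_prefix}'")
```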

@autocloudarc
Author

@jtracey93 and @ejhenry, thank you. Yes, updating that parameter to match my custom top-level management group prefix ID of alz07 did work, in that the deployment progressed a bit further, but then a similar error appeared. It seems there are still numerous other dependencies in various files that would need to be updated. Doing a [ctrl-shift-] shows the following for 'alz':

image

When I reverted my top-level management group ID to the default of 'alz', along with the value suggested by @ejhenry, the deployment did get past the Deploy Custom Policy Definitions step. However, it would be better to be able to specify a custom prefix once and have it propagate to all other relevant values throughout the code. That part is now more of a feature request, which seems to have already been addressed in #158.

@ghost added the "Needs: Attention 👋" label and removed the "Needs: Author Feedback" label on Mar 10, 2022
@jtracey93
Collaborator

Hey @autocloudarc,

All of these are parameterised, so if you change the top-level management group prefix, you need to update all the parameter files to the same input. All the modules already support a different top-level prefix today, but you need to update the parameter files for each module to match, or tailor them to your needs.

This is by design, to keep the modules flexible and easily customizable via parameter inputs.
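
Because each module has its own parameter file, a small scan can help spot files that still carry the default prefix after you change it. This is a rough sketch, not part of the repository: the function name, the file-name pattern, and the second parameter name in the demo are assumptions, and it only inspects top-level string values. It flags values equal to the default prefix or starting with it followed by a hyphen, so unrelated names that happen to start with 'alz-' will also show up; treat hits as candidates for review.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Tuple


def find_default_prefix(root: Path, default_prefix: str = "alz") -> List[Tuple[str, str, str]]:
    """List (file name, parameter name, value) for values still using the default prefix."""
    hits = []
    # Assumed naming pattern, e.g. custom-policy-definitions.parameters.example.json
    for path in sorted(root.rglob("*.parameters*.json")):
        parameters = json.loads(path.read_text(encoding="utf-8")).get("parameters", {})
        for name, entry in parameters.items():
            value = entry.get("value") if isinstance(entry, dict) else None
            if isinstance(value, str) and (
                value == default_prefix or value.startswith(default_prefix + "-")
            ):
                hits.append((path.name, name, value))
    return hits


# Demo against two throwaway parameter files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "definitions.parameters.example.json").write_text(
        json.dumps({"parameters": {"parTargetManagementGroupID": {"value": "alz"}}}),
        encoding="utf-8",
    )
    (root / "assignments.parameters.example.json").write_text(
        json.dumps({"parameters": {"parTopLevelManagementGroupPrefix": {"value": "alz07"}}}),
        encoding="utf-8",
    )
    hits = find_default_prefix(root)

for file_name, parameter, value in hits:
    print(f"{file_name}: {parameter} = {value}")
```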

For clarity, this is not related to #158.

Please read through each module's README.md file to make sure you set the parameters correctly. https://github.com/Azure/ALZ-Bicep/wiki/DeploymentFlow

What were the other errors you saw?

@jtracey93 added the "Needs: Author Feedback" label and removed the "Needs: Attention 👋" label on Mar 10, 2022
@ghost

ghost commented Mar 14, 2022

This issue has been automatically marked as stale because it has been marked as requiring author feedback but has not had any activity for 4 days. It will be closed if no further activity occurs within 3 days of this comment.

@ghost closed this as completed on Mar 17, 2022
@ghost locked as resolved and limited conversation to collaborators on Apr 16, 2022