Troubleshooting Lab: Implement and manage Azure Policy
Diagnostic Scenarios
Scenario 1: Root Cause
An administrator created a custom policy with the Deny effect to prevent the creation of storage accounts outside the brazilsouth and eastus2 regions. The policy was published as a definition at subscription scope and then assigned to the organization's root management group.
During testing, the development team reports that they can create storage accounts in the westeurope region without any error message or block.
The administrator verifies the following facts:
- The policy definition is visible in the portal, with Enabled status
- The assignment appears listed in the root management group
- The policy evaluation mode is configured as Default
- The subscription where testing occurs was created three weeks ago
- The development team uses an account with the Contributor role on the subscription
az policy assignment list --scope "/providers/Microsoft.Management/managementGroups/mg-root"
[
{
"name": "restrict-regions",
"enforcementMode": "DoNotEnforce",
"policyDefinitionId": "/subscriptions/.../policyDefinitions/restrict-regions",
"scope": "/providers/Microsoft.Management/managementGroups/mg-root"
}
]
What is the root cause of the observed behavior?
A) The policy was defined at subscription scope and cannot be applied from a management group.
B) The assignment has enforcementMode configured as DoNotEnforce, which prevents any blocking.
C) The developers' Contributor role has implicit permission to bypass policies with Deny effect.
D) The Default evaluation mode introduces a delay of up to 30 minutes before the policy takes effect.
Scenario 2: Action Decision
The governance team identified that a policy with the DeployIfNotExists effect was correctly assigned to the production subscription five days ago. The policy should automatically install the Azure Monitor agent on all Windows VMs. New VMs created after the assignment are receiving the agent correctly.
However, the compliance dashboard shows that 47 VMs that existed before the assignment are still marked as non-compliant, and none of them have received the agent.
The cause has already been confirmed: pre-existing resources are not automatically handled by the DeployIfNotExists effect without explicit administrator action.
The environment is in active production, with a maintenance window available only on Sundays. Today is Thursday. The security team requires compliance to be achieved before next Sunday.
What is the correct action to take at this time?
A) Reassign the policy with the enforcementMode parameter set to Default to force immediate reevaluation of all existing resources.
B) Create a remediation task for the existing assignment, with scope limited to non-compliant VMs, and monitor deployment progress.
C) Delete the current assignment and recreate it so that the initial evaluation cycle treats existing VMs as new resources.
D) Wait for Sunday's maintenance window to execute remediation, as installing extensions on production VMs requires an approved window.
Scenario 3: Root Cause
A security team assigned an initiative containing 12 policies to the mg-corp management group. One of the initiative's policies requires that all storage accounts have minimumTlsVersion equal to TLS1_2.
Three days after assignment, the team notices that storage accounts with TLS1_0 are still being created successfully in resource groups within mg-corp. The compliance dashboard shows these accounts as non-compliant, but no creation was blocked.
The administrator investigates and finds the following data:
- The initiative is assigned and visible in the correct scope
- The initiative assignment's enforcementMode is Default
- All 12 policies have their definitions with Enabled status
- The rg-storage-legacy resource group has an exclusion configured in the initiative assignment
- The accounts being created with TLS1_0 are all in the rg-storage-new resource group
az policy assignment show \
--name "security-initiative-assignment" \
--scope "/providers/Microsoft.Management/managementGroups/mg-corp"
{
"enforcementMode": "Default",
"notScopes": [
"/subscriptions/xxx/resourceGroups/rg-storage-legacy"
]
}
What is the root cause of the observed behavior?
A) The exclusion configured in rg-storage-legacy is incorrectly propagating to rg-storage-new due to a scope inheritance bug.
B) The TLS policy effect within the initiative is Audit and not Deny, which explains the logging without blocking.
C) Initiatives do not support the Deny effect in individual policies; all policies within an initiative are treated as Audit.
D) The Default enforcementMode in the initiative assignment disables the Deny effect for policies with custom parameters.
Scenario 4: Diagnostic Sequence
An administrator receives the following report: resources are being created without mandatory tags, even though a policy with the Modify effect was assigned to the subscription to add them automatically.
The administrator needs to investigate the cause. The available diagnostic steps are:
1. Verify if the managed identity associated with the assignment has the necessary role in the correct scope
2. Confirm that the policy effect in the definition is actually Modify and not Append or Audit
3. Verify if the assignment exists in the expected scope and has enforcementMode equal to Default
4. Analyze the activity log of a recently created resource to identify if the policy was evaluated and what the result was
5. Verify if the created resource belongs to a type covered by the if condition in the policy definition
What is the correct diagnostic sequence?
A) 2, 3, 5, 1, 4
B) 3, 2, 5, 1, 4
C) 1, 2, 3, 4, 5
D) 4, 3, 2, 5, 1
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
The output from the az policy assignment list command explicitly shows "enforcementMode": "DoNotEnforce". This mode instructs Azure Policy to evaluate resources and record compliance, but to never block creations or modifications, regardless of the effect configured in the policy. The Deny effect is completely neutralized when DoNotEnforce is active.
The determining clue is in the code block in the scenario. The administrator who reads it carefully finds the cause without needing any other information.
The detail that the subscription was created three weeks ago is irrelevant. It was included deliberately to simulate the noise of a real diagnosis, where context details don't always relate to the problem.
The most dangerous distractor is alternative D: the evaluation delay in the Default cycle exists, but affects compliance marking in the dashboard, not the real-time blocking capability of a creation operation. Acting based on this distractor would lead the administrator to wait for a problem that would never resolve itself.
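The check described for Scenario 1 can be sketched in a few lines of Python. This is only an illustration: it assumes the JSON printed by az policy assignment list has been captured as a string, and the helper name find_unenforced_assignments is hypothetical. The sample data mirrors the scenario output, including its elided policyDefinitionId.

```python
from __future__ import annotations

import json

# Sample shaped like the `az policy assignment list` output in Scenario 1.
# The policyDefinitionId is elided in the scenario and stays elided here.
SAMPLE_OUTPUT = """
[
  {
    "name": "restrict-regions",
    "enforcementMode": "DoNotEnforce",
    "policyDefinitionId": "/subscriptions/.../policyDefinitions/restrict-regions",
    "scope": "/providers/Microsoft.Management/managementGroups/mg-root"
  }
]
"""


def find_unenforced_assignments(cli_json: str) -> list[str]:
    """Return the names of assignments whose enforcementMode is DoNotEnforce.

    Hypothetical helper: a missing enforcementMode is treated as Default.
    """
    assignments = json.loads(cli_json)
    return [
        a["name"]
        for a in assignments
        if a.get("enforcementMode", "Default").lower() == "donotenforce"
    ]


if __name__ == "__main__":
    for name in find_unenforced_assignments(SAMPLE_OUTPUT):
        print(f"{name}: evaluated for compliance, but its effect is not enforced")
```

Running the same filter over every scope in a hierarchy is one way to spot assignments that were left in DoNotEnforce after a test rollout.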
Answer Key: Scenario 2
Answer: B
The cause has already been stated in the scenario: the DeployIfNotExists effect does not act retroactively without intervention. The only way to fix pre-existing resources is by creating a remediation task for the assignment. This can be done immediately, without a maintenance window, and progress can be monitored in real-time through the portal or via CLI.
Alternative D seems prudent, but ignores a critical constraint: the security team requires compliance before Sunday, and waiting for the window would miss the deadline. Additionally, creating a remediation task is not a maintenance operation that requires a window; it triggers deployments that Azure Policy itself runs through the assignment's managed identity.
Alternative A describes an action that does not have the described effect: changing enforcementMode does not trigger retroactive reevaluation of existing resources. Alternative C (deleting and recreating the assignment) does not transform existing resources into "new" ones for DeployIfNotExists evaluation purposes.
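As a rough sketch of the action in alternative B, the snippet below assembles an az policy remediation create command line without running it. The remediation name, assignment name, and subscription ID are hypothetical placeholders, since the scenario does not give them, and the function build_remediation_command is illustrative only. The ExistingNonCompliant discovery mode is chosen on the assumption that the 47 VMs already appear as non-compliant, as the scenario states.

```python
import shlex


def build_remediation_command(remediation_name, assignment_name, subscription_id):
    """Build (but do not run) an Azure CLI call that creates a remediation task.

    All three arguments are placeholders supplied by the caller.
    """
    return [
        "az", "policy", "remediation", "create",
        "--name", remediation_name,
        "--policy-assignment", assignment_name,
        # Target resources that the last evaluation already marked non-compliant.
        "--resource-discovery-mode", "ExistingNonCompliant",
        "--subscription", subscription_id,
    ]


if __name__ == "__main__":
    command = build_remediation_command(
        "remediate-ama-existing-vms",            # hypothetical remediation name
        "deploy-ama-windows",                    # hypothetical assignment name
        "00000000-0000-0000-0000-000000000000",  # placeholder subscription ID
    )
    # Print the command for review instead of executing it.
    print(shlex.join(command))
```

Printing the command first keeps the step reviewable; whether to run it directly or through a pipeline is left to the team's change process.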
Answer Key: Scenario 3
Answer: B
The scenario describes a behavior of logging without blocking: accounts are created successfully and appear as non-compliant in the dashboard. This pattern is the signature of the Audit effect. If the effect were Deny, creation would be rejected with an error. The root cause is that the TLS policy within the initiative is configured with Audit effect, not Deny.
The exclusion in rg-storage-legacy is the irrelevant detail, inserted deliberately. It stands out in the code block and may pull the diagnosis toward alternative A, but the behavior is occurring in rg-storage-new, which is not in the notScopes list. The exclusion has no relation to the problem.
Alternative C is technically false: initiatives support policies with any effect, including Deny. Alternative D is also false: enforcementMode Default does not interfere with individual policy effects within an initiative; it controls whether the assignment as a whole blocks or not, and Default means blocking is active.
The most dangerous distractor is A, because it leads the administrator to investigate scope inheritance and exclusions, consuming time and deviating from the real problem, which lies in the policy definition itself.
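To rule out alternative A quickly, the notScopes reasoning can be written as a small check. The sketch below assumes exclusions apply to a scope and everything beneath it, compared case-insensitively on whole path segments; the function name is_excluded and the storage account name stnew001 are hypothetical, while the subscription placeholder xxx is copied from the scenario output.

```python
from __future__ import annotations


def is_excluded(resource_id: str, not_scopes: list[str]) -> bool:
    """Return True if resource_id equals or sits beneath any excluded scope."""
    rid = resource_id.rstrip("/").lower()
    for scope in not_scopes:
        excluded = scope.rstrip("/").lower()
        # Match whole path segments so "rg-a" does not exclude "rg-ab".
        if rid == excluded or rid.startswith(excluded + "/"):
            return True
    return False


NOT_SCOPES = ["/subscriptions/xxx/resourceGroups/rg-storage-legacy"]

# Hypothetical storage account in the resource group named in the scenario.
NEW_ACCOUNT = (
    "/subscriptions/xxx/resourceGroups/rg-storage-new"
    "/providers/Microsoft.Storage/storageAccounts/stnew001"
)

if __name__ == "__main__":
    print("rg-storage-new account excluded:", is_excluded(NEW_ACCOUNT, NOT_SCOPES))
```

Because the account in rg-storage-new is not covered by the exclusion, the assignment applies to it, which leaves the effect in the definition as the remaining explanation.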
Answer Key: Scenario 4
Answer: B
The correct sequence is 3, 2, 5, 1, 4, which represents a progressive diagnosis from broadest to most specific:
Step 3 confirms that the assignment exists and is active in the expected scope. Without this, all other steps are irrelevant.
Step 2 validates that the configured effect is actually Modify. An error in the policy definition may make the observed behavior expected, not a problem.
Step 5 verifies if the resource type is covered by the if condition. A tag policy may have conditions that exclude certain resource types.
Step 1 checks the managed identity. The Modify effect requires that the managed identity associated with the assignment holds the role the definition calls for (Contributor or an equivalent such as Tag Contributor) at the correct scope so that it can modify resources. Without this permission, the policy evaluates but cannot execute the modification.
Step 4 is the final confirmation step: the activity log shows what actually happened during creation, including whether the policy was evaluated, what the result was, and if there was a permission error in the modification attempt.
Sequence A inverts logical order by checking the effect before confirming the assignment exists. Sequence C starts with managed identity permission before confirming that assignment and effect are correct, which wastes investigation on a component that may not even be involved. Sequence D starts with the log, which is valid for confirmation, but inefficient as a first step when it's still unknown if the assignment exists.
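The ordering in answer B can be expressed as a checklist that stops at the first failed check. This is a minimal sketch: the step numbers follow the scenario's list, while the findings dictionary (step number mapped to whether the check passed) and the function first_failing_step are illustrative assumptions, not part of any Azure tooling.

```python
from __future__ import annotations

# Steps in the order of answer B (3, 2, 5, 1, 4), from broadest to most specific.
DIAGNOSTIC_ORDER = [
    (3, "Assignment exists in the expected scope with enforcementMode Default"),
    (2, "Effect in the definition is Modify"),
    (5, "Resource type is covered by the if condition"),
    (1, "Managed identity has the necessary role in the correct scope"),
    (4, "Activity log confirms the evaluation and its result"),
]


def first_failing_step(findings: dict[int, bool]) -> tuple[int, str] | None:
    """Walk the checks in order and return the first one that did not pass.

    A step missing from findings counts as not yet verified, so it is
    reported as the next thing to investigate.
    """
    for step, description in DIAGNOSTIC_ORDER:
        if not findings.get(step, False):
            return step, description
    return None


if __name__ == "__main__":
    # Example: everything checks out except the managed identity's role.
    findings = {3: True, 2: True, 5: True, 1: False, 4: True}
    print(first_failing_step(findings))
```

Stopping at the first failure reflects the reasoning in the answer key: there is little value in inspecting identity permissions before the assignment and effect are confirmed.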
Troubleshooting Tree: Implement and manage Azure Policy
Legend:
| Color | Meaning |
|---|---|
| Dark blue | Initial symptom (entry point) |
| Blue | Diagnostic question |
| Red | Identified cause |
| Green | Recommended action or resolution |
| Orange | Intermediate verification or validation |
When facing a real Azure Policy problem, start at the root node and answer each question based on what you can observe directly: does the assignment exist? Is the enforcement mode correct? Does the configured effect correspond to the expected behavior? Each answer closes a path and directs to the next level of investigation. The goal is to reach a red cause-identified node through the fewest possible steps, avoiding premature corrective actions before confirming the diagnosis.
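The traversal described above can be modeled as a small decision tree. The sketch below covers only the three questions named in the paragraph; the node classes, the answer keys, and the wording of the cause nodes are assumptions made for illustration and do not reproduce the full tree.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Union


@dataclass
class Cause:
    """Leaf node: an identified cause (a red node in the legend)."""
    description: str


@dataclass
class Question:
    """Inner node: a diagnostic question (a blue node in the legend)."""
    key: str
    text: str
    if_yes: Union["Question", Cause]
    if_no: Union["Question", Cause]


def diagnose(node: Union[Question, Cause], answers: dict[str, bool]) -> str:
    """Follow yes/no answers from the root until a cause node is reached."""
    while isinstance(node, Question):
        node = node.if_yes if answers[node.key] else node.if_no
    return node.description


# Illustrative tree built from the three questions in the text.
TREE = Question(
    key="assignment_exists",
    text="Does the assignment exist at the expected scope?",
    if_no=Cause("Assignment is missing or made at the wrong scope"),
    if_yes=Question(
        key="enforcement_default",
        text="Is the enforcement mode Default?",
        if_no=Cause("enforcementMode is DoNotEnforce, so the effect is not enforced"),
        if_yes=Question(
            key="effect_matches",
            text="Does the configured effect match the expected behavior?",
            if_no=Cause("Configured effect differs from the expected one"),
            if_yes=Cause("Not identified yet; continue to deeper checks"),
        ),
    ),
)

if __name__ == "__main__":
    print(diagnose(TREE, {"assignment_exists": True, "enforcement_default": False}))
```

Only the questions actually reached need an answer, which matches the idea that each answer closes a path before the next level is investigated.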