Troubleshooting Lab: Implement and manage Azure Policy

Diagnostic Scenarios

Scenario 1 - Root Cause

An administrator created a custom policy with Deny effect to prevent the creation of storage accounts outside the brazilsouth and eastus2 regions. The policy was published as a definition at subscription scope and then assigned to the organization's root management group.

During testing, the development team reports that they are able to create storage accounts in the westeurope region without receiving any error messages or blocking.

The administrator verifies the following facts:

  • The policy definition is visible in the portal, with Enabled status
  • The assignment appears listed in the root management group
  • The policy evaluation mode is configured as Default
  • The subscription where testing occurs was created three weeks ago
  • The development team uses an account with the Contributor role on the subscription

az policy assignment list --scope "/providers/Microsoft.Management/managementGroups/mg-root"

[
  {
    "name": "restrict-regions",
    "enforcementMode": "DoNotEnforce",
    "policyDefinitionId": "/subscriptions/.../policyDefinitions/restrict-regions",
    "scope": "/providers/Microsoft.Management/managementGroups/mg-root"
  }
]

What is the root cause of the observed behavior?

A) The policy was defined at subscription scope and cannot be applied from a management group.
B) The assignment has enforcementMode configured as DoNotEnforce, which prevents any blocking.
C) The developers' Contributor role has implicit permission to bypass policies with Deny effect.
D) The Default evaluation mode introduces a delay of up to 30 minutes before the policy takes effect.


Scenario 2 - Action Decision

The governance team identified that a policy with DeployIfNotExists effect was correctly assigned to the production subscription five days ago. The policy should automatically install the Azure Monitor agent on all Windows VMs. New VMs created after the assignment are receiving the agent correctly.

However, the compliance dashboard shows that 47 VMs that existed before the assignment are still marked as non-compliant, and none of them have received the agent.

The cause has already been confirmed: pre-existing resources are not automatically handled by the DeployIfNotExists effect without explicit administrator action.

The environment is active production, with maintenance window available only on Sundays. Today is Thursday. The security team requires compliance to be achieved before next Sunday.

What is the correct action to take at this time?

A) Reassign the policy with the enforcementMode parameter set to Default to force immediate reevaluation of all existing resources.
B) Create a remediation task for the existing assignment, with scope limited to non-compliant VMs, and monitor deployment progress.
C) Delete the current assignment and recreate it so that the initial evaluation cycle treats existing VMs as new resources.
D) Wait for Sunday's maintenance window to execute remediation, as installing extensions on production VMs requires an approved window.


Scenario 3 - Root Cause

A security team assigned an initiative containing 12 policies to the mg-corp management group. One of the initiative's policies requires that all storage accounts have minimumTlsVersion equal to TLS1_2.

Three days after assignment, the team notices that storage accounts with TLS1_0 are still being created successfully in resource groups within mg-corp. The compliance dashboard shows these accounts as non-compliant, but no creation was blocked.

The administrator investigates and finds the following data:

  • The initiative is assigned and visible in the correct scope
  • The initiative assignment's enforcementMode is Default
  • All 12 policies have their definitions with Enabled status
  • The rg-storage-legacy resource group has an exclusion configured in the initiative assignment
  • The accounts being created with TLS1_0 are all in the rg-storage-new resource group

az policy assignment show \
  --name "security-initiative-assignment" \
  --scope "/providers/Microsoft.Management/managementGroups/mg-corp"

{
  "enforcementMode": "Default",
  "notScopes": [
    "/subscriptions/xxx/resourceGroups/rg-storage-legacy"
  ]
}

What is the root cause of the observed behavior?

A) The exclusion configured in rg-storage-legacy is incorrectly propagating to rg-storage-new due to a scope inheritance bug.
B) The TLS policy effect within the initiative is Audit and not Deny, which explains the logging without blocking.
C) Initiatives do not support the Deny effect in individual policies; all policies within an initiative are treated as Audit.
D) The Default enforcementMode in the initiative assignment disables the Deny effect for policies with custom parameters.


Scenario 4 - Diagnostic Sequence

An administrator receives the following report: resources are being created without mandatory tags, despite a policy with Modify effect having been assigned to the subscription to automatically add them.

The administrator needs to investigate the cause. The available diagnostic steps are:

  1. Verify if the managed identity associated with the assignment has the necessary role in the correct scope
  2. Confirm that the policy effect in the definition is actually Modify and not Append or Audit
  3. Verify if the assignment exists in the expected scope and has enforcementMode equal to Default
  4. Analyze the activity log of a recently created resource to identify if the policy was evaluated and what the result was
  5. Verify if the created resource belongs to a type covered by the if condition in the policy definition

What is the correct diagnostic sequence?

A) 2, 3, 5, 1, 4
B) 3, 2, 5, 1, 4
C) 1, 2, 3, 4, 5
D) 4, 3, 2, 5, 1


Answer Key and Explanations

Answer Key - Scenario 1

Answer: B

The output from the az policy assignment list command explicitly shows "enforcementMode": "DoNotEnforce". This mode instructs Azure Policy to evaluate resources and record compliance, but to never block creations or modifications, regardless of the effect configured in the policy. The Deny effect is completely neutralized when DoNotEnforce is active.
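The same check can be run directly instead of reading the full output by eye. The following is a minimal sketch, assuming the Azure CLI is installed and signed in; the function name unenforced_assignments is hypothetical, and the scope in the example call is the one shown in the scenario.

```shell
# List the names of policy assignments at a scope that record compliance
# but never block requests (enforcementMode = DoNotEnforce).
# Assumes `az login` has already been run.
unenforced_assignments() {
  scope="$1"
  az policy assignment list --scope "$scope" \
    --query "[?enforcementMode=='DoNotEnforce'].name" \
    --output tsv
}

# Example call (scope taken from the scenario):
# unenforced_assignments "/providers/Microsoft.Management/managementGroups/mg-root"
```

Any name printed by this function belongs to an assignment whose Deny effect will not block anything. If blocking was intended, az policy assignment update with --enforcement-mode Default on that assignment should switch enforcement back on.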

The decisive clue is the command output shown in the scenario. An administrator who reads it carefully finds the cause without needing any other information.

The detail that the subscription was created three weeks ago is irrelevant. It was included deliberately to simulate the noise of a real diagnosis, where context details don't always relate to the problem.

The most dangerous distractor is alternative D: the evaluation delay in the Default cycle exists, but affects compliance marking in the dashboard, not the real-time blocking capability of a creation operation. Acting based on this distractor would lead the administrator to wait for a problem that would never resolve itself.


Answer Key - Scenario 2

Answer: B

The cause has already been stated in the scenario: the DeployIfNotExists effect does not act retroactively without intervention. The only way to fix pre-existing resources is by creating a remediation task for the assignment. This can be done immediately, without a maintenance window, and progress can be monitored in real-time through the portal or via CLI.
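As a rough sketch of what creating and monitoring the remediation task could look like from the CLI: the function names, the task name, and the assignment name below are hypothetical, and the sketch assumes the active subscription is the production subscription and that the CLI output exposes provisioningState and deploymentStatus under those names.

```shell
# Create a remediation task for an existing DeployIfNotExists assignment
# so that resources already flagged as non-compliant receive the deployment.
# Assumes `az login` has been run and the target subscription is active.
start_remediation() {
  task_name="$1"    # hypothetical task name chosen by the operator
  assignment="$2"   # name or resource ID of the existing assignment
  az policy remediation create \
    --name "$task_name" \
    --policy-assignment "$assignment" \
    --resource-discovery-mode ExistingNonCompliant
}

# Print the state and deployment counters of a remediation task.
remediation_status() {
  az policy remediation show --name "$1" \
    --query "{state:provisioningState, deployments:deploymentStatus}" \
    --output json
}

# Example calls (names are placeholders):
# start_remediation "remediate-monitor-agent" "deploy-monitor-agent"
# remediation_status "remediate-monitor-agent"
```

ExistingNonCompliant targets only resources already marked non-compliant, which matches the 47 VMs on the dashboard; ReEvaluateCompliance is the other accepted value and triggers a fresh compliance scan first.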

Alternative D seems prudent, but ignores a critical constraint: the security team requires compliance before Sunday, and waiting for the window would miss the deadline. Additionally, creating a remediation task is not a maintenance operation that requires a window; it schedules deployments managed by Azure Policy itself, which respects VM states.

Alternative A describes an action that does not have the described effect: changing enforcementMode does not trigger retroactive reevaluation of existing resources. Alternative C (deleting and recreating the assignment) does not transform existing resources into "new" ones for DeployIfNotExists evaluation purposes.


Answer Key - Scenario 3

Answer: B

The scenario describes a behavior of logging without blocking: accounts are created successfully and appear as non-compliant in the dashboard. This pattern is the signature of the Audit effect. If the effect were Deny, creation would be rejected with an error. The root cause is that the TLS policy within the initiative is configured with Audit effect, not Deny.
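One way to confirm which effect is actually in force is to read it from the compliance records themselves. The sketch below assumes the Azure CLI is signed in to the subscription that contains rg-storage-new and that policy state records carry the effect in the policyDefinitionAction field; the function name is hypothetical.

```shell
# Show, for each non-compliant record in a resource group, the policy
# that flagged it and the effect (for example audit or deny) it ran with.
# Assumes `az login` has been run and the right subscription is active.
noncompliant_effects() {
  resource_group="$1"
  az policy state list --resource-group "$resource_group" \
    --filter "complianceState eq 'NonCompliant'" \
    --query "[].{resource:resourceId, policy:policyDefinitionName, effect:policyDefinitionAction}" \
    --output table
}

# Example call (resource group taken from the scenario):
# noncompliant_effects "rg-storage-new"
```

If the TLS policy rows report audit in the effect column, the behavior in the scenario is expected, and the fix lies in the initiative's effect parameter or the policy definition rather than in the assignment.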

The exclusion on rg-storage-legacy is the deliberately inserted irrelevant detail. It stands out in the command output and may pull the diagnosis toward alternative A, but the behavior occurs in rg-storage-new, which is not in the notScopes list. The exclusion has no relation to the problem.

Alternative C is technically false: initiatives support policies with any effect, including Deny. Alternative D is also false: enforcementMode Default does not interfere with individual policy effects within an initiative; it controls whether the assignment as a whole blocks or not, and Default means blocking is active.

The most dangerous distractor is A, because it leads the administrator to investigate scope inheritance and exclusions, consuming time and deviating from the real problem, which lies in the policy definition itself.


Answer Key - Scenario 4

Answer: B

The correct sequence is 3, 2, 5, 1, 4, which represents a progressive diagnosis from broadest to most specific:

Step 3 confirms that the assignment exists and is active in the expected scope. Without this, all other steps are irrelevant.

Step 2 validates that the configured effect is actually Modify. An error in the policy definition may make the observed behavior expected, not a problem.

Step 5 verifies if the resource type is covered by the if condition. A tag policy may have conditions that exclude certain resource types.

Step 1 checks the managed identity. The Modify effect requires that the managed identity associated with the assignment holds the role listed in the definition's roleDefinitionIds (Contributor, or a narrower equivalent such as Tag Contributor for tag policies) at the assignment scope to be able to modify resources. Without this permission, the policy evaluates but cannot execute the modification.

Step 4 is the final confirmation step: the activity log shows what actually happened during creation, including whether the policy was evaluated, what the result was, and if there was a permission error in the modification attempt.

Sequence A inverts logical order by checking the effect before confirming the assignment exists. Sequence C starts with managed identity permission before confirming that assignment and effect are correct, which wastes investigation on a component that may not even be involved. Sequence D starts with the log, which is valid for confirmation, but inefficient as a first step when it's still unknown if the assignment exists.
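Steps 3 and 1 of the sequence lend themselves to quick CLI checks. The following is a minimal sketch, assuming the Azure CLI is signed in; the function names and the assignment name in the example calls are hypothetical.

```shell
# Step 3: print the enforcement mode of an assignment at a given scope.
assignment_mode() {
  az policy assignment show --name "$1" --scope "$2" \
    --query "enforcementMode" --output tsv
}

# Step 1: print the role names held by the assignment's managed identity
# at the scope (including roles inherited from parent scopes).
assignment_identity_roles() {
  principal="$(az policy assignment show --name "$1" --scope "$2" \
    --query "identity.principalId" --output tsv)"
  # An empty (or literal None) result means no managed identity is attached.
  if [ -z "$principal" ] || [ "$principal" = "None" ]; then
    echo "assignment has no managed identity" >&2
    return 1
  fi
  az role assignment list --assignee "$principal" --scope "$2" \
    --include-inherited \
    --query "[].roleDefinitionName" --output tsv
}

# Example calls (names are placeholders):
# assignment_mode "add-mandatory-tags" "/subscriptions/xxx"
# assignment_identity_roles "add-mandatory-tags" "/subscriptions/xxx"
```

An assignment that reports Default but whose identity has no suitable role points to step 1 as the cause; an assignment that reports DoNotEnforce ends the investigation at step 3.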


Troubleshooting Tree: Implement and manage Azure Policy

Legend:

Color       Meaning
Dark blue   Initial symptom (entry point)
Blue        Diagnostic question
Red         Identified cause
Green       Recommended action or resolution
Orange      Intermediate verification or validation

When facing a real Azure Policy problem, start at the root node and answer each question based on what you can observe directly: does the assignment exist? Is the enforcement mode correct? Does the configured effect correspond to the expected behavior? Each answer closes a path and directs to the next level of investigation. The goal is to reach a red cause-identified node through the fewest possible steps, avoiding premature corrective actions before confirming the diagnosis.