Troubleshooting Lab: Manage Resource Groups
Diagnostic Scenarios
Scenario 1: Root Cause
An administrator receives a ticket reporting that the development team cannot create new resources within the rg-dev-aplicacoes resource group. The group exists, is visible in the portal, and developers can list existing resources without problems. The subscription is active with no quota alerts.
The administrator verifies that developers have the Contributor role assigned directly on the resource group. During the investigation, they notice that an Azure Policy was recently assigned to the subscription, but the governance team reports it has an Audit effect, not Deny. The administrator also confirms that the eastus region is available and there are no connectivity issues with the portal.
When trying to create a Storage Account through the portal, the developer receives:
AuthorizationFailed: The client does not have authorization to perform
action 'Microsoft.Storage/storageAccounts/write' over scope
'/subscriptions/xxxx/resourceGroups/rg-dev-aplicacoes'.
What is the root cause of the problem?
A) The Azure Policy with Audit effect is blocking resource creation because the enforcement mode was changed to Enabled without updating the effect.
B) A ReadOnly lock was applied to the resource group, preventing write operations on the management plane even for users with Contributor role.
C) The Contributor role was assigned with an ABAC condition that restricts write scope to a specific resource type, excluding Storage Accounts.
D) The subscription reached the limit for Microsoft.Storage resources, and Azure Resource Manager is returning an authorization error instead of a quota-exceeded error.
Scenario 2: Action Decision
The operations team identified that a CanNotDelete lock was incorrectly applied to the rg-homologacao resource group, preventing obsolete test resources from being removed at the end of each cycle. The cause is confirmed and documented.
The resource group contains 14 active resources. Four of them are being used in integration tests that end in 40 minutes. The scheduled maintenance window for cleanup starts in 1 hour. The administrator has the Owner role on the subscription.
None of the resources in the group have individual locks. The governance team confirmed that the lock was created by mistake and its removal is approved.
What is the correct action to take at this time?
A) Remove the lock immediately and start deleting obsolete resources before integration tests finish, to optimize maintenance window time.
B) Wait for integration tests to finish, remove the lock, and execute deletion of obsolete resources within the scheduled maintenance window.
C) Delete the entire resource group now, since the CanNotDelete lock doesn't prevent deletion when the Owner role is present on the subscription.
D) Create a new resource group, move active resources to it, and delete rg-homologacao with all obsolete resources at once.
Scenario 3: Root Cause
An administrator is analyzing a Cost Analysis report and notices that several resources from rg-financas don't appear when the departamento: financas filter is applied. The resource group has the departamento: financas tag applied correctly. The resources are running and generating costs visible in the general report, without the filter.
The administrator verifies there's no Azure Policy configured for tag inheritance in the environment. The subscription has 8 resource groups and only rg-financas presents the problem. Resources were created via ARM templates 3 months ago, and templates didn't include the tags block.
When inspecting two resources from the group via CLI, the administrator gets:
$ az resource show --ids /subscriptions/xxxx/resourceGroups/rg-financas/providers/Microsoft.Compute/virtualMachines/vm-fin-01 --query tags
{}
$ az resource show --ids /subscriptions/xxxx/resourceGroups/rg-financas/providers/Microsoft.Compute/virtualMachines/vm-fin-02 --query tags
{}
What is the root cause of the behavior observed in Cost Analysis?
A) Cost Analysis has synchronization issues and doesn't reflect tags applied to resource groups created more than 90 days ago.
B) Individual resources don't have the departamento: financas tag applied, as resource group tags aren't automatically inherited by child resources.
C) ARM templates override resource group tags on child resources during deployment, removing any previously configured tags.
D) Tag filtering in Cost Analysis only operates at subscription scope and doesn't recognize tags applied at resource group scopes.
Scenario 4: Diagnostic Sequence
An administrator receives an alert that the operation to move the vnet-producao resource from the rg-rede resource group to rg-infraestrutura failed. Both resource groups are in the same subscription. The administrator needs to diagnose the failure before trying again.
The available investigation steps are:
1. Verify if the Microsoft.Network/virtualNetworks resource type is in the list of resources that support the Move operation
2. Confirm if there are ReadOnly or CanNotDelete locks on the source and destination resource groups
3. Check if there are dependent resources on the VNet, like NICs or running VMs, that could prevent the move
4. Consult the subscription's Activity Log to identify the exact error message returned by Azure Resource Manager
5. Confirm if the administrator has write permission on both resource groups and delete permission on the source group
What is the correct investigation sequence?
A) 1 -> 5 -> 2 -> 4 -> 3
B) 4 -> 5 -> 2 -> 1 -> 3
C) 4 -> 1 -> 2 -> 5 -> 3
D) 2 -> 4 -> 1 -> 3 -> 5
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
The AuthorizationFailed error on the write action indicates that something on the management plane is blocking the operation, even though the Contributor role is assigned correctly. A ReadOnly lock applied to the resource group prevents any management-plane operation other than reads, overriding RBAC permissions. A Contributor cannot remove locks; only Owner, User Access Administrator, or a custom role that includes Microsoft.Authorization/locks/* can do so.
The information about the Azure Policy with Audit effect is intentionally irrelevant: policies with this effect log non-compliance but don't block operations. Including it in the scenario simulates real investigation pressure, where not all data is a clue.
The most dangerous distractor is alternative C, as ABAC conditions are real and plausible, but the returned error doesn't indicate resource type restriction. Distractor A confuses policy effect behavior. Distractor D is unlikely because the error would be QuotaExceeded, not AuthorizationFailed.
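The lock hypothesis can be checked directly from the CLI. The following is a minimal sketch, assuming the Azure CLI is installed and signed in; the function names `list_rg_locks` and `has_readonly_lock` are illustrative helpers, not part of the lab. Locks set at subscription scope may need a separate `az lock list` call without a resource group.

```shell
# Sketch: inspect management locks returned for a resource group.
# Assumes the Azure CLI is installed and `az login` has already been run.

# Print "<lock name><TAB><lock level>" for each lock returned for the group.
list_rg_locks() {
  local rg="$1"
  az lock list --resource-group "$rg" \
    --query "[].[name, level]" --output tsv
}

# Return 0 if at least one ReadOnly lock is listed, 1 otherwise.
has_readonly_lock() {
  local rg="$1"
  list_rg_locks "$rg" |
    awk -F'\t' '$2 == "ReadOnly" { found = 1 } END { exit !found }'
}

# Usage (not executed here):
#   has_readonly_lock rg-dev-aplicacoes && echo "ReadOnly lock present"
```

If the check succeeds, someone with lock permissions would need to review and remove the lock before developers can create resources again.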
Answer Key: Scenario 2
Answer: B
The critical constraint in the scenario is that integration tests are running and end in 40 minutes. The maintenance window starts in 1 hour, offering sufficient margin to wait for tests to finish and act in a controlled manner. Removing the lock and deleting resources while tests are active (alternative A) could compromise ongoing results, even if none of the obsolete resources are directly in use by the tests, as dependencies in shared resource groups are common.
Alternative C represents a classic misconception: the Owner role allows removing the lock, but doesn't alter the lock's behavior itself. While the CanNotDelete lock exists, no deletion is allowed, regardless of role. Alternative D introduces an unnecessary Move operation with higher risk than the simple action of removing the lock and waiting for the window. Acting hastily when time is available is the error that distractors A and D induce.
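Once the integration tests have finished, removing the lock could look like the sketch below. It assumes the Azure CLI is installed and signed in with a role that can delete locks; the helper name `remove_cannotdelete_locks` is hypothetical, and the lock names are read from the environment because the scenario does not state them.

```shell
# Sketch: delete every CanNotDelete lock returned for a resource group.
# Assumes the Azure CLI is installed and the caller may delete locks.
remove_cannotdelete_locks() {
  local rg="$1"
  az lock list --resource-group "$rg" \
    --query "[?level=='CanNotDelete'].name" --output tsv |
  while IFS= read -r lock_name; do
    # Skip blank lines, for example when no lock matches.
    [ -n "$lock_name" ] || continue
    echo "Removing lock: $lock_name"
    # Redirect stdin so the command cannot consume the loop's input.
    az lock delete --name "$lock_name" --resource-group "$rg" </dev/null
  done
}

# Usage (not executed here), after the tests end and inside the window:
#   remove_cannotdelete_locks rg-homologacao
```

Deleting the obsolete resources is a separate step, run inside the maintenance window after the lock is gone.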
Answer Key: Scenario 3
Answer: B
The CLI output is the decisive clue: both VMs return {} for the tags field, confirming that individual resources have no tags applied. Tags on resource groups are metadata of the group itself and don't automatically propagate to child resources, unless an Azure Policy with Modify effect is configured to do so. The scenario explicitly confirms there's no such policy in the environment.
The fact that the resources were deployed 3 months ago is irrelevant to the diagnosis. What matters is not when the resources were created, but that the templates omitted the tags block and no inheritance mechanism was configured afterward.
The most dangerous distractor is alternative C, as it sounds plausible to those familiar with some ARM template property behaviors. However, a deployment only changes a resource's tags when the template explicitly specifies them; it does not strip or override resource group tags. Distractor D misstates how Cost Analysis works: the tag filter is available at any scope, but it matches only costs from resources that carry the tag themselves.
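One way to correct the gap is to find the resources that lack the tag and merge it onto each of them. This is a sketch under assumptions: the Azure CLI is installed and signed in, and the helper names `list_untagged` and `apply_tag` are illustrative, not part of the lab.

```shell
# Sketch: find resources missing a tag value and merge the tag onto them.
# Assumes the Azure CLI is installed and `az login` has already been run.

# Print the IDs of resources in the group whose tag <key> is not <value>
# (this includes resources with no tags at all).
list_untagged() {
  local rg="$1" key="$2" value="$3"
  az resource list --resource-group "$rg" \
    --query "[?tags.${key} != '${value}'].id" --output tsv
}

# Merge <key>=<value> onto each listed resource, keeping existing tags.
apply_tag() {
  local rg="$1" key="$2" value="$3"
  list_untagged "$rg" "$key" "$value" |
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    az tag update --resource-id "$id" --operation Merge \
      --tags "${key}=${value}" </dev/null
  done
}

# Usage (not executed here):
#   apply_tag rg-financas departamento financas
```

The Merge operation is chosen so that any tags already present on a resource are preserved. Costs recorded before the tag was applied will most likely remain unmatched by the filter, since tags are generally not applied to historical cost data.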
Answer Key: Scenario 4
Answer: B
The correct sequence starts with reading the Activity Log (step 4) because it provides the exact error message, eliminating hypotheses before any additional verification. Without knowing what ARM returned, any subsequent investigation is speculative. With the error in hand, verify permissions (step 5), as AuthorizationFailed and LinkedAuthorizationFailed are frequent causes. Next, check for locks (step 2), which would block the operation regardless of permissions. Then, confirm if the resource type supports Move (step 1), information documented by Microsoft. Finally, investigate dependencies (step 3), which is the most laborious verification and should only be done after eliminating simpler causes.
Sequence A inverts logical order by checking Move support before understanding the actual error. Sequence C mixes Move support verification with log reading without coherent diagnostic progression. Sequence D starts with locks without reading the error, which could lead to unnecessary corrective actions.
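The first step of the correct sequence, reading the Activity Log, can be sketched from the CLI as follows. It assumes the Azure CLI is installed and signed in; the helper name `failed_events` and the default 2-hour window are illustrative choices, and the projected fields may need adjusting to what the log actually returns.

```shell
# Sketch: list failed Activity Log events for a resource group.
# Assumes the Azure CLI is installed and `az login` has already been run.
# $1 = resource group, $2 = look-back window (default 2h, an assumption).
failed_events() {
  local rg="$1" offset="${2:-2h}"
  az monitor activity-log list --resource-group "$rg" \
    --status Failed --offset "$offset" \
    --query "[].[eventTimestamp, operationName.value, properties.statusMessage]" \
    --output tsv
}

# Usage (not executed here):
#   failed_events rg-rede        # source group of the failed move
#   failed_events rg-rede 6h     # widen the window if nothing shows up
```

The status message of the failed move event should point to which of the remaining checks (permissions, locks, move support, dependencies) to prioritize.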
Troubleshooting Tree: Manage Resource Groups
Color Legend:
| Color | Node Type |
|---|---|
| Dark blue | Initial symptom, investigation entry point |
| Medium blue | Diagnostic question, decision point |
| Red | Identified cause |
| Green | Recommended action or resolution |
| Orange | Validation or intermediate verification |
To use this tree when facing a real problem, start from the root node describing the observed symptom and answer each diagnostic question based on what is directly verifiable in the environment, without assuming causes. Follow the path indicated by the answer until reaching an identified cause node (red) and, from there, apply the corresponding action (green). If no path converges to a clear cause, return to the intermediate validation node (orange) to consult the Activity Log and get the exact error before restarting the journey.