Troubleshooting Lab: Configure resource locks
Diagnostic Scenarios
Scenario 1: Root Cause
A junior administrator reports that they cannot delete a Virtual Machine named vm-app-prod-01, even though they have the Owner role assigned directly on the subscription. The production environment uses three resource groups: rg-app, rg-network and rg-storage. The VM is in rg-app.
The administrator runs the following command to reproduce the failure:
az vm delete --name vm-app-prod-01 --resource-group rg-app --yes
The returned output is:
(ScopeLocked) The scope '/subscriptions/a1b2c3d4.../resourceGroups/rg-app/
providers/Microsoft.Compute/virtualMachines/vm-app-prod-01'
cannot perform delete operation because following scope(s) are locked:
'/subscriptions/a1b2c3d4.../resourceGroups/rg-app'.
Please remove the lock and try again.
The administrator checks the locks directly on the VM through the portal and finds no locks listed on the resource. They also confirm that the VM is in the Running state and that the OS disk is 128 GB, with no recent snapshots.
What is the root cause of the failure?
A) The Owner role was assigned on the subscription, but didn't propagate correctly to the resource group rg-app, preventing the operation.
B) There is a lock applied on the resource group rg-app that is being inherited by the VM, but doesn't appear in the individual resource's lock view.
C) The VM is in Running state and Azure prevents deletions of active VMs without the --force parameter.
D) The 128 GB OS disk has an implicit lock automatically applied by Azure when there are no recent snapshots.
Scenario 2: Action Decision
The security team identified that a ReadOnly lock applied on the resource group rg-dados-sensiveis is preventing the execution of a scheduled backup job. The job uses the Azure API to create snapshots of the managed disks contained in this resource group.
The cause was confirmed: the snapshot creation operation is blocked by the ReadOnly lock because it is classified as a write operation in ARM.
The environment has the following restrictions:
- The resource group contains data regulated by internal compliance policy
- The backup job is scheduled to run between 02:00 and 03:00 UTC, and the window has already started
- Removing the lock requires approval from two security team members, according to internal policy
- The second security member's approval has been pending for 20 minutes
What is the correct action to take at this moment?
A) Remove the lock immediately with available credentials, execute the backup and reapply the lock, documenting the exception afterwards.
B) Wait for the pending approval from the second member according to policy, even if the backup job fails in this cycle.
C) Change the lock type from ReadOnly to Delete (CanNotDelete) to allow snapshot creation without removing protection against deletions.
D) Create the snapshots manually via portal, as locks don't apply to operations initiated by the portal, only by API.
Scenario 3: Root Cause
An infrastructure engineer tries to move the resource group rg-legacy to another subscription using the Azure portal. The operation fails. She verifies permissions and confirms she has the Contributor role on both the source and destination subscriptions.
The engineer also verifies that rg-legacy contains 12 resources, including VMs, disks and a storage account. All resources are in a healthy state. The storage account has GRS replication enabled.
When trying to initiate the move, the portal displays:
Move operation failed.
One or more resources in the source resource group have locks
that prevent the move operation.
Resource: /subscriptions/.../resourceGroups/rg-legacy/
providers/Microsoft.Storage/storageAccounts/stlegacyprod
Lock: lock-storage-readonly (ReadOnly)
The engineer believes the problem is GRS replication, which would make the move more complex, and opens a ticket to disable replication before trying again.
What is the root cause of the problem?
A) The Contributor role doesn't have permission to initiate moves between subscriptions; the Owner role is required on both.
B) GRS replication on the storage account prevents moves between subscriptions, as it creates region dependencies.
C) A ReadOnly lock applied directly on the storage account is blocking the move operation, which is classified as a write operation by ARM.
D) The resource group has too many resources for a single move; Azure limits moves to a maximum of 10 resources per operation.
Scenario 4: Collateral Impact
During a planned maintenance window, an administrator needed to resize a set of resources within the resource group rg-infra-core. The resource group had a ReadOnly lock applied, which was preventing all write operations.
To quickly resolve the issue, the administrator removed the ReadOnly lock from the resource group and successfully performed the resize.
What secondary consequence could this action have caused during the interval when the lock was removed?
A) The resource group resources became temporarily inaccessible for reading by other users, as removing a ReadOnly lock causes momentary interruption in the management plane.
B) Any user with write permissions on the subscription could delete, modify or reconfigure any resource in the resource group during the interval without the lock.
C) Azure logged an automatic compliance violation and suspended the subscription until the lock was reapplied.
D) Locks applied directly on the resource group's child resources were automatically removed along with the parent resource group lock.
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
The error message is the central clue and should be read carefully. It explicitly states that the blocked scope is the resource group rg-app, not the VM itself. Locks applied on a resource group are inherited by the resources contained within it, but this inheritance doesn't appear in the individual resource's lock view in the portal. The child resource doesn't display inherited locks, only locks applied directly to it.
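As an illustration of reading the error message rather than the portal view, the following Python sketch pulls the target scope and the locked scope(s) out of a ScopeLocked message and reports whether a lock sits on the resource itself or on a parent scope. The helper names `parse_scope_locked` and `lock_origin` are hypothetical, and the parsing assumes the message layout shown in Scenario 1; other Azure error formats may need different handling.

```python
import re

# Error text copied from Scenario 1 (the subscription ID is elided in the source).
SAMPLE_ERROR = """
(ScopeLocked) The scope '/subscriptions/a1b2c3d4.../resourceGroups/rg-app/
providers/Microsoft.Compute/virtualMachines/vm-app-prod-01'
cannot perform delete operation because following scope(s) are locked:
'/subscriptions/a1b2c3d4.../resourceGroups/rg-app'.
Please remove the lock and try again.
"""


def parse_scope_locked(message: str) -> tuple[str, list[str]]:
    """Return (target_scope, locked_scopes) from a ScopeLocked message."""
    # Rejoin wrapped lines so scope IDs split across lines become whole again.
    flat = "".join(line.strip() for line in message.splitlines())
    head, marker, tail = flat.partition("are locked:")
    if not marker:
        raise ValueError("not a ScopeLocked message in the assumed layout")
    targets = re.findall(r"'([^']+)'", head)
    locked = re.findall(r"'([^']+)'", tail)
    if not targets or not locked:
        raise ValueError("could not find quoted scope IDs")
    return targets[0], locked


def lock_origin(target_scope: str, locked_scope: str) -> str:
    """Classify a locked scope relative to the target resource."""
    target = target_scope.lower().rstrip("/")
    locked = locked_scope.lower().rstrip("/")
    if target == locked:
        return "direct"      # lock is on the resource itself
    if target.startswith(locked + "/"):
        return "inherited"   # lock is on a parent scope
    return "unrelated"


if __name__ == "__main__":
    target, locked_scopes = parse_scope_locked(SAMPLE_ERROR)
    for scope in locked_scopes:
        print(scope, "->", lock_origin(target, scope))
```

For the Scenario 1 message, the single locked scope is the resource group, so the lock is classified as inherited, which points the investigation at rg-app rather than at the VM.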
The details about the VM's Running state and the 128 GB disk size are purposefully irrelevant and were included as distractors. Azure doesn't prevent deletion of active VMs by default, and there's no concept of an implicit lock based on the absence of snapshots.
The most dangerous distractor is A, as it simulates a permission problem, which would lead the administrator to investigate RBAC and waste time instead of checking the parent resource group's locks. The collective conceptual error of the distractors is confusing the symptom (denied operation) with RBAC causes, resource state or irrelevant technical characteristics.
Answer Key: Scenario 2
Answer: B
The cause is already identified in the prompt. The correct decision is determined by the context constraints, not by the fastest technical solution. The internal policy requires approval from two security members to remove the lock. This restriction is binding regardless of operational impact. Failing one backup cycle is a recoverable impact; violating a compliance policy in a resource group with regulated data can have disciplinary and audit consequences.
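Alternative A represents a technically effective action applied while ignoring a critical process restriction. Alternative C is wrong because replacing the ReadOnly lock with a Delete (CanNotDelete) lock effectively removes the ReadOnly protection, so the same approval requirement applies. A Delete lock would not block snapshot creation, since write operations are blocked only by ReadOnly, but making the change without approval bypasses the control just as alternative A does. Alternative D is wrong because locks apply to all management plane operations, regardless of the interface used: portal, CLI, PowerShell or API.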
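The relationship between lock level and operation type can be sketched as a small lookup in Python. The lock level names `CanNotDelete` and `ReadOnly` are the values Azure Resource Manager uses (the portal labels them Delete and Read-only); the three operation categories and the `is_blocked` helper are simplifying assumptions for illustration, since ARM actually decides per request and some read-like POST actions are also blocked by ReadOnly.

```python
# Simplified model: which kinds of management plane operation each lock level blocks.
# The "read" / "write" / "delete" categories are an illustrative assumption.
BLOCKED_BY_LEVEL = {
    "CanNotDelete": {"delete"},
    "ReadOnly": {"write", "delete"},
}


def is_blocked(lock_level: str, operation: str) -> bool:
    """Return True if an operation of this kind is blocked by the lock level."""
    if lock_level not in BLOCKED_BY_LEVEL:
        raise ValueError(f"unknown lock level: {lock_level}")
    return operation in BLOCKED_BY_LEVEL[lock_level]


if __name__ == "__main__":
    # Snapshot creation is treated as a write operation, as in Scenario 2.
    for level in ("ReadOnly", "CanNotDelete"):
        print(level, "blocks snapshot creation:", is_blocked(level, "write"))
```

The interface used (portal, CLI, PowerShell or API) does not appear in the model at all, which mirrors the point that locks are evaluated on the operation, not on the tool that issues it.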
The most dangerous distractor is A, as it seems pragmatic and solves the immediate problem, but deliberately ignores a security control established for that specific environment.
Answer Key: Scenario 3
Answer: C
The error message displayed by the portal delivers the cause precisely: a ReadOnly lock is applied directly on the storage account stlegacyprod, and this lock is blocking the move operation. Moving a resource between subscriptions is a write operation in ARM (POST), blocked by any active ReadOnly lock on the resource or its parent scope.
The information about GRS replication is purposefully irrelevant and represents exactly the type of plausible technical detail that can divert the diagnosis. The engineer made the classic mistake of focusing on a resource characteristic instead of carefully reading the error message.
Distractor B is the most dangerous because it would lead to unnecessary action: disabling GRS replication wouldn't solve the problem, would cost time and could reduce the storage account's resilience while replication is disabled. Distractor D is false: Azure supports up to 800 resources per move operation, so the 12 resources in rg-legacy are well within the limit.
Answer Key: Scenario 4
Answer: B
Locks are not an authentication or authorization control. They are an additional protection layer that overlays RBAC. By removing the ReadOnly lock, the administrator restored the default access control state, where any user with write permissions on the subscription can execute any operation allowed by their role on the resource group resources. During the interval without the lock, no additional protection against accidental deletion or modification was active.
Alternative A is false: removing a ReadOnly lock doesn't interrupt read operations. Alternative C is false: Azure doesn't automatically suspend subscriptions for lock removal. Alternative D is false and represents confusion with inheritance behavior: child locks are independent and aren't affected by removing locks on the parent scope.
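The independence of child locks can be illustrated with a small Python model of lock inheritance. Everything here is hypothetical: the `<sub-id>` placeholder, the storage account name `stexample`, the lock names and the helpers `effective_locks` and `remove_locks_at` are invented for the sketch and do not come from the scenario.

```python
# Hypothetical scopes; "<sub-id>" is a placeholder, not a real subscription ID.
RG = "/subscriptions/<sub-id>/resourceGroups/rg-infra-core"
CHILD = RG + "/providers/Microsoft.Storage/storageAccounts/stexample"

# Each lock is (scope it is applied at, lock name, lock level).
LOCKS = [
    (RG, "lock-rg-readonly", "ReadOnly"),
    (CHILD, "lock-child-delete", "CanNotDelete"),
]


def effective_locks(resource_id, locks):
    """Locks applied at the resource or at any parent scope of it."""
    rid = resource_id.lower().rstrip("/")
    result = []
    for scope, name, level in locks:
        s = scope.lower().rstrip("/")
        if rid == s or rid.startswith(s + "/"):
            result.append((name, level))
    return result


def remove_locks_at(locks, scope):
    """Return a new list without the locks applied exactly at `scope`."""
    s = scope.lower().rstrip("/")
    return [lock for lock in locks if lock[0].lower().rstrip("/") != s]


if __name__ == "__main__":
    print("before:", effective_locks(CHILD, LOCKS))
    print("after: ", effective_locks(CHILD, remove_locks_at(LOCKS, RG)))
```

In this model, removing the resource group lock drops only the inherited entry; the lock applied directly on the child resource is still in effect afterwards.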
The real and relevant impact is that the window without the lock is a period of genuine operational risk, especially in environments with multiple administrators or automations waiting for permission to execute operations. The best practice is to keep this interval as short as possible and to reapply the lock as soon as the change is complete.
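One way to keep the unlocked interval short is to tie lock removal and reapplication to the maintenance step itself. The Python sketch below shows the pattern with a context manager; `lock_temporarily_removed`, `remove_lock` and `reapply_lock` are hypothetical names, and the two callables stand in for whatever tooling actually removes and recreates the lock in a given environment.

```python
from contextlib import contextmanager


@contextmanager
def lock_temporarily_removed(remove_lock, reapply_lock):
    """Remove a lock for the duration of a block, then always reapply it.

    `remove_lock` and `reapply_lock` are caller-supplied callables
    (hypothetical placeholders for real lock management calls).
    """
    remove_lock()
    try:
        yield
    finally:
        # Runs even if the maintenance step raises, so the lock is not
        # left off after a failed change.
        reapply_lock()


if __name__ == "__main__":
    log = []
    with lock_temporarily_removed(
        lambda: log.append("lock removed"),
        lambda: log.append("lock reapplied"),
    ):
        log.append("resize performed")
    print(log)
```

The design choice is that reapplying the lock does not depend on the operator remembering to do it: the `finally` clause restores protection whether the resize succeeds or fails.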
Troubleshooting Tree: Configure resource locks
Color legend:
- Dark blue: symptom or diagnostic entry point
- Medium blue: objective diagnostic question
- Red: identified cause
- Green: recommended action or resolution
- Orange: intermediate validation or verification
To use this tree when facing a real problem, start at the root node describing the blocked or unexpected behavior. Answer each diagnostic question based on what is observable in the environment: the error message, lock scope, operation type and available permissions. Each path ends with an identified cause followed by a concrete action. If the action doesn't solve the problem, return to the validation node and restart the path with the newly obtained information.