Troubleshooting Lab: Create a Recovery Services vault
Diagnostic Scenarios
Scenario 1 - Root Cause
An administrator reports that when trying to configure VM backup through the Azure portal, the previously created Recovery Services vault does not appear in the list of available vaults for selection. The VM is running and has no health alerts. The administrator confirms they have the Contributor role on the subscription.
Information collected:
- Vault name: vault-prod-eastus
- Vault region: East US 2
- VM name: vm-app-prod-01
- VM region: East US
- Vault resource group: rg-backup
- VM resource group: rg-compute
- Replication type configured in vault: GRS
- Soft delete: enabled
The administrator checks activity logs and finds no permission errors or API failures.
What is the root cause of the problem?
A) The vault is in a different resource group from the VM, which prevents association between them.
B) The vault and VM are in different regions, making the vault invisible to this VM in the backup configuration flow.
C) GRS replication is blocking new resource registration until the configuration is confirmed.
D) Enabled soft delete prevents registration of VMs that have never been protected before.
Scenario 2 - Action Decision
The infrastructure team identified that the Recovery Services vault vault-dr-westeu was created with LRS replication, but the company's compliance policy requires GRS for all vaults protecting critical workloads. The vault is 14 days old and has no protected items configured yet.
The operations manager requests immediate resolution. The production environment is active and the vault needs to be compliant before the audit scheduled in 48 hours.
What is the correct action to take at this time?
A) Delete the vault and recreate it with GRS, since no data would be lost given that there are no protected items.
B) Change the replication configuration directly in the vault from LRS to GRS through the portal or CLI, taking advantage of no items being protected.
C) Create a new vault with GRS in parallel and migrate existing protected items to it.
D) Open a Microsoft support ticket to request redundancy change, as this configuration is not editable by the customer.
Scenario 3 - Root Cause
During a security review, the operations team attempts to delete a Recovery Services vault that is no longer used. The vault appears to be empty. When executing the delete command, the following error is returned:
ERROR: Vault cannot be deleted as there are existing resources within the vault.
Please ensure backup is stopped with delete data for all backup items,
and also delete all private endpoints and backup policies for the vault.
(Code: ResourceInUse)
The administrator checks the portal and confirms there are no VMs, file shares, or other items showing as actively protected. The vault was created six months ago and had active backups that were stopped three weeks ago via the "Stop backup" option. The environment uses the vault's default security configuration.
What is the root cause of the deletion failure?
A) The vault has custom backup policies that need to be deleted before the vault.
B) The vault still contains backup items in a stopped state with retained data, because backup was stopped without deleting the data, and soft delete keeps that data for an additional period even after it is deleted.
C) The error indicates there are private endpoints associated with the vault that were not removed before the deletion attempt.
D) The vault cannot be deleted because GRS replication maintains an active copy in the paired region that needs to be removed first.
Scenario 4 - Diagnostic Sequence
An administrator receives a report that a newly created Recovery Services vault is not accepting VM registration for backup. No clear error message was provided by the user who reported the problem.
The available investigation steps are out of order:
1. Check if there are already protected items in the vault that might indicate a configuration conflict.
2. Confirm if the vault and VM are in the same Azure region.
3. Verify if the user trying to configure backup has adequate permission on the vault and VM.
4. Confirm if the vault was successfully created and is in Active state in the portal.
5. Try to register the VM via Azure CLI to isolate if the problem is interface or platform related.
What is the correct investigation sequence for this scenario?
A) 4 → 2 → 3 → 1 → 5
B) 3 → 4 → 1 → 2 → 5
C) 2 → 4 → 3 → 1 → 5
D) 1 → 3 → 2 → 4 → 5
Answer Key and Explanations
Answer Key - Scenario 1
Answer: B
The decisive clue is in the comparison between regions: the vault is in East US 2 and the VM is in East US. These are distinct regions in Azure. The portal automatically filters available vaults to display only those in the same region as the resource being protected. Therefore, the vault simply doesn't appear in the list, without generating any explicit error message.
The information about different resource groups (A) is the purposely irrelevant distractor in this scenario. The vault and VM can be in completely different resource groups without any impact on the backup association capability. This is a common misconception, especially in environments where there's a policy to separate resources by groups.
The replication type (C) and soft delete (D) have no relation to visibility or registration eligibility. Acting based on (A) would lead the administrator to reorganize resource groups without solving the real problem, wasting critical time.
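The region comparison behind answer B can be sketched in a few lines. This is a minimal illustration, not the portal's actual filtering logic: the helper names `normalize_region` and `vault_selectable_for_vm` are hypothetical, and the normalization (lowercase, spaces removed) is an assumption that happens to map display names such as "East US 2" to short names such as `eastus2`.

```python
def normalize_region(name: str) -> str:
    """Collapse a display name such as 'East US 2' to a short form such as 'eastus2'."""
    return "".join(name.split()).lower()


def vault_selectable_for_vm(vault_region: str, vm_region: str) -> bool:
    """Model the rule from the scenario: the vault must be in the VM's region."""
    return normalize_region(vault_region) == normalize_region(vm_region)


# Values collected in Scenario 1: vault in East US 2, VM in East US.
print(vault_selectable_for_vm("East US 2", "East US"))  # False
```

Resource groups do not appear in the comparison at all, which mirrors why option A is a distractor.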
Answer Key - Scenario 2
Answer: B
Option B is correct precisely because the vault has no protected items. The storage replication type of a Recovery Services vault can be changed directly as long as no items are protected, which is the vault's current state. This is the window of opportunity that the scenario presents.
Option A describes a technically possible action, but unnecessary and riskier: deleting and recreating a functional vault when direct editing is available violates the principle of least necessary intervention.
Option C contains an embedded factual error: the statement declares there are no protected items, making the premise of "migrating items" invalid. This distractor forces the reader to verify if they really absorbed the environment state before acting.
Option D is incorrect: redundancy change while the vault is empty is a self-service operation available directly to the customer, without needing Microsoft support.
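The decision in this scenario can be expressed as a small guard that only produces the change command while the vault is empty. This is a sketch under assumptions: the function `redundancy_change_plan` is hypothetical, the resource group `rg-dr` is a placeholder (the scenario does not name one), and the command shape assumes the Azure CLI `az backup vault backup-properties set` command with its `--backup-storage-redundancy` option, which should be checked against the installed CLI version. The code only builds the argument list and does not run it.

```python
def redundancy_change_plan(protected_item_count: int, vault_name: str,
                           resource_group: str, target: str = "GeoRedundant") -> list:
    """Return CLI arguments for an in-place redundancy change, or raise if the vault is not empty."""
    if protected_item_count > 0:
        # Once items are protected, the in-place change described in option B no longer applies.
        raise ValueError("vault already protects items; in-place redundancy change is not available")
    return ["az", "backup", "vault", "backup-properties", "set",
            "--name", vault_name,
            "--resource-group", resource_group,  # placeholder value supplied by the caller
            "--backup-storage-redundancy", target]


# Scenario 2: vault-dr-westeu has no protected items; "rg-dr" is a hypothetical resource group.
print(" ".join(redundancy_change_plan(0, "vault-dr-westeu", "rg-dr")))
```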
Answer Key - Scenario 3
Answer: B
The root cause is the combination of two related behaviors. Backups were stopped via "Stop backup" without deleting the retained data, so the existing recovery points remain in the vault until they are explicitly deleted. In addition, with soft delete enabled by default, backup data that is deleted is held for a further 14 days before it is permanently removed. To Azure, these items still exist in the vault, even though they don't appear as actively protected in the portal's default view.
The critical clue is that backups were stopped via the "Stop backup" option, combined with "the vault's default security configuration", which confirms that soft delete is active.
Option C is the most dangerous distractor: the error explicitly mentions private endpoints, and a rushed administrator could go straight to investigating endpoints without questioning the premise. However, the scenario doesn't describe any private network context or private endpoint configuration, so this hypothesis is less well supported than the presence of retained data.
Option A describes an additional step that might be necessary, but is not the root cause of the deletion failure in this scenario. Option D confuses storage replication with deletion blocking, which has no foundation in the service's actual behavior.
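One way to surface the hidden blockers is to classify every item the vault still holds, not only the actively protected ones. The sketch below works on plain dictionaries and is an illustration only: the function `deletion_blockers` and the item name `vm-old-01` are hypothetical, and the field names `protectionState` and `isScheduledForDeferredDelete` are assumed to match what the backup item listing returns, which should be verified against real output.

```python
def deletion_blockers(items):
    """Return (item name, reason) pairs for every item that would block vault deletion."""
    blockers = []
    for item in items:
        props = item.get("properties", {})
        if props.get("isScheduledForDeferredDelete"):
            # Assumed flag for data that was deleted but is still held by soft delete.
            reason = "soft-deleted, still inside the retention window"
        elif props.get("protectionState") == "ProtectionStopped":
            # Assumed state for "Stop backup" with retained data, as in Scenario 3.
            reason = "backup stopped, data retained"
        else:
            reason = "still protected"
        blockers.append((item["name"], reason))
    return blockers


# Hypothetical item resembling the situation in Scenario 3.
sample = [{"name": "vm-old-01",
           "properties": {"protectionState": "ProtectionStopped",
                          "isScheduledForDeferredDelete": False}}]
print(deletion_blockers(sample))
```

An empty result is the condition to aim for before retrying the delete; policies and private endpoints named in the error message would still need their own check.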
Answer Key - Scenario 4
Answer: A
The correct sequence is 4 → 2 → 3 → 1 → 5, which represents progressive diagnostic reasoning from most fundamental to most specific.
Step 4 comes first because confirming the vault exists and is operational is the base verification. Without this, any subsequent investigation lacks foundation. Step 2 follows as the most common and objectively verifiable cause of VM registration failure in vaults. Step 3 investigates permissions, which is a frequent cause, but only makes sense to verify after confirming the basic infrastructure is correct. Step 1 checks configuration conflicts, which are less likely in new vaults. Step 5 concludes as layer isolation, to determine if the problem is interface or platform related, and only makes sense after exhausting configuration hypotheses.
Sequence B starts with permissions, which is a triage error: permissions are relevant, but checking infrastructure before identity is the correct order in resource registration diagnostics. Sequence C starts with region, skipping vault existence verification, which is more elementary. Sequence D starts with conflicts in existing items, which is the least probable step in a newly created vault.
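The ordering argument can be made concrete by running the five checks in sequence A and stopping at the first one that fails. This is a minimal sketch: `first_failing_check` is a hypothetical helper, and the lambdas are stand-ins for real probes (here the region check is hard-coded to fail for illustration).

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[int, str, Callable[[], bool]]


def first_failing_check(checks: List[Check], order: List[int]) -> Optional[Tuple[int, str]]:
    """Run checks in the given order and return the first (number, label) whose probe fails."""
    by_number = {number: (label, probe) for number, label, probe in checks}
    for number in order:
        label, probe = by_number[number]
        if not probe():
            return number, label
    return None


# Step numbers follow the list in Scenario 4; probe results are illustrative stand-ins.
checks = [
    (1, "no conflicting protected items", lambda: True),
    (2, "vault and VM in same region", lambda: False),
    (3, "user has permission on vault and VM", lambda: True),
    (4, "vault created and available", lambda: True),
    (5, "registration via CLI succeeds", lambda: True),
]
print(first_failing_check(checks, order=[4, 2, 3, 1, 5]))  # (2, 'vault and VM in same region')
```

Keeping the order as data makes it easy to compare sequences: with the same probes, any order reports step 2, but sequence A reaches it after a single cheaper check.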
Troubleshooting Tree: Create a Recovery Services vault
Color Legend:
| Color | Node Type |
|---|---|
| Dark Blue | Initial symptom (entry point) |
| Blue | Diagnostic question |
| Red | Identified cause |
| Green | Recommended action or resolution |
To use this tree when facing a real problem, always start from the root node describing the observed symptom and follow the branches by objectively answering each diagnostic question. Each answer eliminates a set of hypotheses and leads to the next level of verification. The goal is to reach an identified cause node with the fewest possible steps, without skipping intermediate validation steps. When the cause is confirmed, the corresponding action node indicates the direct resolution, without ambiguity.
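The traversal described above can be sketched as a walk over typed nodes that mirror the legend (symptom, question, cause, action). The tree below is a hypothetical fragment built from Scenario 1, not a reproduction of the full tree, and the names `Node` and `walk` are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    kind: str  # "symptom", "question", "cause", or "action"
    text: str
    branches: Dict[str, "Node"] = field(default_factory=dict)


def walk(root: Node, answers: List[str]) -> List[Node]:
    """Follow the tree from the root, consuming one answer at each node with several branches."""
    path = [root]
    node = root
    remaining = iter(answers)
    while node.branches:
        if len(node.branches) == 1:
            node = next(iter(node.branches.values()))  # single branch: no answer needed
        else:
            node = node.branches[next(remaining)]
        path.append(node)
    return path


# Hypothetical fragment based on Scenario 1.
tree = Node("symptom", "Vault not listed when configuring VM backup", {
    "next": Node("question", "Are the vault and VM in the same region?", {
        "no": Node("cause", "Region mismatch", {
            "next": Node("action", "Use a vault in the VM's region"),
        }),
        "yes": Node("action", "Continue with permission checks"),
    }),
})
for step in walk(tree, ["no"]):
    print(f"{step.kind}: {step.text}")
```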