Theoretical Foundation: Create a Recovery Services Vault


1. Initial Intuition​

Imagine you have a critical server running your application. One day a disk fails, or someone accidentally deletes important files. Without a protection plan, the data is lost. With a plan, you restore everything in minutes.

The Recovery Services Vault is exactly this "protection vault" in Azure. It is a managed container that stores backup and disaster recovery data for Azure and on-premises resources.

The most direct analogy: think of a bank vault. You deposit your most valuable assets there (VM backups, databases, files) and when you need them, you withdraw them with guaranteed integrity. The bank (Azure) manages the vault infrastructure; you manage what goes in and the access rules.

In practice, the Recovery Services Vault serves two main purposes:

  • Azure Backup: protect data against deletion, corruption, or failure
  • Azure Site Recovery (ASR): replicate virtual machines to another region and ensure business continuity in case of regional disaster

2. Context​

Within the Azure ecosystem, data protection is organized in layers. The Recovery Services Vault is the central element of this structure.

The vault exists because Azure needed a unified resource that:

  1. Manages metadata and backup data with automatic redundancy
  2. Centralizes retention and scheduling policies
  3. Provides granular access control via RBAC
  4. Ensures compliance with soft delete and immutability
  5. Allows centralized monitoring and alerts

Without the vault, each backup solution would be isolated, without unified visibility and without integrated security guarantees.


3. Concept Construction​

3.1 What composes a Recovery Services Vault​

Before creating the vault, you need to understand its fundamental elements.

Region: the vault is a regional resource. For Azure Backup, it can only protect resources located in its own region; for Site Recovery, it coordinates replication of resources to another region. You cannot back up a VM in Brazil South to a vault in East US.

Storage redundancy: defines how backup data is physically replicated. This is configured in the vault and affects cost and resilience.

| Type | Acronym | Copies | Use case |
|---|---|---|---|
| Locally Redundant Storage | LRS | 3 copies in the same zone/datacenter | Low cost, acceptable if there is ASR |
| Geo-Redundant Storage | GRS | 6 copies in 2 distinct regions | Protection against regional disaster |
| Zone-Redundant Storage | ZRS | 3 copies in different zones | High zonal availability |

The default is GRS. If the vault is used only with ASR and replication data is already in another region, LRS may be sufficient, reducing cost.

Cross Region Restore (CRR): functionality that allows restoring backups from a secondary region, even if the primary region is unavailable. Only available when redundancy is GRS.

Soft Delete: additional protection that retains backup data for 14 days after deletion, preventing accidental or malicious loss. Enabled by default since 2020.

Immutability: prevents alteration or deletion of backup data during the defined retention period. Critical for regulatory compliance.


3.2 Backup Policies​

A Backup Policy is a set of rules that defines:

  • How often to back up (scheduling frequency)
  • What time to execute
  • How long to retain recovery points

Policies are associated with the vault and applied to individual protected items.

There are two types of policy:

  • Standard Policy: daily backup with granular retention options by day, week, month, and year
  • Enhanced Policy: supports multiple backups per day (hourly schedules, as frequent as every 4 hours), offering a lower RPO (Recovery Point Objective) at a higher cost
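The link between backup frequency and RPO can be made concrete with a small calculation. The following is a minimal sketch, assuming backups are evenly spaced over 24 hours; the function name `worst_case_rpo` and the sample schedules are illustrative and not part of any Azure API.

```python
from datetime import timedelta


def worst_case_rpo(backups_per_day: int) -> timedelta:
    """Return the longest gap between two evenly spaced backups in one day.

    That gap is the most data (measured in time) that could be lost if a
    failure happens just before the next scheduled backup.
    """
    if backups_per_day < 1:
        raise ValueError("backups_per_day must be at least 1")
    return timedelta(hours=24) / backups_per_day


# Hypothetical schedules: a single daily backup versus six backups per day.
daily = worst_case_rpo(1)
six_per_day = worst_case_rpo(6)
print(f"daily schedule: {daily}; six backups per day: {six_per_day}")
```

A schedule with more backups per day shrinks the worst-case gap, which is the trade-off the Enhanced Policy offers in exchange for higher cost.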

4. Structural View​

The Recovery Services Vault positions itself as a central hub between data sources and protection destinations.

5. Practical Operation​

Recovery Services Vault lifecycle​

A critical and frequently ignored behavior: storage redundancy can only be changed before the first backup is executed. After that, the only way to change is to delete all protected items, change the configuration, and re-protect everything.
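To make this rule easier to remember, here is a toy in-memory model of it. This is a sketch only: `VaultModel` and its methods are hypothetical names invented for illustration and do not correspond to any Azure SDK class.

```python
from dataclasses import dataclass, field


@dataclass
class VaultModel:
    """Toy in-memory model of a vault; not an Azure SDK object."""

    name: str
    redundancy: str = "GeoRedundant"  # the text states GRS is the default
    protected_items: list = field(default_factory=list)

    def set_redundancy(self, redundancy: str) -> None:
        # Mirrors the rule above: the setting is locked once anything is protected.
        if self.protected_items:
            raise RuntimeError(
                "redundancy cannot change while items are protected; "
                "remove all protected items first"
            )
        self.redundancy = redundancy

    def protect(self, item: str) -> None:
        self.protected_items.append(item)

    def remove_all_protected_items(self) -> None:
        self.protected_items.clear()


vault = VaultModel("rsv-dev-brazilsouth")
vault.set_redundancy("LocallyRedundant")  # allowed: nothing is protected yet
vault.protect("vm-app-01")                # from here on, redundancy is locked
```

In this model, calling `set_redundancy` after `protect` raises an error until `remove_all_protected_items` is called, which corresponds to the delete, reconfigure, and re-protect sequence described above.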


Prerequisites for creation​

Before creating the vault, you need to have:

  • An active subscription in Azure
  • An existing Resource Group or permission to create one
  • RBAC permission: minimum of Contributor in the Resource Group or Backup Contributor in the desired scope
  • A target region, chosen based on the resources to be protected

6. Implementation Methods​

6.1 Azure Portal (Graphical Interface)​

When to use: one-time creation, learning environments, visual validation of configurations.

Steps:

  1. Access portal.azure.com
  2. Search for "Recovery Services vaults" in the search bar
  3. Click "Create"
  4. Fill in the fields:
    • Subscription: select the correct subscription
    • Resource Group: create a new one or select existing
    • Vault name: unique name in the Resource Group (2 to 50 characters, alphanumeric and hyphens, starting with a letter)
    • Region: same region as the resources to protect
  5. Click "Review + Create" then "Create"

After creation, open the vault and immediately configure the following under Properties > Backup Configuration:

  • Storage Replication Type (LRS, GRS, or ZRS)
  • Cross Region Restore (requires GRS)
  • Security Settings (Soft Delete, immutability)

Limitation: manual process, not replicable, subject to human error at scale.


6.2 Azure CLI​

When to use: automation scripts, CI/CD pipelines, batch creation.

# Create Resource Group (if necessary)
az group create \
  --name rg-backup-prod \
  --location brazilsouth

# Create the Recovery Services Vault
az backup vault create \
  --resource-group rg-backup-prod \
  --name rsv-prod-brazilsouth \
  --location brazilsouth

# Configure storage redundancy
az backup vault backup-properties set \
  --resource-group rg-backup-prod \
  --name rsv-prod-brazilsouth \
  --backup-storage-redundancy GeoRedundant

# Enable Cross Region Restore
az backup vault backup-properties set \
  --resource-group rg-backup-prod \
  --name rsv-prod-brazilsouth \
  --cross-region-restore-flag true

Advantage: fast, scriptable, integrable into pipelines. Limitation: requires Azure CLI installed and authenticated.


6.3 Azure PowerShell​

When to use: corporate Windows environments, automation integrated with existing PowerShell scripts.

# Create Resource Group
New-AzResourceGroup -Name "rg-backup-prod" -Location "brazilsouth"

# Create Recovery Services Vault
New-AzRecoveryServicesVault `
    -ResourceGroupName "rg-backup-prod" `
    -Name "rsv-prod-brazilsouth" `
    -Location "brazilsouth"

# Get vault reference
$vault = Get-AzRecoveryServicesVault `
    -ResourceGroupName "rg-backup-prod" `
    -Name "rsv-prod-brazilsouth"

# Configure redundancy
Set-AzRecoveryServicesBackupProperty `
    -Vault $vault `
    -BackupStorageRedundancy GeoRedundant

6.4 ARM Template (Azure Resource Manager)​

When to use: Infrastructure as Code, environments with strict governance, repeatable and versioned deployments.

{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "resources": [
    {
      "type": "Microsoft.RecoveryServices/vaults",
      "apiVersion": "2023-01-01",
      "name": "rsv-prod-brazilsouth",
      "location": "[resourceGroup().location]",
      "sku": {
        "name": "RS0",
        "tier": "Standard"
      },
      "properties": {}
    }
  ]
}

Advantage: versionable, auditable, reusable in multiple environments. Limitation: higher learning curve; post-creation configurations (redundancy, soft delete) require additional resources in the template.


6.5 Terraform​

When to use: multi-cloud environments, teams already using Terraform as IaC standard.

resource "azurerm_resource_group" "backup" {
  name     = "rg-backup-prod"
  location = "Brazil South"
}

resource "azurerm_recovery_services_vault" "main" {
  name                = "rsv-prod-brazilsouth"
  location            = azurerm_resource_group.backup.location
  resource_group_name = azurerm_resource_group.backup.name
  sku                 = "Standard"

  soft_delete_enabled = true

  storage_mode_type = "GeoRedundant"
}

Advantage: state management, execution plan, native integration with other Azure resources. Limitation: requires Terraform installed and AzureRM provider configured.


7. Control and Security​

RBAC in Recovery Services Vault​

The vault supports specific roles to separate responsibilities:

| Role | Capability |
|---|---|
| Backup Contributor | Create and manage backups, create vaults; cannot delete a vault or grant access to others |
| Backup Operator | Enable backup, trigger jobs, restore; cannot remove protection |
| Backup Reader | Read-only: view backups and jobs |
| Site Recovery Contributor | Manage ASR completely, except creating vaults |
| Site Recovery Operator | Execute failover and failback |
Soft Delete​

When soft delete is enabled (the default), deleting a protected item has the following effects:

  • Data is retained for 14 additional days in a "soft deleted" state
  • During this period, the deletion can be undone (undelete)
  • After 14 days, the data is permanently removed
  • With enhanced soft delete, the retention period can be extended up to 180 days

Attention: even with the vault "empty" of active items, if there are items in soft delete, the vault cannot be deleted. You need to purge (permanently delete) the soft deleted items first.
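The soft delete timeline is simple date arithmetic. The sketch below assumes the 14-day default and the 180-day maximum mentioned above; the function names are illustrative, not Azure APIs.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 14   # default soft delete window from the text
MAX_RETENTION_DAYS = 180      # upper bound with extended retention


def permanent_removal_time(
    deleted_at: datetime, retention_days: int = DEFAULT_RETENTION_DAYS
) -> datetime:
    """Return the moment after which soft-deleted data is gone for good."""
    if not DEFAULT_RETENTION_DAYS <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError("retention_days must be between 14 and 180")
    return deleted_at + timedelta(days=retention_days)


def can_undelete(
    deleted_at: datetime, now: datetime,
    retention_days: int = DEFAULT_RETENTION_DAYS,
) -> bool:
    """True while the item is still inside its soft delete window."""
    return now < permanent_removal_time(deleted_at, retention_days)


deleted = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(permanent_removal_time(deleted))
```

With a deletion on 1 January and the default window, the item can be undeleted up to (but not including) 15 January.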

Immutability​

Three possible states:

| State | Behavior |
|---|---|
| Disabled | No immutability protection |
| Enabled (unlocked) | Immutable data, but can be disabled |
| Enabled (locked) | Immutable data, cannot be reverted. Irreversible |

The "locked" state is typically adopted when regulations or audits (for example LGPD, SOC 2, or Brazilian banking rules) demand proof that backups could not have been tampered with.


8. Decision Making​

Storage redundancy​

| Situation | Best choice | Reason |
|---|---|---|
| Critical VM without ASR configured | GRS | Protection against regional disaster via backup |
| VM with ASR replicating to another region | LRS | ASR already ensures regional recovery; LRS reduces cost |
| Zonal availability requirement | ZRS | Protects against zone failure within the same region |
| Dev/test environment | LRS | Minimum cost; tolerable loss |
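The table above can be expressed as a small decision function. This is a sketch under stated assumptions: the parameter names are invented for illustration, and the order in which conditions are checked (dev/test first, then ASR, then zonal needs) is an assumption of this example, since the table does not rank overlapping situations. The returned strings match the values used with `--backup-storage-redundancy` earlier in this document.

```python
def choose_redundancy(
    *,
    is_dev_test: bool = False,
    asr_replicates_to_other_region: bool = False,
    needs_zonal_availability: bool = False,
) -> str:
    """Map a workload situation to a storage redundancy setting."""
    if is_dev_test:
        return "LocallyRedundant"   # minimum cost; loss is tolerable
    if asr_replicates_to_other_region:
        return "LocallyRedundant"   # ASR already covers regional recovery
    if needs_zonal_availability:
        return "ZoneRedundant"      # survives a zone failure in the region
    return "GeoRedundant"           # critical workload with no other regional cover


print(choose_redundancy(asr_replicates_to_other_region=True))
```

Keyword-only arguments keep call sites readable, and the fall-through default is GRS, which matches the vault's own default.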

Number of vaults per environment​

| Scenario | Recommendation | Reason |
|---|---|---|
| Separated prod, staging, dev environments | One vault per environment | Policy and access isolation |
| Multiple regions | One vault per region | Vault is regional; data should stay close to resources |
| Compliance with multiple departments | One vault per department or BU | RBAC and retention policy isolation |
| Small organization, simple resources | Single vault | Operational simplicity |

9. Best Practices​

Standardized naming: use a clear and consistent convention. Example: rsv-[environment]-[region] like rsv-prod-brazilsouth or rsv-dev-eastus2.
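A convention like this is easiest to keep consistent when names are generated rather than typed. The helper below is a minimal sketch: `vault_name` is a hypothetical function, and it checks only the 50-character ceiling and the alphanumeric-and-hyphen character set mentioned earlier in this document.

```python
import re

# Alphanumeric characters and hyphens only, as described in the portal steps.
_ALLOWED = re.compile(r"[A-Za-z0-9-]+")
_MAX_LENGTH = 50


def vault_name(environment: str, region: str) -> str:
    """Build a vault name following the rsv-[environment]-[region] convention."""
    name = f"rsv-{environment}-{region}".lower()
    if len(name) > _MAX_LENGTH:
        raise ValueError(f"{name!r} is longer than {_MAX_LENGTH} characters")
    if not _ALLOWED.fullmatch(name):
        raise ValueError(f"{name!r} contains characters other than letters, digits, hyphens")
    return name


print(vault_name("prod", "brazilsouth"))
print(vault_name("dev", "eastus2"))
```

Passing a display name such as "Brazil South" fails validation because of the space, which is a useful early signal to use the region's short name instead.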

Mandatory tags: apply tags like Environment, CostCenter, Owner, and Application on the vault to facilitate governance and chargeback.

Minimum access policy: use specific backup roles (Backup Operator, Backup Contributor) instead of giving Contributor or Owner to backup operators.

Soft Delete always enabled: never disable in production environments. The cost of 14 days additional retention is negligible compared to the risk of irreversible loss.

Separation by region: never try to centralize backups from multiple regions in a single vault. This is not supported and compromises latency and data residency compliance.

Proactive monitoring: configure alerts in Azure Monitor for backup job failures. A backup silently failing for weeks is a serious risk.

Test restore regularly: creating backups without ever testing restoration is a false security practice. Schedule periodic restore tests in isolated environments.


10. Common Errors​

Error: creating the vault in a different region than the resources

  • Why it happens: the operator creates the vault in a "default" region without checking where the VMs are.
  • How to avoid: check the region of the protected resources before creating the vault. The rule is: vault and resource in the same region for backup.

Error: forgetting to configure redundancy before the first backup

  • Why it happens: the vault is created and protection is activated immediately, without reviewing storage settings.
  • How to avoid: create the vault, immediately configure storage redundancy and soft delete, and only then activate item protection.

Error: trying to delete a vault that still has protected or soft-deleted items

  • Why it happens: the operator removes VMs from Azure and assumes the vault is empty.
  • How to avoid: before deleting the vault, check Backup Items, Replication Items, and Backup Infrastructure for active items. Purge any soft-deleted items.

Error: using a single vault for all environments

  • Why it happens: excessive simplification to reduce management effort.
  • How to avoid: separate vaults by environment (prod, staging, dev) to prevent dev retention policies from impacting prod and to simplify RBAC.

Error: not configuring backup failure alerts

  • Why it happens: assuming "if there's no alert, it's working".
  • How to avoid: immediately after creating the vault, configure backup notifications via Azure Monitor or email alerts in the vault.
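The check that precedes vault deletion (the third error above) can be sketched as a function that lists what still blocks the operation. The counts are assumed to come from whatever inventory you already have; the function name and parameters are hypothetical and do not query Azure.

```python
def vault_deletion_blockers(
    backup_items: int, replication_items: int, soft_deleted_items: int
) -> list:
    """Return human-readable reasons why a vault cannot be deleted yet."""
    blockers = []
    if backup_items > 0:
        blockers.append(f"{backup_items} protected backup item(s) remain")
    if replication_items > 0:
        blockers.append(f"{replication_items} replicated item(s) remain")
    if soft_deleted_items > 0:
        blockers.append(f"{soft_deleted_items} soft-deleted item(s) must be purged")
    return blockers


# Hypothetical inventory: VMs were removed, but two items sit in soft delete.
for reason in vault_deletion_blockers(0, 0, 2):
    print(reason)
```

An empty list means none of the three conditions described in this document applies; a non-empty list tells the operator exactly what to clean up first.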


11. Operation and Maintenance​

Daily monitoring​

In the portal, access the vault and check:

  • Backup Jobs: jobs with Failed or Warning status require immediate attention
  • Backup Alerts: active alerts that need investigation
  • Backup Reports (via Azure Monitor Workbooks): historical view of backup compliance
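The same daily check can be scripted once job data has been exported. The sketch below assumes each job is a dictionary with `name` and `status` keys, which is a hypothetical shape chosen for illustration; the status values are the ones named in the list above, and real exports may use different field names or status strings.

```python
from typing import Iterable

# Statuses the text flags as requiring immediate attention.
ATTENTION_STATUSES = {"Failed", "Warning"}


def jobs_needing_attention(jobs: Iterable[dict]) -> list:
    """Keep only the jobs whose status calls for investigation."""
    return [job for job in jobs if job.get("status") in ATTENTION_STATUSES]


# Hypothetical export of last night's jobs.
sample_jobs = [
    {"name": "vm-app-01", "status": "Completed"},
    {"name": "vm-db-01", "status": "Failed"},
    {"name": "vm-web-02", "status": "Warning"},
]
for job in jobs_needing_attention(sample_jobs):
    print(job["name"], job["status"])
```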

Important Recovery Services Vault limits​

| Limit | Value |
|---|---|
| Vaults per subscription | Up to 500 per region per subscription; organizing them by Resource Group is recommended |
| VMs protected per vault | 1000 VMs per vault (performance recommendation) |
| Backup policy per vault | Up to 200 policies |
| Recovery points per item | Varies by type, up to 9999 for VMs |
| Maximum protected disk size (VM) | 32 TB |
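For capacity planning, the per-vault VM figure translates directly into a minimum vault count. This is a rough sketch that considers only the 1000-VM figure from the table and ignores region and environment separation, which usually raise the real number.

```python
import math

MAX_VMS_PER_VAULT = 1000  # figure taken from the limits table above


def vaults_needed(vm_count: int) -> int:
    """Return the minimum number of vaults needed for a given VM count."""
    if vm_count < 0:
        raise ValueError("vm_count cannot be negative")
    return math.ceil(vm_count / MAX_VMS_PER_VAULT)


print(vaults_needed(2500))
```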

Cost management​

Recovery Services Vault costs are composed of:

  1. Protected instance: charged per protected VM, based on disk size
  2. Backup storage: charged by data volume stored, with different costs for LRS, ZRS and GRS
  3. Transactions: read/write operations on storage

Monitor via Azure Cost Management filtering by Resource Group or vault tags for cost visibility per environment.


12. Integration and Automation​

Integration with Azure Policy​

You can use Azure Policy to ensure that all VMs in a subscription or Resource Group are protected by a specific vault. The built-in policy "Configure backup on VMs of a location to an existing central Vault in the same location" automates the association of new VMs to the vault.

Automation with Azure Automation / Logic Apps​

Common automation patterns:

  • Trigger backup on-demand via runbook when a significant change is detected (e.g., before a deployment)
  • Weekly compliance report sent by email via Logic App querying the vault API
  • Auto-register new VMs to the vault via event-driven automation with Azure Event Grid

REST API​

The vault exposes complete REST APIs. Example of creation via API:

PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.RecoveryServices/vaults/{vaultName}?api-version=2023-01-01

With body:

{
  "location": "brazilsouth",
  "sku": {
    "name": "RS0",
    "tier": "Standard"
  },
  "properties": {}
}
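The request above can be assembled programmatically before handing it to an HTTP client. The sketch below only builds the URL and JSON body shown in this section; it does not send anything and leaves authentication (a bearer token in the `Authorization` header) to the caller. `build_vault_request` and the sample identifiers are hypothetical.

```python
import json

API_VERSION = "2023-01-01"  # same version as the PUT example above


def build_vault_request(
    subscription_id: str, resource_group: str, vault_name: str, location: str
) -> tuple:
    """Return the (url, json_body) pair for the vault creation PUT call."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.RecoveryServices"
        f"/vaults/{vault_name}?api-version={API_VERSION}"
    )
    body = {
        "location": location,
        "sku": {"name": "RS0", "tier": "Standard"},
        "properties": {},
    }
    return url, json.dumps(body)


# Placeholder subscription ID for illustration only.
url, body = build_vault_request(
    "00000000-0000-0000-0000-000000000000",
    "rg-backup-prod",
    "rsv-prod-brazilsouth",
    "brazilsouth",
)
print(url)
print(body)
```

Keeping the builder separate from the HTTP call makes the request easy to inspect or unit test before it touches a real subscription.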

13. Final Summary​

What it is: regional container in Azure that stores backup data and disaster recovery configurations (ASR) for Azure and on-premises resources.

Essential points:

  • The vault is always regional: must be in the same region as the protected resources
  • Storage redundancy (LRS, GRS, ZRS) can only be changed before the first backup
  • Soft Delete is enabled by default and retains data for 14 days after deletion. Vaults with soft deleted items cannot be deleted without purge
  • Cross Region Restore is only available with GRS redundancy
  • A vault cannot be deleted while there are protected, replicated, or soft deleted items
  • Use specific backup roles (Backup Contributor, Backup Operator) instead of generic roles to follow the principle of least privilege

Critical differences:

| Point | Detail |
|---|---|
| GRS vs LRS | GRS for primary backup; LRS when ASR already provides regional resilience |
| Soft Delete vs Immutability | Soft delete protects against accidental deletion; immutability protects against tampering |
| Locked vs Unlocked Immutability | Locked is irreversible. Use only when required by regulation |
| Standard vs Enhanced Policy | Enhanced supports hourly backup (lower RPO), with higher cost |

What needs to be remembered for AZ-104:

  • Vault created before any backup configuration
  • Redundancy configured immediately after creation, before the first backup
  • Vault and protected resource must be in the same region
  • Soft delete prevents immediate data deletion; requires manual purge for permanent deletion
  • Granular RBAC available with specific backup and site recovery roles