Theoretical Foundation: Create and Configure a Backup Policy
1. Initial Intuition
In previous topics, you created the vault (Recovery Services Vault or Backup Vault). The vault is the container, but by itself it doesn't do anything. You need to tell it when to back up, how frequently, and how long to keep the data. This is exactly what a Backup Policy defines.
The most direct analogy: think of a photocopying service for important documents. You hire the service (create the vault), but you need to sign a contract specifying "take a copy every Monday, keep weekly copies for 1 month, monthly ones for 1 year, and annual ones for 5 years". That contract is the Backup Policy.
In practice, the Backup Policy answers three fundamental questions:
- When to execute the backup? (scheduling)
- What to capture in each execution? (recovery point type)
- How long to keep each point? (retention)
2. Context
The Backup Policy exists between the vault and protected items. It's the link that transforms an empty vault into an active protection system.
A policy is created within a vault and can be applied to multiple items. A vault can have multiple policies. Each protected item, however, uses only one policy at a time.
The policy exists because different workloads have radically different needs:
- A production database might require hourly backups with 7-year retention
- A development VM might need daily backups with 7-day retention
- A file server might require daily, weekly, and monthly backups with tiered retention
Without policies, you would have to configure each item individually. With policies, you configure once and apply to as many items as needed.
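The relationships described above (many policies per vault, many items per policy, one policy per item at a time) can be sketched as a toy model. This is only an illustration: the `Vault` class and its method names are hypothetical and are not part of any Azure SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Vault:
    """Toy stand-in for a vault: named policies plus item -> policy assignments."""
    policies: set[str] = field(default_factory=set)
    assignments: dict[str, str] = field(default_factory=dict)

    def add_policy(self, name: str) -> None:
        self.policies.add(name)

    def assign(self, item: str, policy: str) -> None:
        # A policy must exist in the vault before it can protect an item.
        if policy not in self.policies:
            raise KeyError(f"unknown policy: {policy}")
        # Re-assigning replaces the previous policy: one policy per item.
        self.assignments[item] = policy

    def items_using(self, policy: str) -> list[str]:
        return sorted(i for i, p in self.assignments.items() if p == policy)


vault = Vault()
vault.add_policy("policy-vm-prod-daily")
vault.add_policy("policy-vm-dev-7d")
vault.assign("vm-app-01", "policy-vm-prod-daily")
vault.assign("vm-app-02", "policy-vm-prod-daily")
vault.assign("vm-app-02", "policy-vm-dev-7d")  # replaces the earlier assignment
print(vault.items_using("policy-vm-prod-daily"))  # ['vm-app-01']
```

Storing assignments as a dictionary keyed by item is what enforces the "one policy at a time" rule: assigning again simply overwrites the previous entry.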
3. Building the Concepts
3.1 Recovery Point Objective (RPO) and Recovery Time Objective (RTO)
Before configuring a policy, you need to understand two concepts that determine scheduling choices.
RPO (Recovery Point Objective): how much data loss is acceptable? If your backup is daily, the worst-case scenario is losing 24 hours of data. If it's hourly, you lose at most 1 hour. RPO defines the minimum backup frequency.
RTO (Recovery Time Objective): how long can it take to restore and return to operation? This isn't configured directly in the policy, but policy settings affect it: for example, a recovery point still held as an Instant Restore snapshot restores much faster than one that must be read from the vault.
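The link between RPO and backup frequency can be made concrete with a small helper. This is a minimal sketch: the function name is hypothetical, and the list of allowed intervals is an illustrative assumption that should be replaced with whatever your workload and policy type actually offer.

```python
def pick_backup_interval(rpo_hours: float, allowed_intervals: list[int]) -> int:
    """Return the longest allowed interval (in hours) that still meets the RPO.

    Worst-case data loss equals the time between two backups, so any
    interval <= rpo_hours qualifies; the longest one means the fewest jobs.
    """
    candidates = [i for i in allowed_intervals if i <= rpo_hours]
    if not candidates:
        raise ValueError(f"no allowed interval meets an RPO of {rpo_hours}h")
    return max(candidates)


# Illustrative interval list (an assumption, not an authoritative list).
intervals = [4, 6, 8, 12, 24]
print(pick_backup_interval(24, intervals))  # 24: a daily backup is enough
print(pick_backup_interval(7, intervals))   # 6: needs a 6-hour schedule
```

If no interval is short enough, the helper raises instead of silently picking a schedule that would violate the RPO.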
3.2 Backup Types
Full Backup: complete copy of all resource data at a specific moment. Consumes more space and time, but restoration is simpler.
Incremental Backup: copies only data that changed since the last backup (full or incremental). Faster and more economical, but restoration is more complex because it needs to chain multiple points.
Azure Backup takes incremental backups of VMs: the first backup is always full, and subsequent ones are incremental. Users don't need to manage this distinction; the platform handles it automatically.
Snapshot: captures the state of a disk at a specific moment, without needing to copy data outside the resource. Very fast, but limited in retention. Used by Azure Disks backup and the operational tier of Backup Vault.
3.3 Backup Policy Structure for VMs (Recovery Services Vault)
A policy for VMs has the following configurable elements:
Backup Schedule:
- Daily: one backup per day, at the defined time
- Hourly (Enhanced Policy only): multiple backups per day, every 4, 6, 8, or 12 hours, within a configurable time window
Backup Window: start time and duration of the window when backups can be executed. Primarily relevant for hourly policies.
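Instant Restore: number of days that instant snapshots are kept locally for fast restoration without accessing the vault. Configurable from 1 to 5 days with Standard Policy and from 1 to 30 days with Enhanced Policy.
Recovery point retention: defines how long each type of point (daily, weekly, monthly, yearly) is maintained.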
3.4 Standard Policy vs Enhanced Policy (VMs)
This is a critical distinction for AZ-104:
| Characteristic | Standard Policy | Enhanced Policy |
|---|---|---|
| Minimum frequency | Daily | Hourly (minimum interval of 4 hours) |
| Minimum RPO | 24 hours | 4 hours |
| Instant Restore | 1 to 5 days | 1 to 30 days |
| Trusted Launch VMs support | No | Yes |
| Ultra Disks support | No | Yes |
| Cost | Lower | Higher |
| Backup type | Incremental (legacy: full) | Always incremental |
Point of attention: once a protected item uses Enhanced Policy, it's not possible to downgrade to Standard Policy without removing protection and re-configuring.
3.5 Policies for other workloads
Each workload has its own policy options. The main differences:
| Workload | Vault | Frequency | Maximum retention |
|---|---|---|---|
| Azure VM | Recovery Services Vault | Daily or hourly | 99 years |
| SQL Server on VM | Recovery Services Vault | Log: every 15 min; Full: weekly | 99 years |
| Azure Files | Recovery Services Vault | Daily | 10 years |
| Azure Disk | Backup Vault | Every 1h, 4h, 6h, 8h or 12h, or daily | 30 days (Vault Tier) |
| Azure Blob | Backup Vault | Continuous (operational retention) | 360 days |
| PostgreSQL Flexible | Backup Vault | Weekly | 10 years |
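4. Structural View
How a policy fits into the complete protection flow: the vault holds one or more policies, each policy is applied to one or more protected items, and every backup job run under a policy produces recovery points that are retained according to that policy's rules.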
5. Practical Operation
How retention points chain together
A non-obvious aspect: weekly, monthly, and yearly retention points are not separate backups. They are promoted from daily backups that coincide with the criteria.
Concrete example: you configure a policy like this:
- Daily backup: every day (Monday to Sunday), retained for 30 days
- Weekly backup: every Monday, retained for 12 weeks
- Monthly backup: first Sunday of each month, retained for 12 months
The backup executed on Monday will be the same recovery point marked as both "daily" and "weekly". The backup on the first Sunday of each month will be marked as both "daily" and "monthly" (not "weekly", because in this policy the weekly point is taken on Mondays). One job, multiple retention labels.
This means a point carrying several labels is retained for the longest period among all labels applied to it.
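The labelling rule for the example policy (daily every day, weekly on Mondays, monthly on the first Sunday) can be sketched as follows. The function names are hypothetical, and the retention periods are approximated in days (12 weeks as 84 days, 12 months as 365 days), which is a simplification for illustration.

```python
from datetime import date, timedelta


def retention_labels(day: date) -> list[str]:
    """Labels for the example policy: daily every day, weekly on Mondays,
    monthly on the first Sunday of the month."""
    labels = ["daily"]
    if day.weekday() == 0:  # Monday
        labels.append("weekly")
    if day.weekday() == 6 and day.day <= 7:  # first Sunday of the month
        labels.append("monthly")
    return labels


# Retention per label, approximated in days (an assumption for this sketch).
RETENTION_DAYS = {"daily": 30, "weekly": 84, "monthly": 365}


def expires_on(day: date) -> date:
    """A point lives as long as its longest-lived label."""
    longest = max(RETENTION_DAYS[label] for label in retention_labels(day))
    return day + timedelta(days=longest)


print(retention_labels(date(2024, 1, 1)))  # Monday: ['daily', 'weekly']
print(retention_labels(date(2024, 1, 7)))  # first Sunday: ['daily', 'monthly']
print(expires_on(date(2024, 1, 2)))        # Tuesday, daily only: 2024-02-01
```

Note that there is still only one recovery point per day; the labels only decide how long that single point is kept.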
Instant Restore (Snapshot Tier)
Instant Restore is a layer of local snapshots created before data is transferred to the vault. While the snapshot exists, restoration is much faster (minutes instead of hours), because the data is in the same storage as the disk, without needing to download from the vault.
When the Instant Restore period expires, the snapshot is deleted, but the recovery point still exists in the vault (for longer retention). Restoration from the vault is slower.
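The two tiers can be summarised as a simple decision on the age of a recovery point. This is a sketch under stated assumptions: the function name is hypothetical, ages are counted in whole days, and the boundary handling (inclusive limits) is chosen for illustration rather than taken from the service's exact behaviour.

```python
def restore_source(age_days: int, instant_restore_days: int, vault_retention_days: int) -> str:
    """Say where a recovery point of a given age would be restored from."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days <= instant_restore_days:
        return "snapshot"  # local snapshot still exists: fast restore
    if age_days <= vault_retention_days:
        return "vault"     # snapshot gone, vault copy remains: slower restore
    return "expired"       # past retention: no recovery point left


for age in (1, 10, 45):
    print(age, restore_source(age, instant_restore_days=2, vault_retention_days=30))
# 1 snapshot
# 10 vault
# 45 expired
```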
6. Implementation Methods
6.1 Azure Portal
When to use: initial creation, visual validation of configurations, learning environments.
Creating a policy for VM (Standard) in Recovery Services Vault:
- Access the Recovery Services Vault
- In "Backup policies", click "Add"
- Select "Azure Virtual Machine"
- Choose the type: Standard or Enhanced
- Configure:
  - Policy name: descriptive name, e.g., policy-vm-prod-daily
  - Backup schedule: frequency (Daily, or Hourly when the Enhanced type is selected) and time (e.g., 23:00 UTC)
  - Instant Restore: number of days to keep snapshots (1 to 5 for Standard)
  - Retention range:
    - Daily: how many days (e.g., 30)
    - Weekly: which day of the week and how many weeks (e.g., Sunday, 12 weeks)
    - Monthly: which week/day of the month and how many months (e.g., first Sunday, 12 months)
    - Yearly: which month and day, and how many years (e.g., January, first Sunday, 5 years)
- Click "Create"
6.2 Azure CLI
When to use: automation, pipelines, bulk creation.
For VMs in Recovery Services Vault, policies are defined via JSON. First, get the default template:
# Get default policy template for VMs
az backup policy get-default-for-vm \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-brazilsouth
The return is a JSON with the policy structure. Save as policy.json and edit as needed. Then, create the policy:
# Create policy from JSON file
az backup policy create \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-brazilsouth \
  --name policy-vm-prod-daily \
  --policy @policy.json \
  --backup-management-type AzureIaasVM
To list existing policies:
az backup policy list \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-brazilsouth
To update an existing policy:
az backup policy set \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-brazilsouth \
  --name policy-vm-prod-daily \
  --policy @policy-updated.json
Example JSON structure for VM policy (Standard):
{
  "eTag": null,
  "location": null,
  "name": "policy-vm-prod-daily",
  "properties": {
    "backupManagementType": "AzureIaasVM",
    "instantRpRetentionRangeInDays": 2,
    "schedulePolicy": {
      "schedulePolicyType": "SimpleSchedulePolicy",
      "scheduleRunFrequency": "Daily",
      "scheduleRunTimes": ["2024-01-01T23:00:00Z"]
    },
    "retentionPolicy": {
      "retentionPolicyType": "LongTermRetentionPolicy",
      "dailySchedule": {
        "retentionTimes": ["2024-01-01T23:00:00Z"],
        "retentionDuration": {
          "count": 30,
          "durationType": "Days"
        }
      },
      "weeklySchedule": {
        "daysOfTheWeek": ["Sunday"],
        "retentionTimes": ["2024-01-01T23:00:00Z"],
        "retentionDuration": {
          "count": 12,
          "durationType": "Weeks"
        }
      },
      "monthlySchedule": {
        "retentionScheduleFormatType": "Weekly",
        "retentionScheduleWeekly": {
          "daysOfTheWeek": ["Sunday"],
          "weeksOfTheMonth": ["First"]
        },
        "retentionTimes": ["2024-01-01T23:00:00Z"],
        "retentionDuration": {
          "count": 12,
          "durationType": "Months"
        }
      }
    }
  }
}
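Editing the policy JSON by hand is error-prone, so the "save, edit, re-submit" step can be scripted. The sketch below works on a dictionary with the same shape as the structure above. The helper name is hypothetical, and keeping every `retentionTimes` entry equal to the schedule run time is an assumption based on how the example policy is written.

```python
import copy


def set_daily_retention(policy: dict, days: int, run_time: str) -> dict:
    """Return a copy of a VM policy with a new daily retention and run time."""
    updated = copy.deepcopy(policy)  # leave the caller's policy untouched
    props = updated["properties"]
    props["schedulePolicy"]["scheduleRunTimes"] = [run_time]
    retention = props["retentionPolicy"]
    retention["dailySchedule"]["retentionDuration"]["count"] = days
    # Assumption: retention times mirror the schedule run time, as in the example.
    for key in ("dailySchedule", "weeklySchedule", "monthlySchedule", "yearlySchedule"):
        schedule = retention.get(key)
        if schedule:
            schedule["retentionTimes"] = [run_time]
    return updated


# Trimmed-down policy with the same shape as the JSON structure.
policy = {
    "name": "policy-vm-prod-daily",
    "properties": {
        "schedulePolicy": {"scheduleRunTimes": ["2024-01-01T23:00:00Z"]},
        "retentionPolicy": {
            "dailySchedule": {
                "retentionTimes": ["2024-01-01T23:00:00Z"],
                "retentionDuration": {"count": 30, "durationType": "Days"},
            },
            "weeklySchedule": {
                "daysOfTheWeek": ["Sunday"],
                "retentionTimes": ["2024-01-01T23:00:00Z"],
                "retentionDuration": {"count": 12, "durationType": "Weeks"},
            },
        },
    },
}

updated = set_daily_retention(policy, 14, "2024-01-01T02:00:00Z")
print(updated["properties"]["retentionPolicy"]["dailySchedule"]["retentionDuration"]["count"])  # 14
```

In practice the dictionary would be loaded with `json.load` from the saved template and written back with `json.dump` before passing the file to the CLI.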
6.3 Azure PowerShell
When to use: Windows corporate environments, automation with existing scripts.
# Get the vault
$vault = Get-AzRecoveryServicesVault `
    -ResourceGroupName "rg-backup-prod" `
    -Name "rsv-prod-brazilsouth"
# Set vault context
Set-AzRecoveryServicesVaultContext -Vault $vault
# Get default policy for VMs
$defaultPolicy = Get-AzRecoveryServicesBackupProtectionPolicy `
    -Name "DefaultPolicy"
# Create schedule
$schedulePolicy = Get-AzRecoveryServicesBackupSchedulePolicyObject `
    -WorkloadType AzureVM
$schedulePolicy.ScheduleRunFrequency = "Daily"
$schedulePolicy.ScheduleRunTimes[0] = (Get-Date "2024-01-01 23:00:00Z").ToUniversalTime()
# Create retention
$retentionPolicy = Get-AzRecoveryServicesBackupRetentionPolicyObject `
    -WorkloadType AzureVM
$retentionPolicy.DailySchedule.DurationCountInDays = 30
$retentionPolicy.IsWeeklyScheduleEnabled = $true
$retentionPolicy.WeeklySchedule.DaysOfTheWeek = @("Sunday")
$retentionPolicy.WeeklySchedule.DurationCountInWeeks = 12
# Create the policy
New-AzRecoveryServicesBackupProtectionPolicy `
    -Name "policy-vm-prod-daily" `
    -WorkloadType AzureVM `
    -RetentionPolicy $retentionPolicy `
    -SchedulePolicy $schedulePolicy
6.4 ARM Template
When to use: versioned IaC, corporate deployment pipelines.
{
  "type": "Microsoft.RecoveryServices/vaults/backupPolicies",
  "apiVersion": "2023-01-01",
  "name": "[concat(parameters('vaultName'), '/policy-vm-prod-daily')]",
  "dependsOn": [
    "[resourceId('Microsoft.RecoveryServices/vaults', parameters('vaultName'))]"
  ],
  "properties": {
    "backupManagementType": "AzureIaasVM",
    "instantRpRetentionRangeInDays": 2,
    "schedulePolicy": {
      "schedulePolicyType": "SimpleSchedulePolicy",
      "scheduleRunFrequency": "Daily",
      "scheduleRunTimes": ["2024-01-01T23:00:00Z"]
    },
    "retentionPolicy": {
      "retentionPolicyType": "LongTermRetentionPolicy",
      "dailySchedule": {
        "retentionTimes": ["2024-01-01T23:00:00Z"],
        "retentionDuration": {
          "count": 30,
          "durationType": "Days"
        }
      }
    }
  }
}
7. Control and Security
Who can create and modify policies
Policy management follows the vault's RBAC:
| Role | Can create policy | Can modify policy | Can associate items |
|---|---|---|---|
| Backup Contributor | Yes | Yes | Yes |
| Backup Operator | No | No | Yes (with existing policy) |
| Backup Reader | No | No | No |
| Owner / Contributor | Yes | Yes | Yes |
Impact of changing an existing policy
Modifying an existing policy immediately affects all protected items that use it. This includes:
- Backup time change: next backup will be at the new time
- Retention reduction: recovery points exceeding the new limit are marked for deletion and removed in the next cleanup cycle
- Retention increase: existing recovery points are kept according to the new, longer retention, and new points follow it as well
This reduction behavior is critical: if you reduce retention from 365 days to 30 days, points older than 30 days will be deleted. Once pruned, they cannot be recovered unless soft delete is enabled and covers that scenario.
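Before reducing retention, it helps to estimate which recovery points would fall outside the new window. The sketch below is a simplified model: the function name is hypothetical, and it considers only points retained under the daily rule, ignoring weekly, monthly, and yearly labels that could keep a point longer.

```python
from datetime import date, timedelta


def points_to_prune(points: list[date], today: date, new_retention_days: int) -> list[date]:
    """Recovery points older than the new retention window, oldest first."""
    cutoff = today - timedelta(days=new_retention_days)
    return sorted(p for p in points if p < cutoff)


today = date(2024, 6, 30)
points = [today - timedelta(days=n) for n in (5, 29, 31, 200, 364)]

doomed = points_to_prune(points, today, new_retention_days=30)
print(len(doomed))  # 3: the points taken 31, 200 and 364 days ago
```

Running a check like this against an export of the current recovery points is one way to document the impact before the change is approved.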
8. Decision Making
Choosing between Standard and Enhanced Policy for VMs
| Situation | Best choice | Reason |
|---|---|---|
| Production VM with transactional database | Enhanced Policy | Multiple backups per day (RPO down to 4 hours); reduces data loss |
| Development or test VM | Standard Policy | Daily backup sufficient; lower cost |
| VM with Ultra Disks or Trusted Launch | Enhanced Policy | Standard doesn't support these features |
| Regulatory compliance with RPO < 24h | Enhanced Policy | Standard only offers daily backup |
| Budget constraints on non-critical workloads | Standard Policy | Significantly lower cost |
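The table above reduces to a short rule: any requirement that Standard cannot meet forces Enhanced; otherwise Standard is the cheaper default. A minimal sketch, with a hypothetical function name and parameters:

```python
def choose_vm_policy_type(
    *,
    needs_multiple_backups_per_day: bool,
    uses_ultra_disk: bool = False,
    trusted_launch: bool = False,
) -> str:
    """Pick the VM policy type from the requirements listed in the table."""
    # Each of these requirements is unsupported by Standard Policy.
    if needs_multiple_backups_per_day or uses_ultra_disk or trusted_launch:
        return "Enhanced"
    # Otherwise a daily backup is sufficient and cost is lower.
    return "Standard"


print(choose_vm_policy_type(needs_multiple_backups_per_day=False))                        # Standard
print(choose_vm_policy_type(needs_multiple_backups_per_day=False, trusted_launch=True))   # Enhanced
```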
Retention configuration by criticality
| Criticality | Daily | Weekly | Monthly | Yearly |
|---|---|---|---|---|
| Critical production (financial, health) | 30 days | 52 weeks | 60 months | 10 years |
| Standard production | 30 days | 12 weeks | 12 months | 5 years |
| Staging/Pre-production | 14 days | 4 weeks | Not necessary | Not necessary |
| Development | 7 days | Not necessary | Not necessary | Not necessary |
Instant Restore: how many days to configure
| Scenario | Recommendation | Reason |
|---|---|---|
| Critical VMs with mandatory fast restore | 5 days | Maximum for Standard; restore in minutes |
| Standard production VMs | 2 to 3 days | Balance between snapshot cost and speed |
| Dev/test VMs | 1 day | Minimizes snapshot storage cost |
9. Best Practices
Descriptive naming: use names like policy-vm-prod-daily-30d or policy-vm-dev-7d to indicate workload, environment, frequency and retention.
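A naming convention is easier to keep consistent when names are generated rather than typed. The helper below is a hypothetical sketch that reproduces the two example names; the field order is an assumption inferred from them.

```python
def policy_name(workload: str, environment: str, retention_days: int, frequency: str = "") -> str:
    """Build a name such as policy-vm-prod-daily-30d (frequency is optional)."""
    parts = ["policy", workload, environment]
    if frequency:
        parts.append(frequency)
    parts.append(f"{retention_days}d")
    return "-".join(part.lower() for part in parts)


print(policy_name("vm", "prod", 30, frequency="daily"))  # policy-vm-prod-daily-30d
print(policy_name("vm", "dev", 7))                       # policy-vm-dev-7d
```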
Create policies by criticality tier: have pre-defined policies for Tier 1 (critical), Tier 2 (standard) and Tier 3 (dev/test). This reduces proliferation of unique policies per item.
Align backup time with maintenance windows: schedule backups for lower load times, usually early morning in business timezone. Remember that Azure time is UTC.
Never modify production policies without impact assessment: any retention reduction is irreversible. Document and approve changes before executing.
Use separate policies for SQL on VMs: SQL Server on VMs requires specific policies with transaction log backup (every 15 to 60 minutes) to ensure near-zero RPO for databases.
Review periodically: policies created years ago may no longer reflect business requirements. Include annual policy review in governance process.
10. Common Errors
Error: configuring backup time during peak hours
- Why it happens: operator configures 12:00 without considering performance impact.
- How to avoid: schedule backups for off-peak hours (usually early morning in the business timezone) and convert that time to UTC before configuring.
Error: reducing retention without realizing it deletes existing points
- Why it happens: operator assumes the change affects only new backups.
- How to avoid: read the impact warning presented by the portal before confirming the change. Document the current retention window before modifying.
Error: using a single policy for all workloads
- Why it happens: excessive simplification to reduce management.
- How to avoid: segment by criticality. A single 5-year retention policy for dev/test generates unnecessary cost; a 7-day policy for critical production generates risk.
Error: not configuring weekly/monthly/yearly retention for production environments
- Why it happens: operator configures only daily retention.
- How to avoid: understand that 30-day daily retention means that if data was corrupted 31 days ago and the corruption is discovered today, there's no clean point to recover from. Long retentions protect against slow corruption.
Error: forgetting that Azure time is UTC
- Why it happens: operator configures 23:00 thinking of local time, but Azure executes in UTC.
- How to avoid: always convert local time to UTC before configuring. For Brazil (UTC-3), 23:00 local = 02:00 UTC next day.
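The local-to-UTC conversion is easy to get wrong by hand, especially when it crosses midnight. A minimal sketch using only the standard library follows; the function name is hypothetical, and a fixed UTC offset is assumed, so daylight saving time is not handled.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_hhmm: str, utc_offset_hours: float) -> tuple[str, int]:
    """Convert a local wall-clock time to UTC.

    Returns the UTC time as HH:MM plus a day shift (-1, 0 or +1).
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    hour, minute = map(int, local_hhmm.split(":"))
    # The calendar date is arbitrary; only the time of day and day shift matter.
    local = datetime(2024, 1, 15, hour, minute, tzinfo=tz)
    utc = local.astimezone(timezone.utc)
    day_shift = (utc.date() - local.date()).days
    return utc.strftime("%H:%M"), day_shift


print(local_to_utc("23:00", -3))  # ('02:00', 1): 02:00 UTC on the next day
```

The day shift matters for weekly and monthly rules: a backup intended for Sunday night local time actually runs on Monday in UTC.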
11. Operation and Maintenance
Monitoring policy compliance
In Backup Center, the "Backup Instances" view shows the status of each protected item. Main states:
| Status | Meaning |
|---|---|
| Protection configured | Backup active and functioning normally |
| Protection stopped (retain data) | Backup paused, data retained |
| Protection stopped (delete data) | Backup paused, data being deleted |
| Initial backup pending | First backup not yet executed |
| Warning | Backup executing with warnings (e.g., partial snapshot) |
| Critical | Backup failing systematically |
Change policy of a protected item
It's possible to migrate an item from one policy to another without interrupting protection:
# Change policy of a specific VM
# (container and item are passed by friendly name here, which assumes
#  --backup-management-type is also supplied)
az backup item set-policy \
  --resource-group rg-backup-prod \
  --vault-name rsv-prod-brazilsouth \
  --container-name vm-producao \
  --name vm-producao \
  --policy-name policy-vm-prod-enhanced \
  --backup-management-type AzureIaasVM \
  --workload-type VM
Important limits
| Limit | Value |
|---|---|
| Policies per vault (Recovery Services) | 200 |
| Policies per vault (Backup Vault) | 100 |
| Protected items per policy | No documented limit |
| Backup schedules per policy (Standard) | 1 per day |
| Backup schedules per policy (Enhanced) | Multiple (depends on interval) |
| Maximum daily retention | 9999 days |
| Maximum yearly retention | 99 years |
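The retention limits in the table can be checked before a policy is submitted. This is a hypothetical pre-flight check, not part of any Azure tooling; the upper bounds come from the table, while the lower bounds are placeholders chosen for the sketch.

```python
MAX_DAILY_RETENTION_DAYS = 9999
MAX_YEARLY_RETENTION_YEARS = 99


def retention_problems(daily_days: int, yearly_years: int = 0) -> list[str]:
    """Return human-readable problems; an empty list means the values fit."""
    problems = []
    if daily_days < 1:  # placeholder lower bound for this sketch
        problems.append("daily retention must be at least 1 day")
    elif daily_days > MAX_DAILY_RETENTION_DAYS:
        problems.append(f"daily retention exceeds {MAX_DAILY_RETENTION_DAYS} days")
    if yearly_years < 0:
        problems.append("yearly retention cannot be negative")
    elif yearly_years > MAX_YEARLY_RETENTION_YEARS:
        problems.append(f"yearly retention exceeds {MAX_YEARLY_RETENTION_YEARS} years")
    return problems


print(retention_problems(30, 5))       # []
print(retention_problems(10000, 100))  # two problems reported
```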
12. Integration and Automation
Azure Policy to automatically apply Backup Policies
You can use Azure Policy to ensure new VMs are automatically protected with a specific backup policy.
The relevant built-in definition is "Configure backup on virtual machines without a given tag to an existing recovery services vault in the same location".
Policy creation automation with Terraform
resource "azurerm_backup_policy_vm" "prod" {
  name                = "policy-vm-prod-daily"
  resource_group_name = azurerm_resource_group.backup.name
  recovery_vault_name = azurerm_recovery_services_vault.main.name

  backup {
    frequency = "Daily"
    time      = "23:00"
  }

  retention_daily {
    count = 30
  }

  retention_weekly {
    count    = 12
    weekdays = ["Sunday"]
  }

  retention_monthly {
    count    = 12
    weekdays = ["Sunday"]
    weeks    = ["First"]
  }

  retention_yearly {
    count    = 5
    weekdays = ["Sunday"]
    weeks    = ["First"]
    months   = ["January"]
  }
}
13. Final Summary
What it is: set of rules that defines backup frequency, instant snapshot configuration and recovery point retention, applied to protected items within a vault.
Essential points:
- A policy answers three questions: when to back up, what to capture and how long to retain
- Policies are created within a vault and can be applied to multiple items
- Each protected item uses only one policy at a time
- Weekly/monthly/yearly retention points are promoted from existing daily backups, not separate jobs
- Instant Restore keeps local snapshots for fast restore for the configured period (1 to 5 days in Standard, 1 to 30 in Enhanced)
- Reducing policy retention marks points that exceed the new limit for deletion in the next cleanup cycle
Critical differences:
| Aspect | Standard Policy | Enhanced Policy |
|---|---|---|
| Minimum frequency | Daily | Hourly (minimum interval of 4 hours) |
| Instant Restore | Up to 5 days | Up to 30 days |
| Ultra Disk / Trusted Launch support | No | Yes |
| Downgrade possible | N/A | Cannot return to Standard |
What needs to be remembered for AZ-104:
- Backup time is configured in UTC; always convert from local timezone
- Modifying an existing policy impacts all items using it immediately
- Retention reduction is not reversible for already deleted points
- Enhanced Policy cannot be downgraded to Standard without removing and re-configuring protection
- SQL Server on VMs requires specific policy with transaction log backup for low RPO
- Azure Blobs backup uses continuous operational retention, without traditional job scheduling