Theoretical Foundation: Implement and Manage Azure Policy
1. Initial Intuition
Imagine you are the IT director of a large company and need to ensure that all resources created in the cloud follow a set of rules: every VM must have a cost center tag, no resources can be created outside Brazil, all storage accounts must use mandatory encryption.
You could train each administrator individually and trust they will follow the rules. But people make mistakes, forget, and teams grow. What you need is a system that applies the rules automatically, regardless of who creates the resource, when, or how.
This is Azure Policy: a governance system that defines rules about what can and must exist in your Azure infrastructure. It continuously evaluates resources and can block non-compliant creations, automatically correct configurations, and generate compliance reports.
If RBAC answers "who can do what", Azure Policy answers "what can exist and how it should be configured".
2. Context
Where Azure Policy fits
Azure Policy operates at the Azure Resource Manager (ARM) level, intercepting and evaluating every resource creation or modification operation. It is completely independent of RBAC: a user may have Contributor permissions and still have their operation blocked by a Policy.
Azure Policy exists because RBAC is not sufficient for governance. RBAC controls human intention; Policy controls infrastructure state. A well-intentioned administrator can create a VM without a tag by mistake. Policy captures this error regardless of intention.
What depends on Azure Policy
- Regulatory compliance: LGPD, ISO 27001, CIS Benchmark are implemented as initiative definitions
- FinOps: mandatory tags for cost tracking
- Security: ensure sensitive resources are not publicly exposed
- Standardization: naming conventions, allowed regions, allowed SKUs
- Configuration automation: apply default configurations to newly created resources
3. Concept Building
3.1 Policy Definition
A Policy Definition is the rule itself. It defines:
- What to evaluate: which resource type, which property
- How to evaluate: which condition must be true or false
- What to do if the condition is met: the effect
Every Policy Definition is a JSON document with a defined structure. Azure provides hundreds of built-in policy definitions ready for use, and you can create custom definitions for specific needs.
3.2 Effects
The effect is the heart of a Policy Definition. It determines what happens when a resource is evaluated and the policy condition is met.
Each effect has distinct behavior and timing:
| Effect | Timing | What it does | Typical use case |
|---|---|---|---|
| Deny | On creation/modification | Blocks the ARM operation | Prohibit unauthorized regions |
| Audit | Continuous evaluation | Marks as non-compliant, doesn't block | Identify resources without tags |
| Append | On creation/modification | Adds fields to the resource | Force additional tags |
| Modify | On creation/modification and remediation | Adds, removes or replaces properties and tags | Ensure default tag on all resources |
| DeployIfNotExists | After creation (async) | Deploys a related resource if it doesn't exist | Install monitoring extension on VMs |
| AuditIfNotExists | Continuous evaluation | Audits if related resource doesn't exist | Check if VM has backup configured |
| Disabled | N/A | Ignores all resources | Temporarily disable a policy |
3.3 Difference between Modify and Append
This distinction causes much confusion:
Append only adds fields, or items to arrays, on the resource being created or updated. It cannot change a value that already exists. If a non-array property is already set to a different value, Append acts like Deny and the request is rejected.
Modify is more powerful: it can add, replace, or remove properties and tags. It's the recommended effect for tag management, as it replaces incorrect values and adds missing tags, rather than just trying to add.
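A minimal sketch of the difference, modelling tags as a plain dictionary. The functions `apply_append` and `apply_modify` are hypothetical names that only mimic the behaviour described above; they are not part of any Azure SDK.

```python
def apply_append(tags, name, value):
    """Append-like behaviour: add the tag only if it is missing."""
    if name not in tags:
        return {**tags, name: value}, "appended"
    if tags[name] == value:
        return dict(tags), "unchanged"
    # An existing, different value cannot be overwritten by Append.
    return dict(tags), "conflict"


def apply_modify(tags, name, value):
    """Modify-like behaviour: add the tag or replace its current value."""
    return {**tags, name: value}, "added-or-replaced"


existing = {"CostCenter": "unknown"}
print(apply_append(existing, "CostCenter", "CC-1001"))
print(apply_modify(existing, "CostCenter", "CC-1001"))
```

With the same incorrect tag value as input, the Append-like function reports a conflict and leaves the value alone, while the Modify-like function replaces it.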
3.4 Difference between DeployIfNotExists and AuditIfNotExists
Both evaluate the existence of a related resource (not the resource itself), but with different responses:
- AuditIfNotExists (AINE): If the related resource doesn't exist, marks as non-compliant. Doesn't do anything automatically.
- DeployIfNotExists (DINE): If the related resource doesn't exist, automatically deploys it via ARM template. Requires a Managed Identity with permissions to create the related resource.
Practical example: a policy with DINE that ensures every VM has the Azure Monitor agent installed. When a VM is created without the agent, Policy automatically installs it via ARM deployment.
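The two responses can be contrasted with a small sketch. It assumes a toy inventory that maps VM names to their installed extensions; the function name, the VM names, and the extension name used here are illustrative only.

```python
def evaluate_related_resource(effect, vm_name, installed_extensions, required_extension):
    """Return (compliance_state, action) for an *IfNotExists-style check."""
    if required_extension in installed_extensions.get(vm_name, set()):
        return "Compliant", None
    if effect == "DeployIfNotExists":
        # DINE: a deployment would be started by the assignment's managed identity.
        return "NonCompliant", f"deploy {required_extension} on {vm_name}"
    if effect == "AuditIfNotExists":
        # AINE: only report the gap, nothing is changed.
        return "NonCompliant", None
    raise ValueError(f"unsupported effect: {effect}")


extensions = {"vm-app-01": {"AzureMonitorLinuxAgent"}, "vm-app-02": set()}
for effect in ("AuditIfNotExists", "DeployIfNotExists"):
    print(effect, evaluate_related_resource(effect, "vm-app-02", extensions, "AzureMonitorLinuxAgent"))
```

Both effects flag the VM as non-compliant; only the DINE branch also produces an action to carry out.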
3.5 Policy Initiative (Policy Set Definition)
An Initiative (also called Policy Set Definition) is a grouping of multiple related Policy Definitions, treated as a unit.
Why do initiatives exist? Imagine that ISO 27001 compliance requires 120 individual policies. Managing 120 separate assignments would be unfeasible. The initiative groups them all into a single object that can be assigned at once, with a single compliance report.
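As a rough illustration of the consolidated report, the sketch below rolls the per-policy results for one resource up to a single initiative-level state. The policy names and the `initiative_state` helper are hypothetical.

```python
def initiative_state(policy_states):
    """Roll per-policy states for one resource up to a single initiative state."""
    failing = sorted(name for name, state in policy_states.items() if state == "NonCompliant")
    return ("NonCompliant" if failing else "Compliant"), failing


states = {
    "require-costcenter-tag": "Compliant",
    "allowed-locations": "NonCompliant",
    "storage-encryption": "Compliant",
}
print(initiative_state(states))
```

A single failing member policy is enough to make the resource non-compliant with the initiative as a whole, and the list of failing policies shows where to look.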
3.6 Policy Assignment
A Policy Assignment is the act of applying a Policy Definition or Initiative to a specific scope. The same scope concepts from RBAC apply: Management Group, Subscription, Resource Group, or Resource.
The assignment is where you:
- Define which policy/initiative is applied
- Define to which scope
- Configure the policy parameters (e.g., list of allowed regions)
- Define exclusions (scopes excluded from evaluation)
- Configure the Managed Identity for effects that need to make deployments (DINE)
3.7 Parameters
Parameters make policies reusable. Instead of creating a different policy for each list of allowed regions, you create a parametric policy and pass the list as a parameter in the assignment.
"parameters": {
  "allowedLocations": {
    "type": "Array",
    "metadata": {
      "displayName": "Allowed locations",
      "description": "The list of allowed locations for resources."
    },
    "defaultValue": ["brazilsouth", "eastus2"]
  }
}
In the assignment, you provide the values:
"parameters": {
  "allowedLocations": {
    "value": ["brazilsouth"]
  }
}
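How the two fragments combine can be sketched as follows. This is a simplified model: `resolve_parameters` is a hypothetical helper that prefers the assignment's value and falls back to the definition's `defaultValue`.

```python
def resolve_parameters(definition_params, assignment_params):
    """Merge assignment values over the defaults declared in the definition."""
    resolved = {}
    for name, spec in definition_params.items():
        if name in assignment_params:
            resolved[name] = assignment_params[name]["value"]
        elif "defaultValue" in spec:
            resolved[name] = spec["defaultValue"]
        else:
            raise ValueError(f"missing required parameter: {name}")
    return resolved


definition = {"allowedLocations": {"type": "Array", "defaultValue": ["brazilsouth", "eastus2"]}}
assignment = {"allowedLocations": {"value": ["brazilsouth"]}}

print(resolve_parameters(definition, assignment))  # value from the assignment
print(resolve_parameters(definition, {}))          # falls back to the default
```

The same definition can therefore serve several assignments, each supplying its own list.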
3.8 Exclusions (notScopes)
When creating an assignment, you can define excluded scopes (notScopes). Resources within these scopes are not evaluated by the policy. This is useful for controlled exceptions, such as a laboratory Resource Group that doesn't need to follow the same production rules.
4. Structural View
Policy hierarchy and inheritance
Like RBAC, policies assigned at higher scopes are inherited by scopes below:
RG1 inherits everything from the Management Group and Subscription, without its own assignments. RG2 inherits everything and adds a more restrictive policy. RG3 inherits everything but has an exclusion configured in the tags assignment.
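The RG1/RG2/RG3 situation can be approximated with prefix matching on resource IDs. This is a sketch under simplifying assumptions: the subscription ID is a placeholder, the assignment names are invented, and management group scopes are left out because a resource ID does not contain its management group path.

```python
SUB = "/subscriptions/00000000-0000-0000-0000-000000000000"  # placeholder ID


def in_scope(resource_id, scope):
    """True if the resource is the scope itself or sits below it."""
    rid, sc = resource_id.lower(), scope.lower().rstrip("/")
    return rid == sc or rid.startswith(sc + "/")


def applicable_assignments(resource_id, assignments):
    """Names of assignments inherited by the resource, minus excluded scopes."""
    result = []
    for a in assignments:
        excluded = any(in_scope(resource_id, ns) for ns in a.get("notScopes", []))
        if in_scope(resource_id, a["scope"]) and not excluded:
            result.append(a["name"])
    return result


assignments = [
    {"name": "allowed-locations", "scope": SUB},
    {"name": "require-tags", "scope": SUB, "notScopes": [f"{SUB}/resourceGroups/RG3"]},
    {"name": "restrict-vm-skus", "scope": f"{SUB}/resourceGroups/RG2"},
]
for rg in ("RG1", "RG2", "RG3"):
    vm = f"{SUB}/resourceGroups/{rg}/providers/Microsoft.Compute/virtualMachines/vm1"
    print(rg, applicable_assignments(vm, assignments))
```

A VM in RG1 picks up both subscription-level assignments, RG2 adds its own stricter one, and RG3 is skipped by the tag assignment because of its exclusion.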
Policy evaluation lifecycle
5. Practical Operation
The compliance evaluation cycle
Azure Policy evaluates resources at two distinct moments:
Synchronous evaluation (at the time of ARM operation): Happens when a resource is created or modified. Policies with Deny, Append, and Modify effects are evaluated here. If a Deny policy blocks the operation, the user immediately receives an error with details about which policy rejected the request.
Asynchronous evaluation (continuous): Azure re-evaluates all resources in a scope periodically, approximately every 24 hours. This captures resources that existed before the policy was assigned, or that became non-compliant due to policy changes. The result is reflected in the compliance dashboard.
To force an immediate evaluation:
az policy state trigger-scan --resource-group "rg-production"
Compliance State: what each state means
| State | Meaning |
|---|---|
| Compliant | Resource evaluated and in compliance with all policies |
| Non-compliant | Resource evaluated and violating one or more policies |
| Exempt | Resource explicitly exempted from a policy assignment |
| Conflicting | Two or more policies have contradictory rules (for example, two policies that set the same tag to different values) |
| Not started | Evaluation has not yet been executed for this resource |
| Not registered | Resource provider is not registered for evaluation |
Non-obvious behavior: resources existing before the policy
When you assign a policy with Deny effect to a scope, resources that already exist and violate the policy are not deleted or blocked. They simply appear as Non-compliant. The Deny policy only blocks new creations and modifications. To fix existing resources, you need Modify or DeployIfNotExists with a Remediation Task.
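The behavior can be pictured with two separate code paths, sketched below as a toy model rather than how ARM is implemented. `handle_request` stands for the synchronous check on a create or update request and `scan_existing` for the periodic compliance scan; both names and the sample resources are hypothetical.

```python
ALLOWED_LOCATIONS = {"brazilsouth"}


def handle_request(resource):
    """Synchronous path: a Deny-style check on a create or update request."""
    if resource["location"] not in ALLOWED_LOCATIONS:
        raise PermissionError(f"RequestDisallowedByPolicy: {resource['name']}")
    return resource


def scan_existing(resources):
    """Asynchronous path: existing resources are flagged, never deleted."""
    return {
        r["name"]: "Compliant" if r["location"] in ALLOWED_LOCATIONS else "NonCompliant"
        for r in resources
    }


existing = [{"name": "vm-old", "location": "eastus2"}, {"name": "vm-br", "location": "brazilsouth"}]
print(scan_existing(existing))
try:
    handle_request({"name": "vm-new", "location": "eastus2"})
except PermissionError as err:
    print("blocked:", err)
```

The pre-existing VM in the wrong region keeps running and is only reported, while a new request for the same region is rejected.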
Remediation Tasks
A Remediation Task is the mechanism to apply DINE and Modify effects to resources that already exist (Non-compliant). Without a Remediation Task, these effects only apply to resources created/modified after the policy assignment.
The remediation process:
- Azure Policy identifies Non-compliant resources for policies with Modify or DINE effect
- You initiate a Remediation Task manually or configure auto-remediation
- The Managed Identity associated with the policy assignment executes corrections on the resources
- Resource status is updated after remediation
The Managed Identity needs adequate permissions to modify the target resources. The required roles are listed in the policy definition (roleDefinitionIds); for example, a DINE policy that installs an extension on VMs typically needs a role such as Virtual Machine Contributor or Contributor on the VMs' scope.
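The remediation steps above can be sketched as a loop over non-compliant resources, guarded by a permission check. Everything here is illustrative: `run_remediation`, the in-memory resource records, and the choice of the Tag Contributor role for a tag fix are assumptions made for the example.

```python
def run_remediation(resources, identity_roles, required_role, fix):
    """Apply `fix` to each non-compliant resource if the identity holds the role."""
    if required_role not in identity_roles:
        return {"status": "Failed", "remediated": [], "reason": f"identity lacks role {required_role}"}
    remediated = []
    for resource in resources:
        if resource["state"] == "NonCompliant":
            fix(resource)
            resource["state"] = "Compliant"
            remediated.append(resource["name"])
    return {"status": "Succeeded", "remediated": remediated, "reason": None}


def add_cost_center(resource):
    """Example correction: add a default CostCenter tag if none is set."""
    resource.setdefault("tags", {}).setdefault("CostCenter", "CC-1001")


vms = [
    {"name": "vm-01", "state": "NonCompliant", "tags": {}},
    {"name": "vm-02", "state": "Compliant", "tags": {"CostCenter": "CC-2002"}},
]
print(run_remediation(vms, {"Reader"}, "Tag Contributor", add_cost_center))
print(run_remediation(vms, {"Tag Contributor"}, "Tag Contributor", add_cost_center))
```

The first call fails without touching anything because the identity lacks the role, which is the situation to look for when a remediation task reports no progress.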
6. Implementation Methods
Azure Portal
When to use: initial creation, exploring built-in policies, compliance review
To create an assignment via portal:
- Azure portal > Policy > Assignments > Assign Policy
- Select scope (MG, Sub, RG)
- Select the Policy Definition or Initiative
- Configure parameters
- Define exclusions if necessary
- For DINE/Modify effects: configure Managed Identity (System-assigned recommended)
- Review and create
To check compliance, open Policy > Compliance in the portal. The dashboard gives a compliance overview by scope, policy, and resource.
Limitation: not reproducible, no version control, difficult to scale for many assignments.
Azure CLI
# List built-in policy definitions
az policy definition list \
--query "[?policyType=='BuiltIn'].{Name:displayName, ID:name}" \
--output table
# View details of a specific policy
az policy definition show \
--name "e56962a6-4747-49cd-b67b-bf8b01975c4c"
# Create a simple policy assignment
az policy assignment create \
--name "deny-non-brazil-resources" \
--display-name "Deny resources outside Brazil" \
--policy "e56962a6-4747-49cd-b67b-bf8b01975c4c" \
--scope "/subscriptions/<sub-id>" \
--params '{"listOfAllowedLocations": {"value": ["brazilsouth"]}}'
# Create assignment with Managed Identity (for DINE)
az policy assignment create \
--name "deploy-monitor-agent" \
--policy "<policy-definition-id>" \
--scope "/subscriptions/<sub-id>/resourceGroups/rg-prod" \
--mi-system-assigned \
--location "brazilsouth"
# Assign role to policy Managed Identity (necessary for DINE)
az role assignment create \
--assignee-object-id "<managed-identity-object-id>" \
--role "Contributor" \
--scope "/subscriptions/<sub-id>/resourceGroups/rg-prod"
# Check compliance of an RG
az policy state list \
--resource-group "rg-production" \
--filter "complianceState eq 'NonCompliant'" \
--output table
# Create a remediation task
az policy remediation create \
--name "remediate-tags" \
--policy-assignment "/subscriptions/<sub-id>/providers/Microsoft.Authorization/policyAssignments/tag-policy" \
--resource-discovery-mode ReEvaluateCompliance \
--resource-group "rg-production"
# Force compliance scan
az policy state trigger-scan \
--resource-group "rg-production"
Azure PowerShell
# List built-in policies
Get-AzPolicyDefinition -BuiltIn | Select-Object -Property DisplayName, Name
# Create a policy assignment
$policy = Get-AzPolicyDefinition -Name "e56962a6-4747-49cd-b67b-bf8b01975c4c"
New-AzPolicyAssignment `
-Name "deny-non-brazil" `
-DisplayName "Deny resources outside Brazil South" `
-Scope "/subscriptions/<sub-id>" `
-PolicyDefinition $policy `
-PolicyParameterObject @{
listOfAllowedLocations = @("brazilsouth")
}
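# Create assignment with Managed Identity for DINE
# ($dinePolicy must hold a DeployIfNotExists definition, not the location policy above)
$dinePolicy = Get-AzPolicyDefinition -Name "<policy-definition-name>"
$assignment = New-AzPolicyAssignment `
-Name "deploy-diagnostics" `
-Scope "/subscriptions/<sub-id>/resourceGroups/rg-prod" `
-PolicyDefinition $dinePolicy `
-IdentityType "SystemAssigned" `
-Location "brazilsouth"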
# View non-compliant resources
Get-AzPolicyState `
-ResourceGroupName "rg-production" `
-Filter "ComplianceState eq 'NonCompliant'" |
Select-Object ResourceId, PolicyAssignmentName, ComplianceState
# Create remediation task
Start-AzPolicyRemediation `
-Name "remediate-tags" `
-PolicyAssignmentId "/subscriptions/<sub-id>/providers/Microsoft.Authorization/policyAssignments/tag-policy" `
-ResourceGroupName "rg-production"
Custom Policy Definitions in JSON/Bicep
When no built-in policy meets the need, you create a custom policy definition. The JSON structure of a policy has the following main sections:
{
  "mode": "Indexed",
  "policyRule": {
    "if": {
      "allOf": [
        {
          "field": "type",
          "equals": "Microsoft.Compute/virtualMachines"
        },
        {
          "field": "tags['CostCenter']",
          "exists": "false"
        }
      ]
    },
    "then": {
      "effect": "[parameters('effect')]"
    }
  },
  "parameters": {
    "effect": {
      "type": "String",
      "defaultValue": "Audit",
      "allowedValues": ["Audit", "Deny", "Disabled"]
    }
  }
}
The mode property defines which types of resources are evaluated:
| Mode | What it evaluates |
|---|---|
| All | All resource types, including resource groups and subscriptions |
| Indexed | Only resource types that support tags and location (recommended for tag and location policies) |
| Microsoft.KeyVault.Data | Key Vault certificates, keys, and secrets |
| Microsoft.Kubernetes.Data | Kubernetes objects in AKS clusters |
Using Indexed instead of All when the policy is about tags or location avoids false positives on resource types that don't support these properties.
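To see how the `if` block of the JSON definition above selects resources, here is a small evaluator sketch. It supports only the `allOf`, `anyOf`, `not`, `equals`, and `exists` constructs and a simplified field lookup; the real policy language has many more operators, and the helper names `get_field` and `matches` are hypothetical.

```python
def get_field(resource, field):
    """Resolve 'type', 'location', or tags['Name'] on a flat resource dict."""
    if field.startswith("tags['") and field.endswith("']"):
        return resource.get("tags", {}).get(field[6:-2])
    return resource.get(field)


def matches(cond, resource):
    """True when the resource satisfies the condition (the effect would then apply)."""
    if "allOf" in cond:
        return all(matches(c, resource) for c in cond["allOf"])
    if "anyOf" in cond:
        return any(matches(c, resource) for c in cond["anyOf"])
    if "not" in cond:
        return not matches(cond["not"], resource)
    value = get_field(resource, cond["field"])
    if "equals" in cond:
        return isinstance(value, str) and value.lower() == cond["equals"].lower()
    if "exists" in cond:
        wanted = str(cond["exists"]).lower() == "true"
        return (value is not None) == wanted
    raise ValueError("unsupported condition")


rule = {
    "allOf": [
        {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
        {"field": "tags['CostCenter']", "exists": "false"},
    ]
}
untagged_vm = {"type": "Microsoft.Compute/virtualMachines", "tags": {}}
tagged_vm = {"type": "Microsoft.Compute/virtualMachines", "tags": {"CostCenter": "CC-1001"}}
print(matches(rule, untagged_vm), matches(rule, tagged_vm))
```

Only the VM without the CostCenter tag satisfies both conditions, so only that resource would receive the configured effect.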
Creating a custom policy via Bicep:
// Policy definitions cannot be deployed at resource group scope
targetScope = 'subscription'

resource customPolicy 'Microsoft.Authorization/policyDefinitions@2021-06-01' = {
  name: 'require-costcenter-tag'
  properties: {
    displayName: 'Require CostCenter tag on VMs'
    policyType: 'Custom'
    mode: 'Indexed'
    parameters: {
      effect: {
        type: 'String'
        defaultValue: 'Audit'
        allowedValues: [
          'Audit'
          'Deny'
          'Disabled'
        ]
      }
    }
    policyRule: {
      if: {
        allOf: [
          {
            field: 'type'
            equals: 'Microsoft.Compute/virtualMachines'
          }
          {
            field: 'tags[\'CostCenter\']'
            exists: 'false'
          }
        ]
      }
      then: {
        effect: '[parameters(\'effect\')]'
      }
    }
  }
}
7. Control and Security
Who can manage policies
Managing Policy Definitions and Assignments requires specific RBAC permissions:
| Action | Required Role |
|---|---|
| Create/edit Policy Definitions | Resource Policy Contributor or Owner |
| Create Policy Assignments | Resource Policy Contributor or Owner |
| View compliance | Reader (view only) |
| Create Remediation Tasks | Resource Policy Contributor + permissions on target resources |
Managed Identity and DINE security
Policies with DeployIfNotExists or Modify effects need a Managed Identity to execute corrective actions. There are two types:
System-assigned Managed Identity: created automatically along with the policy assignment and deleted when the assignment is removed. Recommended for most cases.
User-assigned Managed Identity: created separately and reusable across multiple assignments. Useful when you want precise control over permissions or to reuse the identity.
In both cases, the Managed Identity needs to have adequate permissions (via RBAC) to execute the actions that the policy defines. Microsoft recommends following the principle of least privilege: grant only the permissions necessary for the specific actions of the policy.
Exemptions
Unlike exclusions (notScopes), Exemptions are a more sophisticated feature that allows marking a specific resource as exempt from a policy for a period of time or permanently, with recorded justification.
az policy exemption create \
--name "vm-legacy-exemption" \
--display-name "Legacy VM exemption - ticket #12345" \
--scope "/subscriptions/<sub-id>/resourceGroups/rg-prod/providers/Microsoft.Compute/virtualMachines/vm-legacy-01" \
--policy-assignment "/subscriptions/<sub-id>/providers/Microsoft.Authorization/policyAssignments/require-tags" \
--exemption-category "Waiver" \
--expires-on "2026-12-31"
The exemption categories are:
| Category | When to use |
|---|---|
| Waiver | The resource cannot be corrected (legacy, technically impossible) |
| Mitigated | The requirement is met by an external compensating control |
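A sketch of how an exemption with an expiration date might be checked. The record layout (`scope`, `policyAssignmentId`, `expiresOn`, `category`) is an assumption modelled on the CLI arguments above, and the IDs are shortened placeholders.

```python
from datetime import datetime, timezone


def is_exempt(resource_id, assignment_id, exemptions, now):
    """True if an unexpired exemption for this assignment covers the resource."""
    rid = resource_id.lower()
    for ex in exemptions:
        scope = ex["scope"].lower()
        covers = rid == scope or rid.startswith(scope + "/")
        same_assignment = ex["policyAssignmentId"].lower() == assignment_id.lower()
        active = ex.get("expiresOn") is None or now < ex["expiresOn"]
        if covers and same_assignment and active:
            return True
    return False


VM = "/subscriptions/sub/resourceGroups/rg-prod/providers/Microsoft.Compute/virtualMachines/vm-legacy-01"
ASSIGNMENT = "/subscriptions/sub/providers/Microsoft.Authorization/policyAssignments/require-tags"
exemptions = [{
    "scope": VM,
    "policyAssignmentId": ASSIGNMENT,
    "category": "Waiver",
    "expiresOn": datetime(2026, 12, 31, tzinfo=timezone.utc),
}]

print(is_exempt(VM, ASSIGNMENT, exemptions, datetime(2026, 6, 1, tzinfo=timezone.utc)))
print(is_exempt(VM, ASSIGNMENT, exemptions, datetime(2027, 1, 1, tzinfo=timezone.utc)))
```

Once the expiration date has passed, the resource is evaluated against the assignment again like any other.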
8. Decision Making
Effect selection
| Situation | Recommended Effect | Reason |
|---|---|---|
| Prohibit resources in unauthorized regions | Deny | Absolute prevention, no exceptions |
| Identify VMs without tags, without blocking | Audit | Initial governance phase, don't impact production |
| Add a default tag if it doesn't exist | Modify | Automatically corrects on creation and via remediation |
| Ensure every VM has monitoring extension | DeployIfNotExists | Creates related resource that doesn't exist |
| Check if VM has backup but don't create automatically | AuditIfNotExists | Alert without automating (decision requires human) |
| Disable policy temporarily | Disabled | Avoids deleting and recreating the policy |
| Force property on resource (not tag) | Append | Adds to existing property array |
When to use Policy vs. RBAC
| Need | Tool | Reason |
|---|---|---|
| Prevent user X from creating resources | RBAC | Control of who can act |
| Prevent VMs from being created without tags | Policy | Control of how resources can exist |
| Ensure every VM has encryption | Policy (Deny/Modify) | Infrastructure state control |
| Restrict access to data in storage | RBAC (DataActions) | Data access control |
| Prohibit resource creation in wrong region | Policy (Deny) | Infrastructure rule, not identity |
When to use built-in vs. custom policy
| Situation | Choice | Reason |
|---|---|---|
| Common security requirement (HTTPS, encryption, MFA) | Built-in | Tested, maintained by Microsoft, free |
| Standard regulatory requirement (ISO, CIS, NIST) | Built-in Initiative | Complete ready package |
| Company-specific business rule | Custom | No built-in covers the case |
| Custom naming convention | Custom | Proprietary rule |
| Required tag with specific allowed values | Custom | Built-in parameters may not be sufficient |
9. Best Practices
Start with Audit, migrate to Deny. Never implement a Deny policy directly in production without first evaluating impact. Assign with Audit effect first, analyze the non-compliance report for a few days, fix existing resources, and then change to Deny.
Parameterize everything that can vary. Policies should be generic and configured via parameters in assignments. A single policy definition "Require specific tag" with name and value parameters is more sustainable than 10 policy definitions for 10 different tags.
Use initiatives for logical grouping. Group related policies in initiatives even if you only have 3 or 4 policies. This facilitates compliance reporting and future expansions.
Never assign policies at Resource level. The most restrictive recommended scope for assignment is Resource Group. Assignments at individual Resource level are difficult to manage and scale.
Document exclusions and exemptions. Every exclusion (notScope) or Exemption should have documented justification, a responsible owner and, when possible, an expiration date. Permanent exemptions without justification are compliance risks.
Use Managed Identity with least privilege for DINE. The policy's Managed Identity should have only the permissions necessary for the specific action that the policy executes, in the most restrictive scope possible.
Prefer System-assigned Managed Identity for simplicity. For most cases, System-assigned MI is sufficient and has lifecycle tied to the assignment, eliminating the risk of orphaned identities.
Keep custom policies versioned. Store custom policy definitions in a Git repository and use CI/CD for deployment. Changes to policy definitions have wide impact and need review and traceability.
10. Common Errors
| Error | Why it happens | How to avoid |
|---|---|---|
| Apply Deny directly in production | Overconfidence in policy | Always start with Audit, analyze, then Deny |
| Forget to give permissions to DINE Managed Identity | Policy created but remediation fails silently | Always check remediation task status |
| Use mode: All in tag policy | Evaluates types without tag support, generates non-compliant on system resources | Use mode: Indexed for tag and location policies |
| Confuse Assignment exclusion (notScopes) with Exemption | Both exclude scopes, but Exemption is per resource and traceable | Use Exemption when you need audit trail of exception |
| Create policy with always-true condition | Error in policyRule logic, affects all resources | Test policy with Audit before Deny |
| Not use parameters and create duplicate policies | Design flaw; one policy per tag, per region, etc. | Parameterize from the beginning |
| Believe that remediation is automatic for Modify/DINE | Remediation doesn't happen automatically for existing resources | Create Remediation Tasks explicitly for existing resources |
| Not check Non-compliant resources before Deny | When changing from Audit to Deny, legitimate operations may be blocked | Always resolve non-compliance before activating Deny |
11. Operation and Maintenance
Compliance dashboard
The Policy > Compliance page in the portal shows:
- Overall compliance percentage: percentage of compliant resources
- Non-compliant resources: detailed list with links to each resource
- Non-compliant policies: which policies have the most violations
- Compliance by initiative: aggregated view by initiative
You can filter by scope, initiative, individual policy, and state.
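The overall percentage can be approximated from a list of per-resource states. This sketch assumes that compliant and exempt resources both count toward the percentage, which may not match the portal's exact formula; `compliance_percentage` is a hypothetical helper.

```python
from collections import Counter


def compliance_percentage(states):
    """Share of resources that are Compliant or Exempt, as a percentage."""
    counts = Counter(states)
    total = sum(counts.values())
    if total == 0:
        return None  # nothing evaluated yet
    passing = counts["Compliant"] + counts["Exempt"]
    return round(100 * passing / total, 1)


print(compliance_percentage(["Compliant", "Compliant", "NonCompliant", "Exempt"]))
```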
Query compliance via CLI
# Subscription compliance summary
az policy state summarize \
--subscription "<sub-id>"
# List non-compliant resources in an RG
az policy state list \
--resource-group "rg-producao" \
--filter "complianceState eq 'NonCompliant'" \
--select "resourceId, policyAssignmentName, policyDefinitionName" \
--output table
# View details of ongoing remediation task
az policy remediation show \
--name "remediate-tags" \
--resource-group "rg-producao"
# List deployments created by remediation tasks (for DINE)
az policy remediation deployment list \
--name "remediate-monitoring" \
--resource-group "rg-producao"
Important limits
| Limit | Value |
|---|---|
| Custom policy definitions per scope | 500 |
| Custom initiative definitions per scope | 200 |
| Policy assignments per scope | 200 |
| Parameters per policy definition | 20 |
| Parameters per initiative | 100 |
| Conditions per policy rule | 512 |
The limit of 500 custom policy definitions is sufficient for most organizations, but should be monitored in large tenants with many teams creating policies independently. Policy governance (who can create, approval process) is as important as the policies themselves.
Monitor policy changes
Like role assignments, policy assignment changes are logged in the Activity Log:
az monitor activity-log list \
--namespace "Microsoft.Authorization" \
--query "[?operationName.value=='Microsoft.Authorization/policyAssignments/write']" \
--output table
12. Integration and Automation
Azure Policy as part of a Landing Zone pipeline
In organizations using the Landing Zone pattern (like Microsoft's Azure Landing Zone), policies are applied in layers via Management Groups.
Terraform integration
# Custom policy definition
resource "azurerm_policy_definition" "require_tag" {
  name         = "require-costcenter-tag"
  policy_type  = "Custom"
  mode         = "Indexed"
  display_name = "Require CostCenter tag"
  policy_rule = jsonencode({
    if = {
      allOf = [
        { field = "type", equals = "Microsoft.Compute/virtualMachines" },
        { field = "tags['CostCenter']", exists = "false" }
      ]
    }
    then = { effect = "Deny" }
  })
}
# Policy assignment
resource "azurerm_resource_group_policy_assignment" "enforce_tag" {
  name                 = "enforce-costcenter-tag"
  resource_group_id    = azurerm_resource_group.prod.id
  policy_definition_id = azurerm_policy_definition.require_tag.id
}
Azure DevOps / GitHub Actions integration
Compliance can be checked from a pipeline using the Azure Policy Compliance Scan action, which runs after the workflow has signed in to Azure:
# GitHub Actions
- name: Azure login
  uses: azure/login@v1
  with:
    creds: ${{ secrets.AZURE_CREDENTIALS }}
- name: Azure Policy Compliance Scan
  uses: azure/policy-compliance-scan@v0
  with:
    scopes: |
      /subscriptions/<sub-id>/resourceGroups/rg-prod
    wait: true
This allows a pipeline to fail if non-compliant resources are detected after a deployment, creating an automated compliance gate.
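The same kind of gate can be sketched in a few lines for pipelines that call the CLI directly. The sketch assumes the input is JSON shaped like the output of `az policy state list --output json`, with `resourceId` and `complianceState` fields; the `compliance_gate` helper and the sample data are hypothetical.

```python
import json


def compliance_gate(policy_states_json):
    """Return (exit_code, failing_resource_ids) for a JSON list of policy states."""
    states = json.loads(policy_states_json)
    failing = sorted(
        {s["resourceId"] for s in states if s.get("complianceState") == "NonCompliant"}
    )
    return (1 if failing else 0), failing


sample = json.dumps([
    {"resourceId": "/subscriptions/sub/resourceGroups/rg-prod/providers/x/vm-01", "complianceState": "NonCompliant"},
    {"resourceId": "/subscriptions/sub/resourceGroups/rg-prod/providers/x/vm-02", "complianceState": "Compliant"},
])
exit_code, failing = compliance_gate(sample)
print(exit_code, failing)
```

In a real pipeline step, the returned code would be passed to `sys.exit` so that any non-compliant resource fails the job.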
Microsoft Defender for Cloud integration
Defender for Cloud (formerly Azure Security Center) uses Azure Policy underneath. Defender's security recommendations are, in practice, policies with AuditIfNotExists effect. When enabling Defender for Cloud, a security initiative is automatically assigned to the subscription.
This means that when viewing recommendations in Defender for Cloud, you're seeing Azure Policy evaluation results. You can view and manage these policies directly in the Policy portal.
13. Final Summary
Essential points:
- Azure Policy controls what can exist and how it should be configured in infrastructure, complementing RBAC which controls who can act
- Every policy has three parts: condition (if), effect (then) and optional parameters
- Preventive effects (Deny, Append, Modify) act synchronously in ARM operation; informative and corrective effects (Audit, AINE, DINE) act asynchronously
- Policies are inherited top-down in the scope hierarchy, just like RBAC
- Resources existing before policy assignment are not blocked; they appear as Non-compliant and need Remediation Tasks for correction
Critical differences:
- Deny vs. Audit: Deny blocks on operation; Audit logs without blocking
- Modify vs. Append: Modify replaces existing values; Append only adds without modifying what exists
- DINE vs. AINE: DINE deploys the missing related resource; AINE only audits the absence
- Exclusion (notScopes) vs. Exemption: notScopes excludes an entire scope from assignment; Exemption exempts a specific resource with traceable justification
- Mode All vs. Indexed: All evaluates all resource types; Indexed evaluates only types that support tags and location
What needs to be remembered for AZ-104:
- The correct sequence is: Audit first, Deny after validating impact
- Policies with DINE and Modify effects need Managed Identity with adequate permissions
- Remediation Tasks must be created explicitly to fix existing non-compliant resources
- The Policy > Compliance portal is the main dashboard to check compliance status
- az policy state trigger-scan forces immediate reevaluation without waiting for the 24-hour cycle
- Initiatives are policy groupings assignable as a unit, with consolidated compliance reporting
- The limit of assignments per scope is 200, and the limit of custom policy definitions per scope is 500