Troubleshooting Lab: Create and implement an Azure Firewall deployment

Diagnostic Scenarios

Scenario 1 — Root Cause

An operations team deployed an Azure Firewall in a hub VNet called vnet-hub-eastus with the following addressing:

VNet hub:           10.0.0.0/16
AzureFirewallSubnet: 10.0.1.0/27
GatewaySubnet:       10.0.2.0/27

After provisioning, the firewall shows a Failed status in the Azure portal. The team verifies that the subscription has sufficient quota for public IPs and that the East US region is available. An engineer mentions that the same ARM template was used successfully in another subscription two months ago. The account used for deployment has the Contributor role on the resource group. Activity logs show:

OperationName: Create or Update Azure Firewall
Status: Failed
Error Code: DeploymentFailed
Message: Subnet 'AzureFirewallSubnet' size /27 is too small.
         Minimum required subnet size is /26.
SubnetId: /subscriptions/.../subnets/AzureFirewallSubnet

The team also reports that peering between vnet-hub-eastus and two spoke VNets was configured before the firewall deployment.

What is the root cause of the problem?

A) The peering with spoke VNets was configured before firewall provisioning, causing routing conflicts that prevent deployment.

B) The AzureFirewallSubnet subnet has a /27 prefix, which makes it smaller than the /26 minimum required by Azure Firewall.

C) The account with Contributor permission does not have sufficient privileges to provision network resources with associated public IP.

D) The ARM template is outdated and references an Azure Firewall API version that doesn't support the East US region.

Scenario 2 — Collateral Impact

An administrator found that the Azure Firewall in a hub VNet was passing all outbound traffic from spoke VNets because the associated policy defined no restrictive network or application rules. To resolve the excessive permissiveness, the administrator added a network rule with priority 100 that denies all traffic from any source to any destination on all ports.

The rule was successfully applied and confirmed in the portal. Unauthorized traffic ceased immediately.

What is the most likely secondary consequence of this action?

A) The total deny rule overrides existing DNAT rules, preventing inbound sessions via NAT from continuing to function.

B) Management traffic from the Azure Firewall itself to Microsoft control services is blocked, causing failure in the resource's health probe.

C) Legitimate outbound connections from spoke VNets, including access to services like OS updates and communication with Azure endpoints, are blocked along with unauthorized traffic.

D) The deny rule at priority 100 is ignored by firewall processing because application rules take precedence over network rules when the destination is an FQDN.

Scenario 3 — Action Decision

A security team identified that traffic between spoke VNets is not being properly inspected by the production Azure Firewall, because User Defined Routes (UDRs) in the spoke subnets still point to the private IP of a legacy third-party firewall that has been decommissioned. The root cause is formally documented: the UDRs associated with the spoke subnets reference a next hop that no longer exists.

The environment processes real-time financial transactions. There is no approved maintenance window. The change team requires formal approval with at least 48 hours advance notice for changes to production UDRs. There is an emergency route in the change management process that can be triggered with CISO approval within 2 hours.

What is the correct action to take at this time?

A) Immediately update UDRs in spoke subnets to point to the Azure Firewall private IP, as the risk of uninspected traffic outweighs the risk of an unapproved change.

B) Trigger the emergency route of the change management process, document the risk, obtain CISO approval, and execute the UDR correction within the formal process.

C) Create a new Route Table with correct routes and associate it with spoke subnets without going through the change process, as this is a new resource and not an alteration.

D) Wait for the next approved maintenance window and document the risk in the incident registry while traffic continues without proper inspection.

Scenario 4 — Root Cause

A company configured Azure Firewall in a hub-and-spoke architecture. VMs in spoke VNets can access the internet normally but cannot resolve DNS names for internal resources hosted on a private DNS server located at 10.10.5.4 in the infrastructure spoke VNet. The network team confirms that peering between all VNets is active and bidirectional, and that the DNS server at 10.10.5.4 responds correctly when queried directly from a VM in the same VNet.

The Azure Firewall is configured with the following DNS settings:

"dnsSettings": {
  "servers": [],
  "enableProxy": true
}

Firewall network rules allow UDP traffic on port 53 from any source to any destination. An engineer mentions that the 10.10.5.4 server was added to the environment three weeks ago as part of a DNS migration. VMs in spoke subnets are configured to use the Azure Firewall IP as DNS server.

What is the root cause of the problem?

A) Network rules allow only UDP on port 53, but larger DNS queries use TCP on port 53, which is blocked by the firewall.

B) Peering between spoke VNets does not allow transitive DNS traffic, blocking resolution between subnets of different VNets.

C) The firewall's DNS proxy is enabled, but the list of upstream DNS servers is empty, so the firewall forwards queries to the default Azure-provided DNS instead of the private server at 10.10.5.4.

D) VMs are configured to use the firewall IP as DNS, but the firewall does not support internal name resolution and forwards all queries directly to the internet.

Answer Key and Explanations

Answer Key — Scenario 1

Answer: B

The error message in the activity log points directly at the cause: the AzureFirewallSubnet subnet is too small. Azure Firewall requires AzureFirewallSubnet to be /26 or larger. Because a longer prefix means fewer addresses, /26 is the longest prefix accepted, and anything longer (such as /27 or /28) is rejected. The wording of the message can seem contradictory at first reading, since it expresses a size limit in terms of prefix length, but its meaning is clear: the subnet was created with a prefix longer than /26 and must be recreated with /26 or shorter.

The information about peering with spoke VNets is purposely irrelevant: peering does not interfere with firewall provisioning. The Contributor permission is sufficient to provision Azure Firewall. The ARM template API version is not indicated as problematic in any scenario data.

The most dangerous distractor is alternative A, as it leads the diagnosis to the network and routing layer, where the team can spend time investigating peering and route tables without ever finding the real cause, which is in the subnet configuration.
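
The size rule above can be checked before deployment. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of any Azure SDK, and the /26 limit is the one stated in this answer key.

```python
import ipaddress

# Longest prefix accepted for AzureFirewallSubnet, per the explanation above.
MAX_PREFIX_LENGTH = 26


def firewall_subnet_is_large_enough(cidr: str) -> bool:
    """Return True when the subnet is /26 or larger (prefix length <= 26)."""
    network = ipaddress.ip_network(cidr, strict=True)
    return network.prefixlen <= MAX_PREFIX_LENGTH


def subnet_fits_in_vnet(subnet_cidr: str, vnet_cidr: str) -> bool:
    """Return True when the subnet lies entirely inside the VNet address space."""
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    vnet = ipaddress.ip_network(vnet_cidr, strict=True)
    return subnet.subnet_of(vnet)


if __name__ == "__main__":
    for candidate in ("10.0.1.0/26", "10.0.1.0/27", "10.0.1.0/28"):
        verdict = "accepted" if firewall_subnet_is_large_enough(candidate) else "too small"
        print(f"{candidate}: {verdict}")
```

A check like this could run in a pipeline against template parameters, so that a subnet that is too small is caught before the deployment reaches Azure.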

Answer Key — Scenario 2

Answer: C

When a total deny rule is added with high priority (100) without exceptions, it indiscriminately blocks all outbound traffic processed by the firewall. This includes legitimate connections from VMs in spoke VNets to OS update endpoints, Azure services like Azure Monitor, Key Vault, Log Analytics, and any other service that depends on outbound connectivity. The intention was to block unauthorized traffic, but the rule scope is unrestricted.

Alternative A is incorrect: DNAT rules are processed before network rules, and a well-configured DNAT rule is not overridden by a network deny rule. Alternative B describes behavior that does not occur in Azure Firewall: management traffic from the firewall itself to Microsoft control infrastructure does not pass through user rules. Alternative D reverses the precedence logic: network rules are processed before application rules, not the other way around.

The collateral impact of alternative C is the most severe in practice, as it can silently and gradually interrupt critical VM operations in production, with symptoms that take time to be associated with the rule change.
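
The precedence described above (network rules before application rules, lower priority number first) can be illustrated with a toy model. This is a simplified sketch, not the Azure Firewall engine: it ignores DNAT, rule collections, and rule collection groups, and the rule names and the FQDN are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Sequence

# Evaluation order by rule type: network rules first, then application rules.
TYPE_ORDER = {"network": 0, "application": 1}


@dataclass(frozen=True)
class Rule:
    name: str
    rule_type: str    # "network" or "application"
    priority: int     # lower number is evaluated first within its type
    action: str       # "allow" or "deny"
    destination: str  # "*" matches any destination in this toy model


def evaluate(rules: Sequence[Rule], destination: str) -> str:
    """Return the action of the first matching rule, or the implicit deny."""
    ordered = sorted(rules, key=lambda r: (TYPE_ORDER[r.rule_type], r.priority))
    for rule in ordered:
        if rule.destination in ("*", destination):
            return f"{rule.action} ({rule.name})"
    return "deny (no matching rule)"


# Hypothetical rules: the FQDN and the rule names are placeholders.
allow_updates = Rule("allow-os-updates", "application", 200, "allow", "updates.example.com")
deny_all = Rule("deny-all", "network", 100, "deny", "*")

if __name__ == "__main__":
    print(evaluate([allow_updates], "updates.example.com"))
    print(evaluate([allow_updates, deny_all], "updates.example.com"))
```

In this model the blanket network deny matches before any application rule is consulted, whatever the application rule's priority. That mirrors the collateral impact in alternative C: legitimate destinations need explicit allow rules that are evaluated ahead of the deny.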

Answer Key — Scenario 3

Answer: B

The scenario imposes an explicit organizational constraint: changes to production UDRs require formal approval with 48 hours' advance notice, but an emergency mechanism allows CISO approval within 2 hours. The correct action is to use the process created precisely for situations like this one, where the risk is immediate and documented.

Alternative A ignores the change process without any authorization, exposing the team to disciplinary risk and the environment to an untracked change. Alternative C is an attempt to bypass the process using an invalid technical justification: associating a new Route Table to a subnet is functionally equivalent to altering the existing UDR and is subject to the same process restrictions. Alternative D is unacceptable because it documents the risk and waits passively while the problem persists in a financial transaction environment.

The emergency process exists to be used. Ignoring it in either direction, whether acting without approval or waiting without triggering the formal mechanism, represents governance failure.
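
Documenting the risk for the emergency request is easier with a concrete list of affected routes. The sketch below is one possible way to produce that list from exported route data. The Route structure, the route table names, and every IP address are hypothetical and do not come from the scenario.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Route:
    route_table: str
    address_prefix: str
    next_hop_type: str  # e.g. "VirtualAppliance" or "VnetLocal"
    next_hop_ip: str    # empty when the next hop type carries no IP


def stale_routes(routes: Iterable[Route], live_appliance_ips: Set[str]) -> List[Route]:
    """Return VirtualAppliance routes whose next hop is not a known live appliance."""
    return [
        r for r in routes
        if r.next_hop_type == "VirtualAppliance" and r.next_hop_ip not in live_appliance_ips
    ]


# Hypothetical inventory: 10.0.3.4 stands in for the decommissioned firewall,
# 10.0.1.4 for the Azure Firewall private IP.
routes = [
    Route("rt-spoke1", "0.0.0.0/0", "VirtualAppliance", "10.0.3.4"),
    Route("rt-spoke2", "10.20.0.0/16", "VirtualAppliance", "10.0.1.4"),
    Route("rt-spoke2", "10.30.0.0/16", "VnetLocal", ""),
]
live = {"10.0.1.4"}

if __name__ == "__main__":
    for route in stale_routes(routes, live):
        print(f"{route.route_table}: {route.address_prefix} -> {route.next_hop_ip} (stale)")
```

The output is evidence for the change request only. The correction itself still goes through the emergency approval described above.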

Answer Key — Scenario 4

Answer: C

The firewall configuration shows "servers": [] with "enableProxy": true. When DNS proxy is enabled but the upstream servers list is empty, Azure Firewall forwards DNS queries to the default Azure-provided DNS (168.63.129.16), which has no knowledge of the internal zones hosted on the DNS server at 10.10.5.4. VMs in spoke subnets use the firewall IP as their DNS server, so every query passes through the proxy and is answered only by Azure-provided DNS.

The information about adding the 10.10.5.4 server three weeks ago is irrelevant to the diagnosis: the server works correctly when queried directly, which eliminates any problem with the server itself. Alternative A is a plausible but incorrect distractor: DNS falls back to TCP/53 only when a response is too large for UDP (traditionally over 512 bytes), and most internal lookups complete over UDP/53, so a missing TCP rule would not explain a complete failure to resolve internal names. Alternative B is technically incorrect: hub-and-spoke peering with traffic routed through the firewall allows DNS queries to reach servers in other VNets when the proxy is correctly configured. Alternative D is incorrect because Azure Firewall's DNS proxy is designed precisely to forward queries to private DNS servers when properly configured.
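
The fallback behavior explained above can be sketched as a small function. This is an illustrative model of the described behavior, not Azure code: the function name is hypothetical, and the corrected settings simply add the scenario's server 10.10.5.4 to the servers list.

```python
import json

# Default resolver address named in the explanation above.
AZURE_PROVIDED_DNS = "168.63.129.16"


def effective_upstreams(dns_settings: dict) -> list:
    """Return the servers the DNS proxy would forward to in this model."""
    servers = dns_settings.get("servers") or []
    # An empty list means the proxy falls back to the Azure-provided resolver.
    return list(servers) if servers else [AZURE_PROVIDED_DNS]


# Settings from the scenario, and a corrected version pointing at the private server.
current = {"servers": [], "enableProxy": True}
corrected = {"servers": ["10.10.5.4"], "enableProxy": True}

if __name__ == "__main__":
    print(effective_upstreams(current))
    print(effective_upstreams(corrected))
    print(json.dumps({"dnsSettings": corrected}, indent=2))
```

With the corrected settings, queries sent by the VMs to the firewall IP are forwarded to 10.10.5.4, which can answer for the internal names.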

Troubleshooting Tree: Azure Firewall Creation and Implementation

Color legend:

Color	Node type
Dark blue	Initial symptom, investigation entry point
Blue	Closed and verifiable diagnostic question
Red	Precisely identified cause
Green	Recommended action or problem resolution
Orange	Intermediate validation before closing diagnosis

To use this tree when facing a real Azure Firewall problem, start with the root node by identifying whether the problem occurs during provisioning or after the firewall is already operational. This first decision defines the main investigation branch. From there, answer each question based only on what has been observed or directly verified, without skipping steps. Each branch ends with a named cause followed by a specific action, ensuring that correction is only applied after the diagnosis is confirmed.
