Troubleshooting Lab: Design an Azure Firewall Deployment

Diagnostic Scenarios

Scenario 1: Root Cause

A network team deployed Azure Firewall Standard in the AzureFirewallSubnet subnet of the hub VNet vnet-hub-eastus. The spoke VNets vnet-app and vnet-data are peered with the hub VNet. A UDR applied to the subnets of both spoke VNets redirects 0.0.0.0/0 to the firewall's private IP.

After deployment, administrators report that VMs in the spoke VNets can communicate with the internet normally, but traffic between the two spoke VNets is being dropped without any explicit error message.

The administrator verifies the rules configured in Azure Firewall and finds the following:

Network rules:
Name: Allow-Internet-Outbound
Source: 10.0.0.0/8
Destination: *
Protocol: TCP
Port: 80,443
Action: Allow

Application rules: none configured
NAT rules: none configured

Additional information collected:

  • Peering between vnet-hub-eastus and the spoke VNets has Allow forwarded traffic enabled
  • Azure Firewall is in the Running state and its throughput is within the expected limit
  • The Log Analytics Workspace was associated with the firewall two weeks ago
  • The AzureFirewallSubnet subnet is sized /26

What is the root cause of the observed problem?

A. Peering between the spoke VNets and hub VNet is configured incorrectly because the Use remote gateways option is not enabled.

B. There is no network rule in Azure Firewall that explicitly allows traffic between spoke VNet prefixes, and the firewall's default behavior is to deny everything that has not been allowed.

C. The Log Analytics Workspace was recently associated and is not yet capturing denied flow logs, preventing proper diagnosis.

D. The AzureFirewallSubnet subnet with /26 size is insufficient for Azure Firewall Standard and is causing packet drops.


Scenario 2: Diagnostic Sequence

An engineer receives a ticket: Azure Firewall is allowing traffic that should be blocked by a newly created application rule. The traffic in question is HTTP to a specific domain from a VM in the spoke VNet.

The available steps for investigation are:

  • (P) Check Azure Firewall logs (category AzureFirewallApplicationRule) to see if traffic appears as allowed or denied and which rule was applied
  • (Q) Confirm that the VM's subnet UDR is redirecting traffic to the firewall's private IP
  • (R) Check the priority of the rule collection containing the blocking rule compared to other application rule collections
  • (S) Confirm that the application rule is configured with the correct FQDN and HTTP protocol on port 80
  • (T) Test connectivity from the VM to the blocked domain using curl or a browser

What is the correct investigation sequence?

A. T → P → Q → S → R

B. Q → T → P → R → S

C. P → Q → S → T → R

D. T → Q → P → S → R


Scenario 3: Root Cause

The security team configured a Firewall Policy in Azure Firewall Premium with an application rule to inspect and block outbound TLS traffic to the site category classified as Malware. The environment has the following characteristics:

  • Azure Firewall Premium is deployed in the hub VNet
  • Firewall Policy has TLS Inspection enabled
  • A self-signed root certificate was created for TLS inspection and stored in Key Vault
  • The application rule points to the Malware category with Deny action
  • VMs in spoke VNets continue to access sites classified as malware without blocking

The operations team checks the logs and finds the following:

Category:    AzureFirewallApplicationRule
Action: Allow
Rule: Default-Allow-Web
RuleCollection: AllowWeb-Collection
TargetUrl: http://known-malware-site.example.com
Protocol: Http:80

The security administrator mentions that the root certificate was imported to Key Vault three days ago. Firewall throughput is within expected range. The firewall's managed identity has access to Key Vault.

What is the root cause of the problem?

A. The self-signed root certificate was recently imported to Key Vault and has not yet synchronized with Azure Firewall Premium.

B. The traffic recorded in the logs uses HTTP on port 80, which is not TLS traffic. TLS inspection and malware category rules apply only to HTTPS traffic, so another Allow rule is being applied first.

C. The firewall's managed identity does not have sufficient permissions in Key Vault to access the certificate, causing TLS inspection to fail silently and traffic to be allowed.

D. Firewall Policy with TLS Inspection enabled is incompatible with threat category rules in Azure Firewall Premium, and these two features cannot be used simultaneously.


Scenario 4: Action Decision

The architecture team identified that the Azure Firewall deployed in production is on the Standard SKU and needs to be migrated to the Premium SKU to enable TLS inspection and threat category filtering. The firewall is in active operation, processing all outbound traffic from 12 spoke VNets. The migration needs to occur with the smallest possible downtime window.

The cause is identified: the Standard SKU does not support the required features. Additional context is:

  • There is a Firewall Policy of Standard type already associated with the current firewall
  • A separate staging environment with identical configuration is available
  • The Change Management team requires an approved maintenance window for any change affecting the data plane
  • The network team has full Azure access and can execute the migration

What is the correct action to take at this time?

A. Delete the current Azure Firewall Standard, provision a new Azure Firewall Premium in the same place, and reassign all spoke VNet UDRs to the new Premium firewall private IP.

B. Update the existing Azure Firewall SKU directly through the Azure portal from Standard to Premium, maintaining the same IP and UDRs without interruption.

C. Migrate the existing Firewall Policy from Standard to Premium type, associate it with the current firewall, and wait for automatic synchronization of new features.

D. Provision a new Azure Firewall Premium in parallel, migrate the Firewall Policy to Premium type, validate in the staging environment, and plan the cutover with approved maintenance window by updating UDRs at transition time.


Answer Key and Explanations

Answer Key: Scenario 1

Answer: B

Azure Firewall operates with an implicit deny policy: any traffic that is not explicitly allowed by a network, application, or NAT rule is dropped. The only existing rule, Allow-Internet-Outbound, allows TCP traffic on ports 80 and 443, which is enough for web access to the internet. No rule is scoped to traffic between the spoke VNet prefixes, such as from 10.1.0.0/16 to 10.2.0.0/16, so east-west flows on any other protocol or port (for example ICMP or a database port) match nothing. This traffic reaches the firewall via the UDR and is silently dropped for lack of a matching rule.

The confirmatory clue is the absence of rules covering east-west communication between spoke VNets. Internet traffic works because a rule exists for it; spoke-to-spoke traffic fails because no rule exists for it.
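The implicit-deny behavior can be illustrated with a small simulation. This is a minimal sketch and not Azure Firewall's actual rule engine: the `NetworkRule` type, the `evaluate` function, the example IP addresses, the SQL port 1433, and the `Allow-Spoke-To-Spoke` rule are all hypothetical, and the spoke prefixes are the illustrative 10.1.0.0/16 and 10.2.0.0/16 used in the explanation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class NetworkRule:
    name: str
    source: str        # CIDR prefix
    destination: str   # CIDR prefix, or "*" for any destination
    protocol: str
    ports: frozenset


def evaluate(rules, src_ip, dst_ip, protocol, port):
    """Return the name of the first matching allow rule, or the implicit deny."""
    for rule in rules:
        src_ok = ip_address(src_ip) in ip_network(rule.source)
        dst_ok = rule.destination == "*" or ip_address(dst_ip) in ip_network(rule.destination)
        if src_ok and dst_ok and protocol == rule.protocol and port in rule.ports:
            return rule.name
    return "Deny (implicit)"


# The single rule from the scenario.
rules = [
    NetworkRule("Allow-Internet-Outbound", "10.0.0.0/8", "*", "TCP", frozenset({80, 443})),
]

# Web traffic to an internet address matches the rule.
internet = evaluate(rules, "10.1.0.4", "93.184.216.34", "TCP", 443)

# Assumed east-west flow: SQL (TCP 1433) from one spoke to the other.
spoke_before = evaluate(rules, "10.1.0.4", "10.2.0.4", "TCP", 1433)

# Hypothetical fix: an explicit rule between the two spoke prefixes.
fixed_rules = rules + [
    NetworkRule("Allow-Spoke-To-Spoke", "10.1.0.0/16", "10.2.0.0/16", "TCP", frozenset({1433})),
]
spoke_after = evaluate(fixed_rules, "10.1.0.4", "10.2.0.4", "TCP", 1433)

print(internet, spoke_before, spoke_after, sep="\n")
```

In this model the spoke-to-spoke flow falls through to the implicit deny until a rule covering its prefixes, protocol, and port is added.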

Irrelevant information: The Log Analytics Workspace associated two weeks ago does not affect firewall behavior. The /26 subnet is the minimum recommended size for Azure Firewall and is correct. Peering with Allow forwarded traffic is adequate.

The most dangerous distractor is A, which leads the administrator to modify peering configurations that are already correct, wasting time while the real problem, the missing network rule, remains unsolved.


Answer Key: Scenario 2

Answer: B

The correct sequence is Q → T → P → R → S:

  1. Before any testing, you need to confirm that VM traffic is actually passing through the firewall via UDR (Q). If the UDR is missing or incorrect, the firewall will never see the traffic and any investigation into rules will be useless.
  2. With the route confirmed, reproduce the behavior (T) to ensure the problem still occurs and generate log entries.
  3. Check the logs (P) to verify which rule was actually applied to the traffic.
  4. With the rule name and collection identified in logs, check priority (R) to understand if another higher-priority collection is allowing traffic before the blocking rule.
  5. Confirm the exact rule configuration (S) to validate FQDN and protocol.

Alternative A skips UDR verification and goes straight to testing, which can generate misleading results. Alternative C starts with logs without first reproducing the problem, which may result in logs without recent entries. Alternative D checks UDR after testing, inverting the logical order.
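Step R, the priority check, rests on the idea that rule collections are evaluated in priority order and the first match decides the outcome. The following is a minimal sketch of that idea, assuming that a lower priority number is evaluated first. The `AppRuleCollection` type, the collection names, the priority values, and the domain `blocked.example.com` are hypothetical and do not come from the ticket.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class AppRuleCollection:
    name: str
    priority: int   # lower number is evaluated first (assumption of this sketch)
    action: str     # "Allow" or "Deny"
    fqdns: tuple    # target FQDN patterns; "*" matches any host


def first_match(collections, fqdn):
    """Return (collection name, action) of the first collection that matches."""
    for coll in sorted(collections, key=lambda c: c.priority):
        if any(fnmatch(fqdn, pattern) for pattern in coll.fqdns):
            return coll.name, coll.action
    return "(no match)", "Deny"


# A broad allow collection sits ahead of the newly created blocking collection.
collections = [
    AppRuleCollection("Allow-All-Web", 200, "Allow", ("*",)),
    AppRuleCollection("Block-Domain", 300, "Deny", ("blocked.example.com",)),
]
before = first_match(collections, "blocked.example.com")

# Giving the blocking collection a lower number lets it win.
reordered = [
    AppRuleCollection("Allow-All-Web", 200, "Allow", ("*",)),
    AppRuleCollection("Block-Domain", 100, "Deny", ("blocked.example.com",)),
]
after = first_match(reordered, "blocked.example.com")

print(before)
print(after)
```

This is why the rule name and collection reported in the logs (P) are needed before checking priority: they identify which collection actually matched.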


Answer Key: Scenario 3

Answer: B

The log records site access as Protocol: Http:80, that is, pure HTTP traffic on port 80, not HTTPS. Azure Firewall Premium's TLS inspection intercepts and inspects only TLS/HTTPS traffic. Threat category rules that depend on content inspection also operate on HTTPS traffic. HTTP traffic on port 80 does not go through the TLS inspection chain and is evaluated only by common application rules without content inspection. Since there is a higher-priority Allow rule covering HTTP traffic, access is allowed before any category analysis.

The confirmatory clue is in the Protocol: Http:80 field in the logs, which reveals that the traffic is not TLS and therefore would never be intercepted by TLS Inspection functionality.
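The check on the Protocol field can be sketched as a small helper that reads a log entry and reports whether the flow was TLS at all. This is only an illustration: the dictionary mirrors the fields shown in the scenario's log excerpt, the `parse_protocol` and `is_tls` helpers are hypothetical, and the `Https:443` value is assumed by analogy with `Http:80`.

```python
def parse_protocol(field):
    """Split a log Protocol value such as 'Http:80' into (scheme, port)."""
    scheme, _, port = field.partition(":")
    port = port.strip()
    return scheme.strip().lower(), (int(port) if port else None)


def is_tls(entry):
    """True when the logged flow used HTTPS, so TLS inspection could apply to it."""
    scheme, _ = parse_protocol(entry["Protocol"])
    return scheme == "https"


# Fields copied from the scenario's log excerpt.
entry = {
    "Action": "Allow",
    "Rule": "Default-Allow-Web",
    "RuleCollection": "AllowWeb-Collection",
    "TargetUrl": "http://known-malware-site.example.com",
    "Protocol": "Http:80",
}

if not is_tls(entry):
    print(f"{entry['TargetUrl']} was plain HTTP; matched rule: {entry['Rule']}")
```

Running the same check on a batch of entries separates flows that never touched the TLS inspection chain from those that did.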

Irrelevant information: The certificate import date, throughput, and managed identity permissions are details that do not influence diagnosis, as the problem does not involve TLS.

The most dangerous distractor is C, which leads the administrator to investigate Key Vault permissions and possibly grant unnecessary access, when the real cause is the non-TLS nature of the traffic.


Answer Key: Scenario 4

Answer: D

The correct action is to provision Azure Firewall Premium in parallel, validate in the staging environment, and execute cutover in an approved maintenance window. This approach meets all scenario constraints: minimizes downtime by keeping the current firewall operational until transition time, respects the Change Management process, and allows prior validation in staging before affecting production.

Alternative A requires deleting the existing firewall before provisioning the new one, which implies total interruption of all traffic from the 12 spoke VNets during provisioning. Alternative B is technically invalid: Azure Firewall does not support in-place SKU upgrade from Standard to Premium; these are distinct resources that need to be created separately. Alternative C is also invalid: a Standard type Firewall Policy cannot be converted to Premium; you need to create a new Premium policy and reconfigure the rules.

The central reasoning error in distractors A and B is assuming there is an in-place migration path when, in practice, the transition between SKUs requires deploying a new resource.


Troubleshooting Tree: Design an Azure Firewall Deployment

Color legend:

Color       Node type
Dark blue   Initial symptom (root)
Blue        Diagnostic question
Red         Identified cause
Green       Recommended action or resolution
Orange      Intermediate validation or verification

To use the tree when facing a real problem, start with the root node and answer each question based on what you observe in the environment. The mandatory first step is to confirm that traffic is actually reaching the firewall via UDR, as without this confirmation any investigation into rules may be irrelevant. Next, check logs before any changes, as they reveal which rule was applied and in which direction. Follow the path corresponding to the observed behavior: traffic blocked unexpectedly points to missing rules or SKU incompatibility; traffic allowed unexpectedly points to collection priority issues, protocol not covered by inspection, or failure in the TLS Inspection chain.
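The order of checks described above can be sketched as a small decision function. This is a simplified reading of the paragraph, not a reproduction of the tree diagram; the function name, its parameters, and the wording of the returned steps are hypothetical.

```python
def next_step(udr_to_firewall, logged_action, expected_action):
    """Suggest the next check, following the order: route, logs, then rules.

    udr_to_firewall: True if the subnet's UDR sends traffic to the firewall.
    logged_action:   "Allow", "Deny", or None if no log entry was found yet.
    expected_action: "Allow" or "Deny", what the design intends for this flow.
    """
    if not udr_to_firewall:
        return "Fix the UDR so traffic reaches the firewall"
    if logged_action is None:
        return "Reproduce the traffic and check the firewall logs"
    if logged_action == expected_action:
        return "Firewall behaves as intended; investigate elsewhere"
    if logged_action == "Deny":
        # Traffic blocked unexpectedly.
        return "Check for a missing allow rule or a feature the SKU lacks"
    # Traffic allowed unexpectedly.
    return "Check collection priority, protocol coverage, and the TLS inspection chain"


# Scenario 3 expressed in these terms: traffic was allowed but should be denied.
print(next_step(True, "Allow", "Deny"))
```

The route check comes first in the function for the same reason it comes first in the tree: without it, the log and rule checks may describe traffic the firewall never saw.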