
Troubleshooting Lab: Implement a WAF Policy

Diagnostic Scenarios​

Scenario 1: Root Cause

A company deployed an Azure Front Door Premium with an associated WAF Policy to protect a global web application. The security team configured the policy with the Microsoft_DefaultRuleSet_2.1 ruleset and associated it with the Front Door endpoint. After deployment, the development team reports that the application works normally from internal access, but external customers in multiple countries receive HTTP 403 responses intermittently.

The analyst checks the logs and finds:

{
  "category": "FrontdoorWebApplicationFirewallLog",
  "properties": {
    "clientIP": "189.28.14.5",
    "requestUri": "/api/v2/search?q=<script>alert(1)</script>",
    "action": "Block",
    "ruleName": "Microsoft_DefaultRuleSet-2.1-XSS-942110",
    "policyMode": "Prevention",
    "trackingReference": "0H3MK2...",
    "host": "app.contoso.com"
  }
}

Additional information:

  • The WAF Policy was created in Prevention mode from the beginning, without a prior Detection period
  • The Front Door Premium was provisioned 10 days ago
  • The custom TLS certificate was imported from Key Vault 5 days ago
  • Internal access uses IPs from the 10.0.0.0/8 range that are on an exclusion list configured in the policy
  • The network team updated the Front Door origins last week

What is the root cause of the observed problem?

A. The custom TLS certificate imported from Key Vault 5 days ago is causing TLS handshake failures for external clients, resulting in 403 responses.

B. The WAF Policy was deployed directly in Prevention mode without a validation period in Detection, and requests from external clients containing patterns detected as XSS are being blocked, including potentially false positives not previously identified.

C. The Front Door origins update last week caused a desynchronization between the WAF Policy and the new backends, causing external requests to be rejected.

D. The exclusion list configured for internal IPs 10.0.0.0/8 is applying inverted logic and blocking all IPs outside this range, including legitimate external customers.


Scenario 2: Diagnostic Sequence

An administrator receives a ticket: a WAF Policy associated with an Application Gateway v2 is not blocking known malicious requests, even though it's in Prevention mode. The configured ruleset is OWASP CRS 3.2. The administrator needs to diagnose why the protection isn't working.

The available steps for investigation are:

  • (P) Check in the ApplicationGatewayFirewallLog logs if malicious requests appear as Detected or Blocked or if they don't appear in the logs
  • (Q) Confirm that the WAF Policy is in Enabled state and associated with the correct Application Gateway listener
  • (R) Verify if the WAF Policy mode is set to Prevention and not to Detection
  • (S) Confirm that the Application Gateway diagnostic settings include the ApplicationGatewayFirewallLog category sending to Log Analytics
  • (T) Send a test request containing a known malicious payload, such as a simulated Nikto scanner request, and observe the behavior

What is the correct investigation sequence?

A. T → P → Q → R → S

B. Q → R → S → T → P

C. S → Q → T → P → R

D. R → Q → P → T → S


Scenario 3: Root Cause

A company's security team migrated the WAF configuration of an Application Gateway from the Classic policy version to a standalone WAF Policy Resource to gain management flexibility. After the migration, the administrator confirms that the new WAF Policy Resource is in Prevention mode with the OWASP CRS 3.2 ruleset configured correctly.

Two days after the migration, the development team reports that the custom rules that blocked a competitor's IP range stopped working. Requests originating from the range 198.51.100.0/24 that were previously blocked now reach the application normally.

The administrator checks the new WAF Policy Resource and confirms:

WAF Policy: waf-policy-prod
Mode: Prevention
Ruleset: OWASP_CRS 3.2
State: Enabled
Association: Application Gateway - appgw-prod (HTTP Listener: listener-https)

Custom Rules:
(no custom rule configured)

Additional information:

  • The WAF Policy Resource was created from scratch during migration
  • The Application Gateway is on the WAF_v2 SKU
  • The previous Classic policy was disassociated and is marked for deletion
  • The infrastructure team updated the resource billing tags last week

What is the root cause of the observed problem?

A. The billing tags update last week caused a reset in the WAF Policy Resource security settings, automatically removing the custom rules.

B. The Application Gateway WAF_v2 SKU doesn't support custom rules in WAF Policy Resources and requires IP blocking rules to be configured as network rules in the subnet NSG.

C. The custom rules that existed in the Classic policy were not recreated in the new WAF Policy Resource during migration, resulting in the absence of IP blocking rules.

D. The WAF Policy Resource only applies custom rules when the OWASP ruleset is disabled, and the presence of active CRS 3.2 is suppressing custom rules evaluation.


Scenario 4: Action Decision

The security team identified that the WAF Policy associated with an Azure Front Door for a healthcare application has been in Detection mode for four months. During this period, the logs accumulated sufficient data and the analysis shows there are three Microsoft_DefaultRuleSet_2.1 ruleset rules that generate recurring false positives for legitimate application endpoints:

  • Rule 942200 triggering on the /api/patient/search endpoint due to medical search parameters
  • Rule 931130 triggering on the /api/documents/upload endpoint due to filenames with specific extensions
  • Rule 920350 triggering on all endpoints due to the Host header format used by the mobile client

The application processes healthcare data and compliance requirements demand Prevention mode activation. The security team has full access to the WAF Policy.

What is the correct action to take at this time?

A. Activate Prevention mode immediately, as four months of Detection data is sufficient to confirm the environment is ready, and false positives can be handled with exclusions after activation.

B. Create specific rule exclusions for each of the three rules on the exact endpoints and parameters where false positives occur, validate in Detection mode logs that false positives have been eliminated, and only then activate Prevention mode.

C. Disable the three rules identified as sources of false positives in the WAF Policy before activating Prevention mode, ensuring no legitimate blocking occurs after the transition.

D. Replace the Microsoft_DefaultRuleSet_2.1 ruleset with DRS 1.0 ruleset which has fewer rules and lower chance of generating false positives before activating Prevention.


Answer Key and Explanations​

Answer Key: Scenario 1

Answer: B

The log confirms that blocking is occurring through rule XSS-942110 of the Microsoft_DefaultRuleSet_2.1 ruleset, with action: Block and policyMode: Prevention. The root cause is that the WAF Policy was placed directly in Prevention mode without any Detection mode operation period, meaning false positives were never identified and addressed before blocking was activated. Requests from external clients containing patterns interpreted as XSS, including potentially legitimate content, are being blocked.
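The fields cited above can also be extracted programmatically when many entries need review. The following is a minimal Python sketch, assuming each entry is available as a JSON string shaped like the sample in Scenario 1. The function name summarize_waf_entry and the output keys are hypothetical and not part of any Azure SDK.

```python
import json

# Sample entry copied from Scenario 1 (trackingReference is truncated in the source).
SAMPLE = """
{
  "category": "FrontdoorWebApplicationFirewallLog",
  "properties": {
    "clientIP": "189.28.14.5",
    "requestUri": "/api/v2/search?q=<script>alert(1)</script>",
    "action": "Block",
    "ruleName": "Microsoft_DefaultRuleSet-2.1-XSS-942110",
    "policyMode": "Prevention",
    "trackingReference": "0H3MK2...",
    "host": "app.contoso.com"
  }
}
"""


def summarize_waf_entry(raw: str) -> dict:
    """Return the fields that show what the WAF did and which rule triggered it."""
    props = json.loads(raw)["properties"]
    return {
        # A request is actually blocked only when the action is Block
        # and the policy runs in Prevention mode.
        "blocked_by_waf": props["action"] == "Block"
        and props["policyMode"] == "Prevention",
        "rule": props["ruleName"],
        "client_ip": props["clientIP"],
        "uri": props["requestUri"],
    }


summary = summarize_waf_entry(SAMPLE)
print(summary["blocked_by_waf"], summary["rule"])
```

Grouping such summaries by rule and URI is one way to see whether the 403 responses cluster around a small set of rules, which is the kind of evidence a Detection period would have produced before blocking started.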

The central confirmatory clue is in the statement: "The WAF Policy was created in Prevention mode from the beginning, without a prior Detection period." This is the direct indicator that the correct validation process was skipped.

Irrelevant information: The TLS certificate imported from Key Vault affects the transport layer, not WAF payload inspection. The Front Door origins update affects backend routing, not WAF Policy rule application. Both details were inserted to divert diagnosis.

The most dangerous distractor is D, which leads the administrator to investigate and potentially modify the internal IP exclusion list, a sensitive security configuration, when the actual problem is the absence of prior rule validation in Detection mode.


Answer Key: Scenario 2

Answer: B

The correct sequence is Q → R → S → T → P:

  1. Confirming that the WAF Policy is enabled and associated with the correct listener (Q) is the absolute prerequisite. If the policy isn't associated, no inspection occurs and all other steps are irrelevant.
  2. Verifying that the mode is Prevention and not Detection (R) comes second: in Detection mode the expected behavior is to log without blocking, which would explain the symptom without indicating a defect.
  3. Confirming that firewall logs are being sent to Log Analytics (S) must happen before any test, because without it the result cannot be observed in the logs.
  4. Executing a controlled test with a known payload (T) generates observable evidence.
  5. Analyzing the resulting logs (P) closes the diagnosis based on concrete evidence.
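The ordering above can be sketched as a list of checks that stops at the first failure. This is an illustrative Python model only: the env dictionary and all of its keys are hypothetical stand-ins for values an administrator would read from the portal or the logs, not calls to an Azure API.

```python
from typing import Callable, List, Optional, Tuple

# Each step is (label, description, check). A check returns True when the step passes.
Step = Tuple[str, str, Callable[[dict], bool]]

STEPS: List[Step] = [
    ("Q", "policy enabled and associated with the correct listener",
     lambda env: env["policy_enabled"]
     and env["associated_listener"] == env["expected_listener"]),
    ("R", "policy mode is Prevention",
     lambda env: env["mode"] == "Prevention"),
    ("S", "firewall log category is sent to Log Analytics",
     lambda env: env["firewall_log_enabled"]),
    ("T", "test request with a known payload was sent",
     lambda env: env["test_request_sent"]),
    ("P", "test request appears as Blocked in the logs",
     lambda env: env["logged_action"] == "Blocked"),
]


def first_failing_step(env: dict) -> Optional[str]:
    """Run the checks in order Q, R, S, T, P and return the first label that fails."""
    for label, _description, check in STEPS:
        if not check(env):
            return label
    return None


# Hypothetical environment: everything is in place except the mode.
env = {
    "policy_enabled": True,
    "associated_listener": "listener-https",
    "expected_listener": "listener-https",
    "mode": "Detection",
    "firewall_log_enabled": True,
    "test_request_sent": True,
    "logged_action": "Detected",
}
print(first_failing_step(env))
```

Stopping at the first failing check mirrors the reasoning in the list: a later step is only meaningful once every earlier prerequisite holds.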

Alternative A starts with the test before validating that the policy is associated and that logging works, so the test may produce no observable result. Alternative C starts with the diagnostic settings before the association check and leaves the mode check (R) for last. Alternative D checks the mode before the association, inverting the priority of the two prerequisites, and analyzes logs (P) before confirming that logging is configured (S).


Answer Key: Scenario 3

Answer: C

The custom rules section of the new WAF Policy Resource is empty: (no custom rule configured). When a WAF Policy Resource is created from scratch during migration, it contains only what was explicitly configured in it. Custom rules from the previous Classic policy are not automatically migrated to the new WAF Policy Resource. The administrator would need to recreate them manually or via script. Since the IP blocking custom rules weren't recreated, traffic from the 198.51.100.0/24 range doesn't encounter any rule that blocks it and passes normally.
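The effect of the missing rules can be illustrated with a small Python sketch that uses only the standard ipaddress module. It is a simplified model and not the Azure custom rule schema: the rule dictionaries, the rule name BlockCompetitorRange, and the function is_blocked are hypothetical.

```python
import ipaddress


def is_blocked(client_ip: str, custom_rules: list) -> bool:
    """Return True if any Block rule has a CIDR range that contains client_ip."""
    ip = ipaddress.ip_address(client_ip)
    for rule in custom_rules:
        if rule["action"] != "Block":
            continue
        if any(ip in ipaddress.ip_network(cidr) for cidr in rule["ranges"]):
            return True
    return False


# Hypothetical reconstruction of the IP block rule from the Classic policy.
classic_rules = [
    {"name": "BlockCompetitorRange", "action": "Block", "ranges": ["198.51.100.0/24"]},
]
# The new WAF Policy Resource has no custom rules, as shown in the scenario.
new_rules = []

print(is_blocked("198.51.100.37", classic_rules))
print(is_blocked("198.51.100.37", new_rules))
```

With an empty rule list no address can match, which is why the same request that was blocked before the migration now reaches the application.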

The confirmatory clue is in two combined elements: the Custom Rules: (no custom rule configured) field in the new policy and the information that it "was created from scratch during migration."

Irrelevant information: Billing tags update doesn't affect functional WAF settings. Tags are organizational and billing metadata with no impact on policy behavior.

The most dangerous distractor is B, as it leads the administrator to erroneously conclude that custom rules aren't supported in WAF_v2 SKU with WAF Policy Resources, when in reality they are fully supported. Acting based on this conclusion would lead to a search for unnecessary alternative solutions, like NSG rules, wasting time while the correct solution is simply recreating the custom rules in the new policy.


Answer Key: Scenario 4

Answer: B

The correct action is to create specific and surgical rule exclusions for each identified false positive, validate that false positives have been eliminated in logs still in Detection mode, and only then activate Prevention mode. This approach uses Detection mode as a validation environment for exclusions before activating blocking, ensuring that the transition to Prevention doesn't cause legitimate request blocking in a critical healthcare application.

Alternative A commits the same error as Scenario 1: activating Prevention before addressing known false positives, with the aggravating factor that here false positives have already been identified and documented, making it even more unjustifiable to ignore them. Alternative C is disproportionate: disabling entire rules removes protection against real threats to resolve false positives that occur only on specific endpoints and parameters. Surgical rule exclusion is always preferable to rule deactivation. Alternative D introduces security regression risk by replacing a more complete ruleset with a simpler one without analyzing the impact on application security posture.

The central principle here is corrective action proportionality: surgical exclusions preserve general protection while specifically eliminating documented false positives.
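The difference between scoped exclusions and disabling rules can be sketched in Python. This is a simplified model under stated assumptions: the Hit records are hypothetical Detection-mode log entries, the /api/login attack is invented for illustration, and exclusions are modeled as (rule ID, endpoint) pairs, which is not the actual Azure exclusion schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    rule_id: str
    endpoint: str
    legitimate: bool  # True when the analysis classified the hit as a false positive


# Rule IDs and the first two endpoints come from the scenario; the last hit is hypothetical.
hits = [
    Hit("942200", "/api/patient/search", True),
    Hit("931130", "/api/documents/upload", True),
    Hit("920350", "/api/patient/search", True),
    Hit("942200", "/api/login", False),  # hypothetical real attack on another endpoint
]


def remaining_after_exclusions(hits, exclusions):
    """Keep hits not covered by a (rule_id, endpoint) exclusion; None matches any endpoint."""
    def excluded(h):
        return (h.rule_id, h.endpoint) in exclusions or (h.rule_id, None) in exclusions
    return [h for h in hits if not excluded(h)]


def remaining_after_disabling(hits, disabled_rules):
    """Keep only hits whose rule is still enabled."""
    return [h for h in hits if h.rule_id not in disabled_rules]


exclusions = {
    ("942200", "/api/patient/search"),
    ("931130", "/api/documents/upload"),
    ("920350", None),  # the Host header false positive occurs on all endpoints
}
after_exclusions = remaining_after_exclusions(hits, exclusions)
after_disabling = remaining_after_disabling(hits, {"942200", "931130", "920350"})

print(len(after_exclusions), len(after_disabling))
```

In this model the scoped exclusions leave the hypothetical attack on /api/login visible to rule 942200, while disabling the three rules removes it along with the false positives.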


Troubleshooting Tree: Implement a WAF Policy​


Color legend:

  • Dark blue: Initial symptom (root)
  • Blue: Diagnostic question
  • Red: Identified cause
  • Green: Recommended action or resolution
  • Orange: Intermediate validation or verification

To use the tree when facing a real problem, start at the root node and answer each question based on what you observe in the environment. The first two steps are always to verify if the WAF Policy is enabled and if it's associated with the correct resource, as without these conditions no protection occurs. Next, confirm the mode: Detection logs but doesn't block, and this is the expected behavior. With Prevention mode confirmed, divide diagnosis according to the symptom: legitimate traffic blocking points to false positives requiring surgical exclusions; absence of threat blocking points to missing custom rules or unconfigured logs. Each path ends in a proportional and precise action.
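The walk through the tree described above can be approximated as a single function. This Python sketch is an assumption-based reading of the paragraph, not a reproduction of the diagram: the function name next_action, the symptom strings, and the wording of the returned actions are hypothetical.

```python
def next_action(enabled: bool, associated: bool, mode: str, symptom: str) -> str:
    """Follow the tree from the root and return the recommended action."""
    # The first two checks come before anything else: without them no protection occurs.
    if not enabled:
        return "Enable the WAF Policy"
    if not associated:
        return "Associate the WAF Policy with the correct resource"
    # Detection logs but does not block, so missing blocks are expected here.
    if mode == "Detection":
        return "Expected behavior: Detection logs but does not block"
    # With Prevention confirmed, branch on the observed symptom.
    if symptom == "legitimate_traffic_blocked":
        return "Investigate false positives and add surgical exclusions"
    if symptom == "threats_not_blocked":
        return "Check for missing custom rules or unconfigured logs"
    return "Symptom not covered by this tree"


print(next_action(True, True, "Prevention", "legitimate_traffic_blocked"))
```

Each scenario in this lab maps to one branch; for example, Scenario 3 corresponds to calling the function with Prevention mode and the threats_not_blocked symptom.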