Troubleshooting Lab: Design a WAF Deployment
Diagnostic Scenarios
Scenario 1 - Root Cause
A security team configured Azure Web Application Firewall (WAF) in Prevention mode on an Azure Application Gateway v2 to protect an internal REST API. The environment uses managed rules from the OWASP CRS 3.2 rule set without customizations. The Application Gateway is in a dedicated subnet with an associated NSG.
After WAF activation, developers report that legitimate requests from an integration client are being blocked. The operations team collects the following log:
{
  "resourceId": "/SUBSCRIPTIONS/.../APPLICATIONGATEWAYS/APIGW-PROD",
  "operationName": "ApplicationGatewayFirewall",
  "category": "ApplicationGatewayFirewallLog",
  "properties": {
    "instanceId": "appgw-prod_1",
    "clientIp": "10.50.1.22",
    "requestUri": "/api/v2/payload",
    "ruleSetType": "OWASP",
    "ruleSetVersion": "3.2",
    "ruleId": "942100",
    "message": "SQL Injection Attack Detected via libinjection",
    "action": "Blocked",
    "site": "Global",
    "details": {
      "matchVariableName": "args",
      "matchVariableValue": "SELECT=value&filter=active"
    }
  }
}
The team verifies that the client sends query string parameters with names like SELECT and filter, which are common terms in their API. The Application Gateway subnet NSG allows traffic on ports 80 and 443 from any source. The Application Gateway SKU is WAF_v2. The WAF was activated 48 hours ago and no exclusion rules have been configured.
What is the root cause of the problem?
A) The NSG associated with the Application Gateway subnet is blocking return traffic from the backend API, causing false positives in the WAF log.
B) The OWASP CRS 3.2 rule set is interpreting the API query string parameter names as SQL injection patterns, and no exclusion has been configured to exempt these specific fields from inspection.
C) WAF in Prevention mode with WAF_v2 SKU does not support query string inspection in requests to REST APIs, requiring downgrade to Standard_v2 SKU.
D) RuleId 942100 was incorrectly activated during WAF configuration and needs to be removed from the managed rule set to restore normal operation.
Scenario 2 - Side Impact
An organization operates an Azure Front Door Premium with WAF enabled in Detection mode to monitor a public web application. After a security analysis, the operations team decides to migrate the WAF to Prevention mode with the goal of blocking active attacks identified in the logs.
The change is successfully applied and confirmed in the portal. In the first 30 minutes after the change, the helpdesk receives complaints from legitimate users reporting HTTP 403 errors on specific application functionalities, including the advanced search module and document upload form.
The team confirms that no changes were made to managed rules or custom rules during the transition. The application was not modified.
What side effect did this mode change cause?
A) The change to Prevention mode automatically disabled WAF diagnostic logs, preventing visibility into active blocks.
B) Requests that previously only generated alerts in Detection mode began being blocked in Prevention mode, exposing preexisting false positives that had never been addressed.
C) Azure Front Door Premium reprocessed the cache of all previous requests when applying the new mode, discarding active sessions and forcing user re-authentication.
D) Enabling Prevention mode in WAF associated with Azure Front Door Premium requires profile restart, which caused temporary unavailability during configuration propagation.
Scenario 3 - Root Cause
An architect deployed a global WAF policy on Azure Front Door Premium to protect multiple origins distributed across different regions. The policy is in Prevention mode with managed rules from the Microsoft_DefaultRuleSet 2.1 rule set and two custom geolocation blocking rules.
After deployment, the QA team reports that requests originating from certain countries that should be blocked by geolocation rules continue reaching the origins without blocking. The architect verifies the configuration and observes the following:
Custom Rules:
  Rule Name: BlockGeoRegion
  Priority: 200
  Action: Block
  Match Condition: RemoteAddr GeoMatch [CN, RU, KP]
Managed Rules:
  RuleSet: Microsoft_DefaultRuleSet 2.1
  Status: Enabled
WAF Policy Association:
  Profile: FrontDoor-Premium-Prod
  Domain: app.contoso.com (Enabled)
  Domain: api.contoso.com (Not associated)
The architect also mentions that the policy was created in the East US region and that Front Door Premium uses global points of presence. The monitoring team confirms that WAF logs show requests blocked correctly for app.contoso.com, but there are no blocking records for api.contoso.com.
What is the root cause of the problem?
A) WAF policies created in the East US region are not automatically replicated to points of presence outside North America, requiring creation of separate regional policies.
B) The Microsoft_DefaultRuleSet 2.1 rule set overrides custom geolocation rules with priority above 100, requiring reconfiguration of the BlockGeoRegion rule priority.
C) The api.contoso.com domain is not associated with the WAF policy, so traffic destined for this endpoint is not inspected or filtered by the configured rules.
D) Custom geolocation rules in Azure Front Door Premium require the WAF to be associated with a specific routing endpoint, not a global profile, for blocking to work per domain.
Scenario 4 - Action Decision
The security team identified that a specific managed rule from the OWASP CRS 3.2 rule set, ruleId 931130, is generating false positives on a critical file upload endpoint of a production application on Application Gateway WAF v2. The WAF is in Prevention mode and the affected endpoint processes legal contracts in PDF format sent by external partners.
The cause has been confirmed: the form field name used by the partners' legacy system contains a pattern that triggers rule 931130 (Remote File Inclusion). Modifying the partners' legacy system is not viable in the short term. The environment is in production and any interruption has direct contractual impact.
What is the correct action to take at this time?
A) Globally disable ruleId 931130 in the WAF policy to eliminate false positives immediately, keeping all other rules active.
B) Configure a WAF rule exclusion that excludes the specific form field from ruleId 931130 evaluation, preserving rule protection for all other fields and endpoints.
C) Migrate the WAF to Detection mode temporarily, log all partner requests for analysis, and return to Prevention mode after confirming the exact false positive pattern.
D) Create a custom rule with Allow action and higher priority that bypasses managed rule evaluation for partner traffic, completely freeing the upload endpoint for partner IPs.
Answer Key and Explanations
Answer Key - Scenario 1
Answer: B
The root cause is that the API query string parameter names, specifically SELECT and filter, match patterns that the OWASP CRS 3.2 detection engine associates with SQL injection attempts. RuleId 942100 uses the libinjection library to detect SQL injection patterns in request variables, including query string arguments. Since the API uses naming conventions that coincide with SQL reserved words, the WAF interprets the requests as malicious.
The definitive clue in the scenario is the combination of three elements: ruleId 942100 with the message "SQL Injection Attack Detected via libinjection", the match variable args, and the value SELECT=value&filter=active. The log shows precisely that it's the parameter names, not the values, that trigger the rule.
The irrelevant information is the NSG configuration. The log confirms that the request reached the WAF and was processed, which rules out any network blocking hypothesis before the WAF. The NSG has no relation to the false positive.
The main reasoning error in the distractors is steering the diagnosis toward infrastructure (the NSG in A) or non-existent product limitations (C) instead of the direct log evidence. Alternative D is the most dangerous distractor: removing ruleId 942100 from the managed rule set would drop its libinjection-based SQL injection detection for the entire application, when the correct solution is a surgical exclusion limited to the affected fields and endpoints.
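The log-reading step described above can be sketched in Python. This is a minimal illustration, not a production parser: the log entry is copied from Scenario 1 (including its elided `resourceId`), and the function name `summarize_firewall_entry` is a hypothetical helper introduced only for this example.

```python
import json
from urllib.parse import parse_qsl

# Firewall log entry copied from Scenario 1 (resourceId elided as in the source).
LOG_ENTRY = """
{
  "resourceId": "/SUBSCRIPTIONS/.../APPLICATIONGATEWAYS/APIGW-PROD",
  "operationName": "ApplicationGatewayFirewall",
  "category": "ApplicationGatewayFirewallLog",
  "properties": {
    "instanceId": "appgw-prod_1",
    "clientIp": "10.50.1.22",
    "requestUri": "/api/v2/payload",
    "ruleSetType": "OWASP",
    "ruleSetVersion": "3.2",
    "ruleId": "942100",
    "message": "SQL Injection Attack Detected via libinjection",
    "action": "Blocked",
    "site": "Global",
    "details": {
      "matchVariableName": "args",
      "matchVariableValue": "SELECT=value&filter=active"
    }
  }
}
"""


def summarize_firewall_entry(raw: str) -> dict:
    """Extract the fields that point to the root cause (hypothetical helper)."""
    props = json.loads(raw)["properties"]
    details = props["details"]
    # Split the matched query string into (name, value) pairs and keep the names.
    arg_names = [name for name, _ in parse_qsl(details["matchVariableValue"])]
    return {
        "ruleId": props["ruleId"],
        "action": props["action"],
        "requestUri": props["requestUri"],
        "matchVariableName": details["matchVariableName"],
        "argNames": arg_names,
    }


summary = summarize_firewall_entry(LOG_ENTRY)
print(summary)
```

Reading `ruleId`, `action`, and the argument names side by side is what separates a rule-tuning problem from a network problem: the entry exists only because the request already reached the WAF.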
Answer Key - Scenario 2
Answer: B
The actual side effect is the exposure of false positives that existed before the mode change but had never been addressed. In Detection mode, the WAF evaluates all requests and logs those that match rules, but blocks none. In Prevention mode, the same rules actively block. Any false positive left untreated in Detection mode becomes a real block immediately after the transition.
The scenario confirms this by stating that no rules were changed and the application was not modified. The 403 errors in the advanced search and document upload functionalities are typical of endpoints that commonly generate false positives in SQL injection and file upload rules, respectively.
The most dangerous distractor is D, because it describes behavior that doesn't exist: changing mode in Azure Front Door Premium WAF doesn't cause profile restart or unavailability. Acting on D would lead the team to open a support ticket about non-existent behavior, wasting time while users remain blocked.
The practical consequence of this scenario is that the transition from Detection to Prevention should always be preceded by a period of analysis and treatment of false positives identified in Detection logs.
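A pre-transition review of this kind can be sketched as follows. The records below are hypothetical and deliberately simplified (real WAF log schemas differ by service), the rule identifiers are placeholders rather than real rule IDs, and the `confirmedLegitimate` flag stands in for a manual triage decision.

```python
from collections import Counter

# Hypothetical, simplified Detection-mode records. The rule identifiers are
# placeholders and "confirmedLegitimate" represents a human triage result.
DETECTION_RECORDS = [
    {"requestUri": "/search/advanced", "ruleId": "sqli-rule", "confirmedLegitimate": True},
    {"requestUri": "/search/advanced", "ruleId": "sqli-rule", "confirmedLegitimate": True},
    {"requestUri": "/documents/upload", "ruleId": "upload-rule", "confirmedLegitimate": True},
    {"requestUri": "/login", "ruleId": "sqli-rule", "confirmedLegitimate": False},
]


def matches_per_endpoint(records):
    """Count rule matches per (requestUri, ruleId); each would be a block in Prevention mode."""
    return Counter((r["requestUri"], r["ruleId"]) for r in records)


def untreated_false_positives(records):
    """Return the (requestUri, ruleId) pairs that matched on traffic triaged as legitimate."""
    return sorted(
        {(r["requestUri"], r["ruleId"]) for r in records if r["confirmedLegitimate"]}
    )


pending = untreated_false_positives(DETECTION_RECORDS)
print(pending)
```

If the list of pending pairs is not empty, switching to Prevention mode would turn each of them into a 403 for legitimate users.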
Answer Key - Scenario 3
Answer: C
The root cause is that the api.contoso.com domain is not associated with the WAF policy. In Azure Front Door Premium, a WAF policy must be explicitly associated with each domain or endpoint you want to protect. Association is not automatic for all profile domains. The configuration output in the scenario clearly shows: api.contoso.com (Not associated).
The definitive clue is the absence of blocking records in the WAF for api.contoso.com, while blocks for app.contoso.com work correctly. This eliminates any hypothesis of problems in the rule itself or in the managed rule set, since both domains would use the same policy if it were associated.
The irrelevant information is the mention of the East US region where the policy was created. WAF policies associated with Azure Front Door are global by nature and have no regional scope. The creation region doesn't affect policy coverage or propagation.
The most dangerous distractor is A, because the premise that policies created in a specific region don't propagate globally is false for Front Door, but plausible for those who confuse it with regional services. Acting on A would lead to unnecessary creation of duplicate policies, increasing operational complexity without solving the real problem.
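The association check can be expressed as a small Python sketch. The dictionary simply transcribes the association state from the scenario's configuration output; it is a hypothetical data model for illustration, not the output of any Azure API.

```python
# Association state transcribed from the Scenario 3 configuration output.
# True means the domain is associated with the WAF policy.
POLICY_ASSOCIATIONS = {
    "app.contoso.com": True,
    "api.contoso.com": False,
}


def unprotected_domains(associations: dict) -> list:
    """Return the domains whose traffic the WAF policy never evaluates."""
    return sorted(
        domain for domain, associated in associations.items() if not associated
    )


gaps = unprotected_domains(POLICY_ASSOCIATIONS)
print(gaps)
```

A domain that appears in this list produces no WAF log records at all, which matches the symptom reported for api.contoso.com.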
Answer Key - Scenario 4
Answer: B
The correct action is to configure a WAF rule exclusion that removes the specific form field from ruleId 931130 evaluation. The WAF exclusion mechanism allows precise definition of which request variable (field name, header, cookie) should be excluded from evaluation by a specific rule or rule set, without affecting protection for other requests.
The deciding constraint is that the environment is in active production with contractual impact. Alternative A globally disables ruleId 931130, which removes that rule's Remote File Inclusion protection from the entire application, not just the problematic field. Alternative C switches to Detection mode temporarily, which is counterproductive: the cause has already been confirmed, so additional diagnosis adds no value, and the mode change leaves the entire application unprotected in the meantime.
Alternative D is technically plausible and frequently used, but it carries a specific risk in this scenario: an IP-based Allow rule for all external partners completely frees the upload endpoint for those IPs, bypassing protection against every other threat, not only ruleId 931130. A rule exclusion is surgical and keeps all other protections active for the same traffic.
A well-configured exclusion specifies: the exact form field, the match operator, and scope restricted to ruleId 931130, which preserves the principle of least privilege in WAF configuration.
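The three elements of a well-scoped exclusion can be modeled in a short Python sketch. This is an illustrative model of the matching logic only, not Azure's implementation or API: the class `RuleExclusion`, its `covers` method, and the field name `partner_doc_url` are all hypothetical, and the strings `RequestArgNames`, `Equals`, and `StartsWith` are used here as plain labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleExclusion:
    """Hypothetical model of a per-rule exclusion (not an Azure SDK type)."""
    rule_id: str          # scope: the single rule the exclusion applies to
    match_variable: str   # which part of the request, e.g. "RequestArgNames"
    operator: str         # "Equals" or "StartsWith" in this sketch
    selector: str         # the field name (or prefix) to exclude

    def covers(self, rule_id: str, match_variable: str, field_name: str) -> bool:
        """Return True when this field should be skipped for this rule."""
        if rule_id != self.rule_id or match_variable != self.match_variable:
            return False
        if self.operator == "Equals":
            return field_name == self.selector
        if self.operator == "StartsWith":
            return field_name.startswith(self.selector)
        raise ValueError(f"unsupported operator in this sketch: {self.operator}")


# "partner_doc_url" is a made-up field name; the scenario does not state the real one.
exclusion = RuleExclusion(
    rule_id="931130",
    match_variable="RequestArgNames",
    operator="Equals",
    selector="partner_doc_url",
)

print(exclusion.covers("931130", "RequestArgNames", "partner_doc_url"))
print(exclusion.covers("942100", "RequestArgNames", "partner_doc_url"))
print(exclusion.covers("931130", "RequestArgNames", "comment"))
```

Only the first call is covered by the exclusion. The same field is still evaluated by every other rule, and every other field is still evaluated by ruleId 931130, which is the behavior that distinguishes an exclusion from disabling the rule or allowing traffic by IP.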
Troubleshooting Tree: Design a WAF Deployment
Color Legend:
| Color | Node Type |
|---|---|
| Dark Blue | Initial symptom (tree root) |
| Medium Blue | Diagnostic question |
| Red | Identified cause |
| Green | Recommended action or resolution |
| Orange | Intermediate validation or verification |
To use this tree when facing a real problem, start at the root node by identifying whether the WAF is blocking legitimate traffic or failing to inspect expected traffic. The first bifurcation checks for log records, which immediately separates policy association problems from rule configuration problems. Follow the questions answering only what can be verified directly in the portal, diagnostic logs, or policy configuration, without assuming the cause. Each branch eliminates a class of hypothesis until you reach a specific cause with the corresponding action.
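The first branch described above can be sketched as a small Python function. It is a simplified assumption about the top of the tree only, based on the four scenarios in this lab; the function name `first_branch` and the returned wording are hypothetical and do not reproduce the full diagram.

```python
def first_branch(waf_log_has_records: bool) -> str:
    """Return the next area to investigate based on whether WAF log records exist.

    Simplified sketch of the tree's first branch: the absence of records points
    to policy association, and their presence points to rule configuration.
    """
    if not waf_log_has_records:
        # The WAF never evaluated the traffic (as with api.contoso.com in Scenario 3).
        return "Check the WAF policy association for the affected domain or endpoint"
    # The WAF evaluated the traffic; inspect the matching rule and field.
    return "Check rule configuration: ruleId, match variable, and exclusions"


print(first_branch(waf_log_has_records=False))
print(first_branch(waf_log_has_records=True))
```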