Troubleshooting Lab: Configure Detection or Prevention Mode
Diagnostic Scenarios
Scenario 1: Root Cause
A company deployed an Application Gateway with WAF v2 in front of an e-commerce web application. The security team configured the WAF two weeks ago. The development team reports that legitimate customers receive HTTP 403 responses when they submit checkout pages whose forms include fields such as order_id and product_description.
The administrator consults the WAF logs and finds the following entries:
{
"ruleId": "942100",
"ruleGroup": "SQLI",
"message": "SQL Injection Attack Detected via libinjection",
"action": "Blocked",
"matchedData": "product_description=SELECT",
"requestUri": "/checkout/submit",
"policyMode": "Prevention"
}
Additional information collected:
- The configured ruleset is OWASP CRS 3.2
- The Application Gateway was provisioned 45 days ago
- The infrastructure team updated the Application Gateway TLS certificate last week
- Affected customers are in different geographic regions
- The product_description field accepts free text where users can type product names
What is the root cause of the observed problem?
A. The TLS certificate was recently renewed and the WAF is in a restart state, causing incorrect blocks of legitimate requests.
B. The WAF is in Prevention mode and an OWASP rule is generating a false positive by interpreting legitimate content in the product_description field as SQL injection.
C. The OWASP CRS 3.2 ruleset is incompatible with e-commerce applications and should be replaced with CRS 3.1 to avoid blocks on form fields.
D. Customers are in different geographic regions and the WAF is applying automatic geo-blocking based on location for unauthorized regions.
Scenario 2: Collateral Impact
A company's security team decides to migrate the Application Gateway WAF from Detection mode to Prevention mode in the production environment. The migration is executed immediately after the decision, without any additional preparatory steps. The immediate result is that the WAF starts actively blocking requests that it previously only logged.
What secondary consequence can this action cause?
A. The Application Gateway starts consuming significantly more CPU after changing to Prevention, potentially causing performance degradation for all requests, regardless of whether they are blocked or not.
B. False positives that existed silently in Detection mode, and were never reviewed and adjusted, start blocking legitimate requests from real users in production.
C. The WAF OWASP rules are automatically disabled during the transition between modes, creating an exposure window without protection while the new mode is synchronized.
D. Prevention mode automatically disables WAF diagnostic logs, preventing the security team from viewing the blocks that are occurring after migration.
Scenario 3: Root Cause
A security team configured a WAF in Detection mode on the Application Gateway to monitor traffic from an internal REST API before activating blocking. After a week of operation, the team reviews the logs expecting to find alerts of real threats. The administrator executes the following query in Log Analytics:
AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS"
| where Category == "ApplicationGatewayFirewallLog"
| where TimeGenerated > ago(7d)
| project TimeGenerated, hostname_s, requestUri_s, action_s, ruleId_s, Message
| order by TimeGenerated desc
The returned result is:
No results found for the specified time range.
The administrator confirms that the Application Gateway is receiving real traffic, as access logs show hundreds of requests per hour. The WAF Policy is in Enabled state and associated with the Application Gateway. The Log Analytics workspace was created two months ago.
Additional information:
- The Application Gateway is on the WAF_v2 SKU
- The resource region is Brazil South
- The networking team updated the Application Gateway subnet NSG rules three days ago
What is the root cause of the problem?
A. The Application Gateway Standard_v2 SKU does not support WAF and firewall logs will never be generated regardless of configuration.
B. The Application Gateway diagnostic settings do not send the ApplicationGatewayFirewallLog category to the Log Analytics Workspace, so WAF logs never reach it.
C. The Application Gateway subnet NSG rules were recently updated and are blocking communication between the Application Gateway and the Log Analytics Workspace.
D. The Log Analytics Workspace was created two months ago and reached the default retention limit, automatically discarding logs from the last 7 days.
Scenario 4: Action Decision
The security team identified that the WAF in Detection mode is generating a very high volume of false positives in a financial management web application. After reviewing the logs, the analyst identifies that rule 942440 from the SQLI group is triggering on legitimate fields that contain mathematical expressions like total_value = base + interest. The application is critical and processes real-time transactions 24 hours a day.
The cause is identified: rule 942440 generates false positives for the application's legitimate data patterns. The additional context is:
- The WAF is still in Detection mode and has never been placed in Prevention
- The business team urgently needs Prevention mode activated to meet compliance requirements
- The application has no equivalent staging environment for WAF testing
- The security team has full access to the WAF Policy in the Azure portal
What is the correct action to take at this moment?
A. Immediately activate Prevention mode on the WAF Policy, since the team has already identified the false positive and can correct the rule after activation if necessary.
B. Create a rule exclusion in the WAF Policy for rule 942440 applied to the specific field that generates the false positive, validate in Detection mode logs that the false positive has been eliminated, and only then migrate to Prevention mode.
C. Disable the complete SQLI rule group in the WAF Policy to eliminate all SQL-related false positives before activating Prevention mode.
D. Replace the current OWASP CRS ruleset with an earlier version that does not contain rule 942440, ensuring the application is not affected before activating Prevention.
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
The log is conclusive: the recorded action is Blocked, the mode is Prevention, the triggered rule is 942100 from the SQLI group, and the matched data is product_description=SELECT. The WAF interpreted legitimate textual content in the product description field as a SQL injection attempt because the text entered by the user contains the word SELECT, common in product names or descriptions in Portuguese and English. This is a classic false positive: the WAF is working correctly according to its configuration, but the rule is being applied to a field that accepts free text with uncontrolled content.
Two details confirm this: the matchedData value in the log, product_description=SELECT, and the fact that the affected customers are legitimate users submitting the checkout form.
Irrelevant information: The TLS certificate update has no relation to the WAF's payload inspection behavior. The certificate is relevant for the transport layer, not for layer 7 content analysis.
The central reasoning error in the distractors is diverting diagnosis to components that were recently changed (TLS certificate) or to false statements about CRS version incompatibility. The most dangerous distractor is A, as it leads the administrator to investigate and potentially change TLS configurations that are completely outside the scope of the problem.
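The log reading described above can be sketched in a few lines of Python. This is only an illustration: it assumes each log entry is a JSON object with the same simplified keys as the sample in the scenario, and that matchedData has the form field=value. Real firewall log records may be shaped differently, and the function name blocked_fields is hypothetical.

```python
import json
from collections import Counter


def blocked_fields(log_lines):
    """Count Blocked entries per (ruleId, field name)."""
    counts = Counter()
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("action") != "Blocked":
            continue
        # Assumption: matchedData is "field=value", as in the scenario's sample.
        field = entry.get("matchedData", "").split("=", 1)[0]
        counts[(entry.get("ruleId"), field)] += 1
    return counts


# Sample entry copied from the scenario (subset of keys).
sample = json.dumps({
    "ruleId": "942100",
    "action": "Blocked",
    "matchedData": "product_description=SELECT",
    "requestUri": "/checkout/submit",
})
for (rule_id, field), hits in blocked_fields([sample, sample]).items():
    print(rule_id, field, hits)
```

A tally that concentrates on one rule and one free-text field, as here, is the typical signature of a false positive rather than of an attack spread across many parameters.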
Answer Key: Scenario 2
Answer: B
When the WAF operates in Detection mode, it logs all requests that would trigger rules but does not block them. This is precisely the purpose of Detection mode: allowing the team to review and adjust rules before activating blocking. If the logs accumulated during the Detection period were never reviewed and false positives were never treated with exclusions or adjustments, all of them become real blocks the moment Prevention mode is activated. Legitimate users who were previously only "monitored" start receiving 403 responses without warning.
Alternative A is false: the CPU consumption difference between modes is not significant enough to cause degradation, as inspection occurs in both modes. Alternative C is false: there is no unprotected window during transition; the WAF continues inspecting in both modes. Alternative D is false: mode change does not disable diagnostic logs, which continue working normally in Prevention.
The impact of alternative B is the most silent and most common in practice: teams that activate Prevention without reviewing Detection logs frequently cause self-inflicted availability incidents.
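A pre-migration review can be approximated with a short script. The sketch below groups Detection-mode hits by rule so that each one can be judged before switching modes. It assumes entries have already been exported as dictionaries with ruleId, action, and requestUri keys, and that Detection-mode hits carry the action value "Detected"; the entries themselves are made up for illustration.

```python
from collections import defaultdict


def hits_to_review(detection_entries):
    """Map each rule that fired in Detection mode to the URIs it fired on."""
    by_rule = defaultdict(set)
    for entry in detection_entries:
        # Assumption: Detection-mode hits are logged with action "Detected".
        if entry["action"] == "Detected":
            by_rule[entry["ruleId"]].add(entry["requestUri"])
    return {rule: sorted(uris) for rule, uris in by_rule.items()}


# Hypothetical entries, for illustration only.
entries = [
    {"ruleId": "942100", "action": "Detected", "requestUri": "/checkout/submit"},
    {"ruleId": "942100", "action": "Detected", "requestUri": "/cart"},
    {"ruleId": "920300", "action": "Detected", "requestUri": "/cart"},
]
review = hits_to_review(entries)
for rule, uris in review.items():
    print(rule, uris)
```

Every rule in this output would start returning 403 responses on the listed URIs once Prevention is active, so each one needs a decision (real threat or false positive) beforehand.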
Answer Key: Scenario 3
Answer: B
The Application Gateway sends diagnostic logs to Log Analytics only when a diagnostic setting is explicitly created, selects each desired log category, and names the workspace as a destination. The fact that the WAF Policy is enabled and associated with the gateway does not mean firewall logs are being forwarded. The ApplicationGatewayFirewallLog category is separate from the access and performance categories and must be selected individually in the diagnostic setting. Without this configuration, WAF logs are silently discarded.
The confirmatory clue is in the contrast between the absence of results in the firewall query and the presence of access logs confirming real traffic. If the gateway receives traffic and generates access logs, the absence of WAF logs points to a diagnostic configuration problem, not a WAF functionality issue.
Irrelevant information: The Brazil South region, the subnet NSG update, and the workspace creation date do not affect diagnostic log generation or forwarding. The Application Gateway subnet NSGs control network traffic, not the diagnostic telemetry sent to Log Analytics.
The most dangerous distractor is C, which leads the administrator to modify NSG rules that control production traffic, potentially interrupting Application Gateway functionality, when the problem is entirely in diagnostic configuration.
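The check behind answer B can be expressed as a small function over an exported diagnostic setting. This is a sketch under assumptions: it expects the setting as a dictionary with a properties object containing workspaceId and a logs list of category/enabled pairs, and it does not handle settings that select logs by category group instead of by individual category. The function name and the sample setting are hypothetical.

```python
FIREWALL_CATEGORY = "ApplicationGatewayFirewallLog"


def firewall_log_forwarded(diagnostic_setting):
    """Return True if the setting sends the firewall log category to a workspace."""
    props = diagnostic_setting.get("properties", {})
    if not props.get("workspaceId"):
        return False  # no Log Analytics destination configured
    for log in props.get("logs", []):
        if log.get("category") == FIREWALL_CATEGORY and log.get("enabled"):
            return True
    return False


# Hypothetical setting that forwards only access logs, as in the scenario.
setting = {
    "properties": {
        "workspaceId": "/subscriptions/example/workspaces/example",
        "logs": [
            {"category": "ApplicationGatewayAccessLog", "enabled": True},
        ],
    }
}
print(firewall_log_forwarded(setting))
```

A setting like this one explains the symptom exactly: access logs arrive in the workspace while the firewall query returns nothing.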
Answer Key: Scenario 4
Answer: B
The correct action is to create a surgical rule exclusion for rule 942440 applied to the specific field that generates the false positive, validate in Detection mode logs that the false positive has been eliminated, and only then migrate to Prevention. This approach eliminates the identified risk precisely, maintains all other protections intact, and uses Detection mode as a validation environment before activating blocking.
Alternative A ignores a fundamental principle: never activate Prevention without first treating known false positives, as the result will be immediate blocking of legitimate financial transactions in a critical application without staging. Alternative C is excessive and dangerous: disabling the entire SQLI group eliminates all SQL injection protection from the application to solve a false positive from a single rule in a single field. Alternative D is impractical and introduces security regression risk by using an earlier ruleset version that may have uncorrected vulnerabilities.
The central error in alternatives A, C, and D is the disproportionality of the action relative to the identified problem. Rule exclusion exists precisely to solve false positives surgically without compromising the overall security posture.
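The validation step in answer B (confirm in Detection mode logs that the false positive is gone before migrating) could be automated roughly as follows. The sketch assumes log entries exported as dictionaries with an ISO 8601 time, a ruleId, and a field key naming the matched parameter; the field name formula and the timestamps are hypothetical.

```python
from datetime import datetime


def false_positive_cleared(entries, rule_id, field, exclusion_applied_at):
    """True when rule_id no longer fires on field after the exclusion went live."""
    for entry in entries:
        logged_at = datetime.fromisoformat(entry["time"])
        if (
            logged_at > exclusion_applied_at
            and entry["ruleId"] == rule_id
            and entry["field"] == field
        ):
            return False
    return True


# Hypothetical data: one hit before the exclusion, none after.
applied = datetime.fromisoformat("2024-05-01T12:00:00+00:00")
entries = [
    {"time": "2024-05-01T09:30:00+00:00", "ruleId": "942440", "field": "formula"},
    {"time": "2024-05-01T14:00:00+00:00", "ruleId": "920300", "field": "formula"},
]
print(false_positive_cleared(entries, "942440", "formula", applied))
```

A result of True is only meaningful if the affected form actually received traffic after the exclusion was applied, so the observation window should cover normal usage of the application.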
Troubleshooting Tree: Configure Detection or Prevention Mode
Color legend:
| Color | Node type |
|---|---|
| Dark blue | Initial symptom (root) |
| Blue | Diagnostic question |
| Red | Identified cause |
| Green | Recommended action or resolution |
| Orange | Intermediate validation or verification |
To use the tree when facing a real problem, start at the root node and answer each question based on what you observe in the environment. The first step is always to confirm that the WAF Policy is enabled and associated with the correct resource, because without this no inspection occurs. Next, verify that logs are reaching Log Analytics by checking the diagnostic settings. With logs available, the path splits according to the symptom: blocks on legitimate traffic point to false positives that require surgical rule exclusions; blocks that appear only after migrating from Detection to Prevention point to false positives that were never treated. Each path ends in a precise action proportional to the identified problem.
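The order of checks described above can be written down as a single function. This is a loose sketch of the walk-through in the paragraph, not a reproduction of the tree itself: the parameter names, the returned wording, and the final fallback are all assumptions made for illustration.

```python
def next_step(policy_enabled, policy_associated, firewall_logs_arriving,
              mode, legitimate_traffic_blocked):
    """Walk the checks in order and return the first action that applies."""
    if not (policy_enabled and policy_associated):
        return "Enable the WAF Policy and associate it with the correct resource"
    if not firewall_logs_arriving:
        return "Add ApplicationGatewayFirewallLog to the diagnostic settings"
    if mode == "Prevention" and legitimate_traffic_blocked:
        return "Create a rule exclusion for the rule and field causing the false positive"
    if mode == "Detection":
        return "Review Detection logs and treat false positives before enabling Prevention"
    return "No action identified by these checks"


# Scenario 1: policy active, logs present, legitimate checkout requests blocked.
print(next_step(True, True, True, "Prevention", True))
# Scenario 3: policy active but no firewall logs in the workspace.
print(next_step(True, True, False, "Detection", False))
```

Placing the policy and logging checks first mirrors the paragraph: without inspection or logs, none of the later questions can be answered from evidence.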