Troubleshooting Lab: Configure log settings in Azure Monitor

Diagnostic Scenarios

Scenario 1: Root Cause

The security team reports that audit logs from an Azure Key Vault are not appearing in the Log Analytics Workspace configured as the destination. The environment was set up three weeks ago and was working normally until two days ago.

The administrator checks the Key Vault's Diagnostic Setting and finds the following state:

{
"name": "diag-keyvault-prod",
"properties": {
"workspaceId": "/subscriptions/a1b2c3/resourceGroups/rg-monitoring/providers/Microsoft.OperationalInsights/workspaces/law-prod",
"logs": [
{ "category": "AuditEvent", "enabled": true },
{ "category": "AzurePolicyEvaluationDetails", "enabled": true }
],
"metrics": [
{ "category": "AllMetrics", "enabled": true }
]
}
}

The administrator also verifies that:

  • The Key Vault is in the East US region
  • The Log Analytics Workspace is in the West Europe region
  • The Workspace has 47 GB ingested in the last 30 days and the plan is Pay-As-You-Go
  • The Key Vault has 3 Azure policies assigned, none related to diagnostics
  • The service account used for KQL queries has the Log Analytics Reader role

When the query below is executed in the Workspace, the result is empty:

AzureDiagnostics
| where ResourceType == "VAULTS"
| where TimeGenerated > ago(48h)
| limit 10

What is the root cause of the missing logs in the Workspace?

A) The region difference between the Key Vault and Log Analytics Workspace prevents diagnostic log ingestion between different regions.

B) The Log Analytics Workspace was disconnected from the Diagnostic Setting when it reached a high ingestion volume, requiring manual reconnection.

C) The Workspace was deleted and recreated, generating a new Resource ID that no longer matches the one configured in the Diagnostic Setting.

D) The Log Analytics Reader role assigned to the service account does not allow viewing data from the AzureDiagnostics table, which requires elevated permissions.


Scenario 2: Action Decision

The operations team identified that the Diagnostic Setting of a production Azure SQL Database was configured incorrectly: all diagnostic logs are being sent to a Storage Account located in a different subscription, when the requirement was to send them to the Log Analytics Workspace of the production subscription.

The cause is confirmed. The current context is:

  • The Storage Account is in another subscription and the data already stored there belongs to another team
  • The production database serves 2,400 concurrent users at this moment
  • There is no maintenance window until Friday (in 4 days)
  • The audit team needs to query logs from the last 6 hours for an ongoing investigation
  • Changing or deleting the existing Diagnostic Setting does not cause database downtime

What is the correct action to take at this moment?

A) Delete the current Diagnostic Setting and create a new one pointing to the correct Log Analytics Workspace, taking advantage of the fact that the operation does not affect database availability.

B) Wait for Friday's maintenance window to make any changes to the Diagnostic Setting, avoiding risks in production.

C) Add the Log Analytics Workspace as an additional destination in the existing Diagnostic Setting without removing the Storage Account, ensuring that logs from the coming hours are captured for the ongoing investigation.

D) Request temporary access to the Storage Account from the other subscription's team so the audit team can query the logs directly from there, solving the immediate need without changing configurations.


Scenario 3: Root Cause

An administrator configures a Diagnostic Setting on a Virtual Network to send logs to a Log Analytics Workspace. After 2 hours, the administrator runs the query below and gets normal results:

AzureDiagnostics
| where Category == "VMProtectionAlerts"
| limit 5

The next day, the same administrator configures a second Diagnostic Setting on the same Virtual Network, this time with an Event Hub as the destination, to feed a SIEM pipeline. The portal completes the operation without errors. However, 6 hours later, the SIEM reports that no events were received from the Event Hub.

Additional information collected:

  • The Event Hub was created 8 months ago and is active with other data producers
  • The Event Hub namespace is in the same region as the Virtual Network
  • The administrator has the Owner role on the subscription
  • The Virtual Network has 12 configured subnets and 340 associated resources
  • The Diagnostic Setting created for the Event Hub appears listed in the portal with active status

The administrator checks the Event Hub metrics:

Event Hub: evh-siem-prod
Namespace: evhns-corp-eastus
Incoming Messages (last 6h): 0
Active Connections: 14
Outgoing Messages (last 6h): 892

What is the root cause of the problem?

A) The Event Hub has 14 active connections from other producers, reaching the simultaneous connection limit and blocking new data from the Diagnostic Setting.

B) The second Diagnostic Setting was created with log categories disabled by default, and no category was explicitly enabled during configuration.

C) The Diagnostic Setting for Event Hub requires the administrator to manually configure an Authorization Rule with Send permission on the namespace, and this step was omitted.

D) The Virtual Network already has an active Diagnostic Setting, and Azure Monitor does not allow more than one Diagnostic Setting per resource for the same destination type.


Scenario 4: Diagnostic Sequence

An administrator receives an alert reporting that diagnostic logs from an App Service have stopped being ingested into the Log Analytics Workspace. The App Service is in production and cannot be restarted.

The following investigation steps are available, out of order:

  1. Verify whether the App Service's Diagnostic Setting is enabled and has the correct categories activated
  2. Execute a KQL query in the Workspace to confirm data absence and identify the last ingested record
  3. Check the Log Analytics Workspace health status and confirm if there are ingestion alerts in Azure Monitor
  4. Confirm if the App Service is generating traffic and producing logs at the application level
  5. Check if there were recent changes to the Diagnostic Setting or Workspace using the subscription's Activity Log

What is the correct investigation sequence?

A) 2, 1, 5, 3, 4

B) 4, 2, 1, 5, 3

C) 1, 3, 2, 5, 4

D) 2, 5, 1, 3, 4


Answer Key and Explanations

Answer Key: Scenario 1

Answer: C

The critical clue is in the statement: the Diagnostic Setting was working normally until two days ago and suddenly stopped, with no change visible in its configuration. The workspaceId in the JSON points to a fixed Resource ID. If the Log Analytics Workspace was deleted and recreated, the new workspace is a different resource: it receives a new Workspace ID (GUID) even when the name is reused, and a new Resource ID whenever the name, resource group, or subscription differs. The Diagnostic Setting continues pointing to the previous workspace, which no longer exists, and ingestion silently fails.

The irrelevant information in the scenario is the 47 GB volume ingested in the Workspace. This data was included to induce the hypothesis of throttling or ingestion limit, which does not exist as an automatic disconnection mechanism in Azure Monitor.

  • Alternative A is false: Azure Monitor fully supports sending logs between different regions; the geographical difference between resource and Workspace does not prevent ingestion.
  • Alternative B describes a behavior that does not exist: Azure Monitor does not automatically disconnect Diagnostic Settings due to volume.
  • Alternative D confuses visibility with ingestion: the Log Analytics Reader role allows querying already ingested data. The problem is that data is not reaching the Workspace, not that it is inaccessible to the account.

The most dangerous distractor is A, as the region difference is real and visible information, and many administrators assume geographical restrictions that do not exist in Azure Monitor.
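
The stale-destination check described above can be sketched in Python. This is a minimal illustration, not an Azure SDK call: the `inventory` mapping (workspace Resource ID to creation time) and the dates are hypothetical, and would in practice come from whatever resource inventory is available. A workspace created after the Diagnostic Setting is treated as a sign of recreation.

```python
from datetime import datetime, timezone


def check_destination(setting, workspaces, setting_created):
    """Classify the workspace destination of a Diagnostic Setting.

    `workspaces` maps a workspace Resource ID to its creation time
    (hypothetical inventory data, not an Azure API response).
    """
    target = setting["properties"]["workspaceId"].lower()
    # ARM Resource IDs are case-insensitive, so compare in lowercase.
    known = {rid.lower(): created for rid, created in workspaces.items()}
    if target not in known:
        return "missing"
    if known[target] > setting_created:
        # The workspace is newer than the setting that points to it.
        return "recreated"
    return "ok"


setting = {
    "name": "diag-keyvault-prod",
    "properties": {
        "workspaceId": "/subscriptions/a1b2c3/resourceGroups/rg-monitoring"
        "/providers/Microsoft.OperationalInsights/workspaces/law-prod"
    },
}

# Illustrative dates: setting created three weeks ago, workspace two days ago.
setting_created = datetime(2024, 5, 1, tzinfo=timezone.utc)
inventory = {
    "/subscriptions/a1b2c3/resourcegroups/rg-monitoring"
    "/providers/microsoft.operationalinsights/workspaces/law-prod":
        datetime(2024, 5, 20, tzinfo=timezone.utc),
}

status = check_destination(setting, inventory, setting_created)
print(status)  # recreated
```

An empty inventory would return "missing", which corresponds to a workspace that was deleted and never recreated.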


Answer Key: Scenario 2

Answer: C

The most critical constraint in the scenario is the ongoing audit investigation, which cannot tolerate a gap in log collection over the coming hours. Deleting the current Diagnostic Setting (alternative A) and creating a new one takes time to propagate and leaves a window without collection. During this interval, logs generated by the database would not be captured anywhere, compromising the investigation.

Adding the Log Analytics Workspace as an additional destination without removing the Storage Account ensures immediate and continuous coverage. A single Diagnostic Setting can have multiple simultaneous destinations, and this change does not cause downtime.

  • Alternative B is wrong because waiting for Friday's maintenance window is unnecessary: Diagnostic Setting changes do not affect the monitored resource's availability.
  • Alternative D solves the audit's immediate problem but does not fix the incorrect configuration, leaving the environment in a non-compliant state for 4 more days.

The most dangerous distractor is A: technically correct in another context, but it ignores the collection gap during recreation, which is exactly what the ongoing investigation cannot tolerate.
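
The "add a destination, remove nothing" change can be sketched as a transformation of the setting's JSON payload. This is only an illustration of the intended end state: the setting name, both Resource IDs, and the log category are hypothetical, and the snippet does not call Azure. `storageAccountId` and `workspaceId` are the property names used in Diagnostic Setting payloads.

```python
import copy


def add_workspace_destination(setting, workspace_id):
    """Return a copy of the setting with a workspace destination added.

    The existing storage destination is left untouched, so collection
    continues without a gap while the new destination starts receiving data.
    """
    updated = copy.deepcopy(setting)
    updated["properties"]["workspaceId"] = workspace_id
    return updated


# Hypothetical current state: logs go only to a Storage Account
# in another subscription.
current = {
    "name": "diag-sqldb-prod",
    "properties": {
        "storageAccountId": "/subscriptions/sub-other/resourceGroups/rg-archive"
        "/providers/Microsoft.Storage/storageAccounts/stlogsarchive",
        "logs": [{"category": "Errors", "enabled": True}],
    },
}

# Hypothetical production workspace.
workspace = (
    "/subscriptions/sub-prod/resourceGroups/rg-monitoring"
    "/providers/Microsoft.OperationalInsights/workspaces/law-prod"
)

updated = add_workspace_destination(current, workspace)
print(sorted(updated["properties"]))
```

The Storage Account destination can then be removed in a later, separate change once the Workspace is confirmed to be receiving data.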

Answer Key: Scenario 3

Answer: B

The root cause is that the second Diagnostic Setting was created without enabling any log category. In the Azure portal, when creating a new Diagnostic Setting, individual log categories are not automatically enabled: the administrator must explicitly select each desired category. The Diagnostic Setting appears as "active" because it was successfully saved, but with zero enabled categories.

The confirmatory clue is in the Event Hub metrics: Incoming Messages (last 6h): 0 with Active Connections: 14 and Outgoing Messages: 892. The Event Hub is working normally for other producers. The problem is at the source, not the destination.

The irrelevant information is the number of subnets and resources associated with the Virtual Network. This data was included to suggest complexity and induce hypotheses about scale limits.

  • Alternative A is false: 14 active connections is far below Event Hubs connection quotas, and the 892 outgoing messages show that the hub is serving its other producers and consumers normally.
  • Alternative C describes a real restriction in other contexts, but the statement informs that the administrator has the Owner role on the subscription, which includes permissions over Event Hub namespaces.
  • Alternative D is false: Azure Monitor allows multiple Diagnostic Settings per resource (up to five), and each can point to different destinations. Two Diagnostic Settings on the same resource with different destinations is a supported and common pattern.

The most dangerous distractor is D, as the "one destination per type" limitation sounds like a plausible technical restriction and many administrators do not test this scenario in practice.
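
A saved setting that exports nothing can be detected by inspecting its payload. The sketch below assumes the setting has already been retrieved as JSON in the same shape as the Scenario 1 example; the setting name `diag-vnet-siem` is hypothetical.

```python
def enabled_log_categories(setting):
    """Return the names of log categories that are enabled in the setting."""
    logs = setting.get("properties", {}).get("logs", [])
    return [entry["category"] for entry in logs if entry.get("enabled")]


# Hypothetical second setting: saved successfully, but no category is enabled.
siem_setting = {
    "name": "diag-vnet-siem",
    "properties": {
        "eventHubName": "evh-siem-prod",
        "logs": [{"category": "VMProtectionAlerts", "enabled": False}],
    },
}

active = enabled_log_categories(siem_setting)
if not active:
    print("Setting is saved but exports nothing: no log category is enabled")
```

Running the same function against the first setting, the one that sends to the Log Analytics Workspace, gives a quick side-by-side comparison of what each setting actually exports.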


Answer Key: Scenario 4

Answer: A

The correct sequence is: 2, 1, 5, 3, 4.

The correct diagnostic reasoning starts from the observation point closest to the reported symptom and advances toward possible causes:

Step 2 confirms the symptom with precision: the KQL query determines exactly when ingestion stopped, transforming a generic alert into objective data with timestamp.

Step 1 checks the Diagnostic Setting configuration: if categories are disabled or the destination was changed, the cause is here and subsequent steps are unnecessary.

Step 5 investigates if there was a recent change that explains the interruption: the Activity Log records changes in Diagnostic Settings and the Workspace, revealing human or automated actions that coincide with the timestamp identified in step 2.

Step 3 checks Workspace health: ingestion problems on the destination side (throttling, degraded state) only make sense to investigate after ruling out problems at the source.

Step 4 is last because checking if the App Service is generating logs is only relevant if all other components are working correctly. Additionally, the statement already indicates that the App Service is in active production, making this step the least urgent.

Alternative B starts with the App Service, ignoring that the symptom was already confirmed externally by an alert. Alternative C starts with configuration without first confirming the extent of the problem. Alternative D replicates the correct sequence but reverses steps 1 and 5, investigating changes before confirming the current configuration state.
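
The ordering logic can be sketched as a small runner that executes checks in the sequence 2, 1, 5, 3, 4 and stops at the first one that identifies a cause. The check functions here are placeholders returning canned results (the Activity Log finding is invented for illustration); in a real investigation each would wrap a KQL query or a portal verification.

```python
def run_investigation(checks):
    """Run checks in order and stop at the first one that reports a cause.

    Each check returns None when it finds nothing, or a string
    describing the cause it identified.
    """
    performed = []
    for step, _description, check in checks:
        performed.append(step)
        finding = check()
        if finding is not None:
            return performed, finding
    return performed, None


# Placeholder checks in the order of answer A; results are illustrative.
checks = [
    (2, "Confirm data absence and last ingested record (KQL)", lambda: None),
    (1, "Inspect the Diagnostic Setting and its categories", lambda: None),
    (5, "Review the Activity Log for recent changes",
     lambda: "Diagnostic Setting modified by a recent deployment"),
    (3, "Check Workspace health and ingestion alerts", lambda: None),
    (4, "Confirm the App Service is producing logs", lambda: None),
]

performed, finding = run_investigation(checks)
print(performed, finding)
```

In this illustrative run the cause is found at step 5, so steps 3 and 4 are never executed, which mirrors the reasoning that later steps only matter once earlier ones are ruled out.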


Troubleshooting Tree: Configure log settings in Azure Monitor


Color legend:

  • Dark blue: Initial symptom (entry point)
  • Blue: Diagnostic question (yes/no decision)
  • Red: Identified cause
  • Green: Recommended action or resolution
  • Orange: Intermediate validation or verification

To use this tree when facing a real problem, start with the root node describing the observed symptom and follow the diagnostic questions answering yes or no based on what you can verify directly in the portal or via KQL query. Each path ends with a named cause followed by a specific action and a validation step that confirms if the correction was effective. The goal is to avoid corrective actions before confirming the diagnosis: only proceed to action after precisely identifying the corresponding cause node.
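
The way such a tree is traversed can be sketched as follows. The nodes below are hypothetical and only loosely based on the scenarios in this lab; they are not the contents of the original diagram. Each question node branches on a yes/no answer, and each cause node carries its action and validation step.

```python
# Hypothetical nodes for illustration; not the original diagram's contents.
TREE = {
    "q_categories": {
        "kind": "question",
        "text": "Is at least one log category enabled?",
        "yes": "q_destination",
        "no": "c_no_categories",
    },
    "q_destination": {
        "kind": "question",
        "text": "Does the configured destination still exist?",
        "yes": "c_check_source",
        "no": "c_stale_destination",
    },
    "c_no_categories": {
        "kind": "cause",
        "text": "No log category enabled",
        "action": "Enable the required categories",
        "validation": "Re-run the KQL query after new events are generated",
    },
    "c_stale_destination": {
        "kind": "cause",
        "text": "Destination was deleted or recreated",
        "action": "Point the Diagnostic Setting at the current destination",
        "validation": "Confirm new records arrive in the destination",
    },
    "c_check_source": {
        "kind": "cause",
        "text": "Configuration is correct; source may not be producing logs",
        "action": "Verify the resource is generating log events",
        "validation": "Confirm new records arrive in the destination",
    },
}


def walk(tree, start, answers):
    """Follow yes/no answers from the start node until a cause node is reached."""
    node_id = start
    path = [node_id]
    while tree[node_id]["kind"] == "question":
        branch = "yes" if answers[node_id] else "no"
        node_id = tree[node_id][branch]
        path.append(node_id)
    return path, tree[node_id]


path, cause = walk(TREE, "q_categories", {"q_categories": True, "q_destination": False})
print(path)
print(cause["action"])
```

Because the action is only read from a cause node, the walk enforces the rule stated above: no corrective action is selected before the diagnosis reaches a named cause.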