Troubleshooting Lab: Interpret Virtual Network Flow Logs
Diagnostic Scenarios
Scenario 1: Root Cause
An operations team configured VNet Flow Logs in a production VNet called vnet-core-prod and enabled Traffic Analytics with a 10-minute aggregation interval. The associated Log Analytics workspace is law-netops-prod. After 48 hours, the team accesses Traffic Analytics in the portal and finds that no flow data is being displayed. The traffic map appears empty and KQL queries in the workspace return zero records for the NTANetAnalytics table.
The analyst verifies the configurations and collects the following information:
VNet Flow Logs Status: Enabled
Storage Account: stflowlogsprod001 (LRS, East US region)
Retention (days): 30
Traffic Analytics: Enabled
Workspace: law-netops-prod (Brazil South region)
Aggregation Interval: 10 minutes
VNet: vnet-core-prod (East US region)
The analyst also confirms that the storage account stflowlogsprod001 is receiving JSON files normally in the insights-logs-flowlogflowevent container, and that the workspace law-netops-prod is active and responding to other queries from other resources.
What is the root cause of the problem?
A) The 10-minute aggregation interval is too short for Traffic Analytics to process and display data in VNets with high flow volume.
B) The Log Analytics workspace is in a different region from the VNet and storage account, which prevents Traffic Analytics from processing the data correctly.
C) The storage account with LRS replication is not supported by VNet Flow Logs, and the written JSON files are discarded before reaching Traffic Analytics.
D) The insights-logs-flowlogflowevent container is the correct destination for NSG Flow Logs, not for VNet Flow Logs, indicating that the configured resource is of the wrong type.
Scenario 2: Collateral Impact
An organization's security team identified that the VNet Flow Logs from vnet-dmz-prod were being stored in a storage account with a retention policy set to 7 days. To meet a new compliance requirement that mandates a minimum retention of 90 days, the administrator changed the retention value directly in the VNet Flow Logs configuration to 90 days and confirmed the change in the portal. The change was successfully applied and the portal displays the new value.
What secondary consequence might this change cause?
A) Flow records from the 7 days prior to the change will be automatically replicated to a cold storage tier, immediately increasing transaction costs.
B) The storage cost of the associated account will progressively increase as log blobs accumulate for 90 days without automatic deletion, potentially exceeding the provisioned budget for the account.
C) Traffic Analytics will stop processing flow data immediately after the retention change, as the service requires retention and aggregation interval to be synchronized.
D) Logs generated before the retention change will be deleted immediately, as the new policy retroactively overwrites the lifecycle of existing blobs.
Scenario 3: Action Decision
A company's security team needs to urgently identify whether a specific external IP address (198.51.100.42) established connections with any VM in vnet-app-prod in the last 6 hours. VNet Flow Logs is enabled and data is already available in the Log Analytics workspace law-security-prod. The team has access to the Azure portal and Log Analytics. The incident is ongoing and each minute of delay increases the risk.
The trigger for the investigation is already known: suspicious traffic from an external source. The objective now is to quickly and accurately confirm or rule out the presence of this IP in the flow records.
What is the correct action to take at this moment?
A) Access Traffic Analytics in the portal and use the IP filter in the traffic map view to locate the address 198.51.100.42.
B) Execute a KQL query directly in the workspace law-security-prod, filtering the flow log tables by source IP for the last 6 hours.
C) Download the raw JSON files from the storage account for the last 6 hours and perform a manual search for the IP in the records.
D) Enable packet capture diagnostics on all VMs in vnet-app-prod to confirm in real time if the IP is still active.
Scenario 4: Root Cause
A network analyst is investigating why a VM called vm-api-01 in the subnet-api subnet of vnet-backend-prod does not appear in the VNet Flow Logs flow records, while all other VMs in the same subnet appear normally. The analyst executes the following KQL query and confirms the absence of records for this VM:
NTANetAnalytics
| where TimeGenerated > ago(24h)
| where SrcIp == "10.2.1.15" or DestIp == "10.2.1.15"
| project TimeGenerated, SrcIp, DestIp, FlowStatus, AllowedInFlows, DeniedInFlows
The result returns zero rows. The analyst verifies that the VM is running, that its NIC is associated with the correct subnet, and that IP 10.2.1.15 is indeed the private IP of vm-api-01. VNet Flow Logs is enabled at the vnet-backend-prod VNet level. Traffic Analytics is active with a 10-minute interval. The VM was recently migrated from vnet-legacy-prod to vnet-backend-prod 36 hours ago.
Other information collected by the analyst:
- The NSG associated with the NIC of vm-api-01 was created more than 6 months ago and is active.
- The VM has an associated public IP configured as Static.
- The Log Analytics workspace is receiving data from all other VMs normally.
What is the root cause of the problem?
A) The static public IP associated with vm-api-01 prevents VNet Flow Logs from recording the VM's private traffic, as the service prioritizes tracking flows with dynamic public IP.
B) VNet Flow Logs requires a propagation period after migrating a VM between VNets, and the 36 hours elapsed are still within the expected window of up to 72 hours for recording to begin.
C) The KQL query is directed to the NTANetAnalytics table, which is populated by Traffic Analytics and has aggregation latency. Since the VM was migrated 36 hours ago and Traffic Analytics aggregates in intervals, the data may not have been processed for this IP yet.
D) VNet Flow Logs was enabled in the VNet before the VM migration, but the service does not automatically detect new NICs added to the VNet after enablement, requiring re-enablement of the resource.
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
Traffic Analytics requires that the Log Analytics workspace be in the same region as the monitored VNet or in a supported region for the pair. When the workspace is in an incompatible region or different from what is expected for processing, the service silently fails in the ingestion and aggregation step, without displaying explicit errors in the portal. The result is exactly what was observed: the storage account receives the data correctly, but Traffic Analytics does not process it.
The decisive clue is in the combination: vnet-core-prod and stflowlogsprod001 in East US, with the workspace in Brazil South. The data flow to the storage account works independently of Traffic Analytics, which explains why the JSONs arrive normally but no data appears in the map.
Distractor A is irrelevant: the 10-minute interval is the smallest available and does not cause total absence of data, only greater visualization latency in low-volume environments. Distractor C is factually incorrect: LRS is supported by VNet Flow Logs. Distractor D represents confusion between VNet Flow Logs and NSG Flow Logs, which are distinct resources with different destination containers, but the statement describes VNet Flow Logs configured correctly regarding the resource type.
Acting on distractor D would lead to reconfiguring a resource that is working correctly, wasting time and not solving the real problem.
Answer Key: Scenario 2
Answer: B
The retention policy configured in VNet Flow Logs controls how long log blobs are kept in the storage account before being automatically deleted. By increasing from 7 to 90 days, the accumulated data volume in the account will grow continuously throughout the new retention period. For VNets with intensive traffic, this difference can represent a significant and progressive increase in storage cost, especially if the account does not have lifecycle policies or cost alerts configured.
This is the real and relevant collateral impact: the change is technically correct to meet the compliance requirement, but has a direct financial consequence that may not have been anticipated.
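The growth described above can be estimated with simple arithmetic. The sketch below assumes a constant daily log volume and a flat price per GiB per month; both figures are hypothetical placeholders, not values from this scenario or from Azure pricing.

```python
def steady_state_gib(daily_gib: float, retention_days: int) -> float:
    """Approximate volume kept in storage once the retention window is full."""
    return daily_gib * retention_days


def monthly_storage_cost(daily_gib: float, retention_days: int,
                         price_per_gib_month: float) -> float:
    """Approximate monthly cost of the steady-state volume."""
    return steady_state_gib(daily_gib, retention_days) * price_per_gib_month


# Hypothetical inputs: replace with measured volume and the real price.
DAILY_GIB = 5.0
PRICE_PER_GIB_MONTH = 0.02

for days in (7, 90):
    stored = steady_state_gib(DAILY_GIB, days)
    cost = monthly_storage_cost(DAILY_GIB, days, PRICE_PER_GIB_MONTH)
    print(f"{days:>2} days retention: ~{stored:.0f} GiB stored, ~{cost:.2f} per month")
```

Under these assumptions the stored volume scales with the retention ratio (90/7, roughly 13 times), which is why the increase is gradual at first and only reaches its full size after the 90-day window fills.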
Distractor A is incorrect: the VNet Flow Logs retention policy does not trigger automatic replication to cold tiers; blobs are simply kept or deleted according to the retention period. Distractor C is incorrect: retention and aggregation interval are independent settings and do not need to be synchronized. Distractor D is the most dangerous: the new retention policy is not retroactive; it only changes the deletion behavior of future blobs, not existing ones.
Answer Key: Scenario 3
Answer: B
In an active incident, speed and accuracy are the critical constraints. The Log Analytics workspace already contains the ingested flow data, and a direct KQL query allows filtering by source IP, exact time window, and relevant fields in seconds. This is the fastest, most accurate, and most auditable path to confirm or rule out the presence of the IP.
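As an illustration, the query for this scenario could be assembled by a small helper that validates the address before placing it in the KQL text. This is a sketch under assumptions: the table name and the SrcIp and DestIp columns are taken from the query in Scenario 4, while SrcPublicIps and DestPublicIps are assumed column names for externally sourced flows and should be checked against the workspace schema. The helper name is hypothetical.

```python
import ipaddress


def build_ip_lookup_query(ip: str, hours: int,
                          table: str = "NTANetAnalytics") -> str:
    """Return KQL text that looks for one IP within the last `hours` hours."""
    # ip_address() raises ValueError for anything that is not a valid address,
    # which also keeps stray quotes or operators out of the query text.
    addr = str(ipaddress.ip_address(ip))
    if hours <= 0:
        raise ValueError("hours must be positive")
    return "\n".join([
        table,
        f"| where TimeGenerated > ago({hours}h)",
        f'| where SrcIp == "{addr}" or DestIp == "{addr}"',
        # Assumed columns; `contains` is a substring match and may over-match.
        f'    or SrcPublicIps contains "{addr}" or DestPublicIps contains "{addr}"',
        "| order by TimeGenerated desc",
    ])


print(build_ip_lookup_query("198.51.100.42", 6))
```

The printed text can be pasted into the Logs blade of law-security-prod. An empty result only rules out the IP for data that has already been ingested into the workspace.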
Distractor A seems reasonable, but Traffic Analytics has aggregation latency of up to 60 minutes and the map interface does not offer the granularity and exactness of a direct KQL query. In an urgent incident, relying on the map increases the risk of a false negative due to latency.
Distractor C is technically valid as a last-resort method, but is extremely slow and impractical when the workspace is already available with processed data.
Distractor D represents a valid action for real-time investigation, but enabling packet capture on all VMs in a production VNet is disruptive, time-consuming, and unnecessary when flow logs already contain the history of the last 6 hours. Additionally, it would not confirm what already happened, only what is happening now.
Answer Key: Scenario 4
Answer: C
The NTANetAnalytics table is populated by Traffic Analytics, not directly by VNet Flow Logs. Traffic Analytics aggregates raw flow log data at the configured interval (in this case, 10 minutes) and then writes the processed results to the table. For a VM migrated only 36 hours ago, the data may still be going through aggregation and ingestion into the workspace, especially considering that Traffic Analytics can take up to several hours to stabilize the visualization of a new endpoint.
The decisive clue is the combination of two factors: the recent migration (36 hours) and the use of the NTANetAnalytics table, which is the output of Traffic Analytics and not raw flow logs. To confirm if raw data exists, the analyst should query the JSONs in the storage account or use the raw flow logs table, if configured.
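A minimal sketch of that raw-data check follows. It assumes the blob JSON uses the layout records, flowRecords, flows, flowGroups, flowTuples, with each tuple a comma-separated string whose second and third fields are the source and destination IP; verify that layout against a real blob before relying on it. The sample document and the function name are illustrative only.

```python
import json


def find_ip_in_flow_log(doc: dict, ip: str) -> list[dict]:
    """Return the flow tuples in one flow-log document that involve `ip`."""
    matches = []
    for record in doc.get("records", []):
        for flow in record.get("flowRecords", {}).get("flows", []):
            for group in flow.get("flowGroups", []):
                for raw in group.get("flowTuples", []):
                    fields = raw.split(",")
                    if len(fields) < 3:
                        continue  # skip malformed tuples
                    src, dst = fields[1], fields[2]
                    if ip in (src, dst):
                        matches.append({"rule": group.get("rule"),
                                        "src": src, "dst": dst, "raw": raw})
    return matches


# Illustrative sample in the assumed layout, not real log data.
SAMPLE = json.loads("""
{"records": [{"flowRecords": {"flows": [{"flowGroups": [{
  "rule": "DefaultRule_AllowInternetOutBound",
  "flowTuples": [
    "1663146003599,10.2.1.15,192.0.2.180,23956,443,6,O,B,NX,0,0,0,0",
    "1663146003606,10.2.1.4,192.0.2.180,23957,443,6,O,E,NX,3,767,2,1580"
  ]}]}]}}]}
""")

hits = find_ip_in_flow_log(SAMPLE, "10.2.1.15")
print(len(hits), "matching tuple(s)")
```

If tuples for 10.2.1.15 exist in the blobs while NTANetAnalytics returns nothing, the gap lies in Traffic Analytics processing rather than in flow log collection.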
Distractor A is factually incorrect: the type of public IP (static or dynamic) does not influence private traffic recording by VNet Flow Logs.
Distractor B invents a non-existent behavior: there is no documentation of a 72-hour propagation window for starting recording of new VMs.
Distractor D represents the most common reasoning error in this scenario: confusing the coverage of VNet Flow Logs, which is at the VNet level and automatically covers all associated NICs, with static recording behavior that would require re-enablement. VNet Flow Logs automatically detects NICs added to the VNet after enablement.
Troubleshooting Tree: Interpret Virtual Network Flow Logs
Color legend:
- Dark blue: initial symptom, entry point in the tree
- Blue: diagnostic question node, requires active observation or verification
- Red: identified cause, confirmed root of problem
- Green: recommended action or resolution applicable to context
- Orange: validation or intermediate verification after corrective action
To use this tree when facing a real problem, start at the root node describing the observed symptom: total absence of data, partial data, or unexpected results in queries. Answer each question based on what you observe in the portal, storage account, or Log Analytics workspace. Each branch eliminates a class of cause and directs to the next more specific verification. Do not skip steps: confirming that JSONs reach storage before investigating Traffic Analytics is fundamental to avoid diagnosing the wrong component.
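The ordering rule in the paragraph above can be sketched as a small function. This is not the tree itself, only a simplified illustration of checking storage before Traffic Analytics; the function name and the returned labels are hypothetical.

```python
def next_check(blobs_in_storage: bool, rows_in_workspace: bool) -> str:
    """Suggest which component to inspect next, storage first."""
    if not blobs_in_storage:
        # No JSON blobs: the problem is upstream of Traffic Analytics.
        return "flow logs: verify the flow log resource and its storage account"
    if not rows_in_workspace:
        # Blobs exist but the workspace is empty: look at processing.
        return "traffic analytics: verify workspace configuration and processing"
    # Data exists in both places: the query itself is the likely suspect.
    return "query: review table, time range, and filters"


# Scenario 1 pattern: blobs arrive, workspace table stays empty.
print(next_check(blobs_in_storage=True, rows_in_workspace=False))
```

Encoding the order this way makes the rule explicit: the workspace state is never consulted until the storage check has passed.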