
Theoretical Foundation: Configure and interpret monitoring of virtual machines, storage accounts, and networks by using Azure Monitor Insights


1. Initial Intuition​

Imagine you are responsible for a physical data center. You have panels with lights indicating each server's status, audio alerts when a disk is full, and daily network consumption reports. Without this, you would only discover problems when an application went down.

In Azure, the equivalent of this control panel is Azure Monitor. It collects performance, availability, and behavior data from all your resources. Azure Monitor Insights is the specialized layer that turns this raw data into ready-to-use visualizations and analyses tailored to each resource type.

The difference between generic Azure Monitor and Azure Monitor Insights is this: Monitor is the collection and analysis engine; Insights is the dashboard already configured and optimized for each resource type (VMs, Storage, Networks).


2. Context​

Azure Monitor exists because cloud resources are dynamic, distributed, and ephemeral. A VM might be consuming 100% CPU without anyone noticing. A storage account might be generating abnormal latency. A VNet might have traffic silently blocked by an NSG.

Without observability, you operate in the dark.

Azure Monitor positions itself as the central observability platform of Azure, collecting three types of data:

  • Metrics: numerical values collected at regular intervals (e.g., CPU %, network bytes).
  • Logs: structured records of events and operations (e.g., VM login, blob operation).
  • Traces: distributed execution traces in applications (more relevant for Application Insights).

Azure Monitor Insights consumes this data and presents it in pre-built visual experiences, without you needing to create dashboards from scratch.


3. Building Concepts​

3.1 The Foundation: Metrics vs Logs​

Before diving into specific Insights, it's essential to distinguish these two pillars:

Characteristic | Metrics | Logs
Data type | Numerical, time series | Structured text, records
Default retention | 93 days | Configurable (30 to 730 days)
Latency | Seconds | 1 to 2 minutes
Cost | Included (basic) | Based on ingested volume
Query | Metrics Explorer | Log Analytics (KQL)
Examples | CPU %, Disk IOPS, Bytes/s | User login, application error, resource creation

Metrics are ideal for real-time alerts and trend visualizations. Logs are ideal for deep diagnosis, event correlation, and auditing.

3.2 Log Analytics Workspace​

The Log Analytics Workspace is the central repository where Azure Monitor logs are stored and queried. It's an Azure resource that you create in a region and subscription.

Important characteristics:

  • Multiple resources can send data to the same workspace.
  • Queries are made in KQL (Kusto Query Language).
  • The workspace defines data retention policy.
  • VM and Network Insights depend on a configured workspace.

3.3 Diagnostic Settings​

For logs and some detailed metrics to reach the Log Analytics Workspace, you need to configure Diagnostic Settings on each resource.

Each Diagnostic Setting defines:

  • What to collect: log and metric categories.
  • Where to send: Log Analytics Workspace, Storage Account, Event Hub, or Partner Solutions.

Non-obvious point: platform metrics are automatically collected by Azure without configuration. But resource logs and detailed metrics require explicit Diagnostic Settings.


4. VM Insights​

4.1 What it is​

VM Insights is the specialized monitoring experience for virtual machines and Virtual Machine Scale Sets. It provides ready-made performance visualizations and dependency maps between processes and network connections.

4.2 Prerequisite: Azure Monitor Agent​

For VM Insights to work, each VM needs the Azure Monitor Agent (AMA) installed. AMA replaces the legacy Log Analytics agent (MMA) and is the currently recommended approach. The Map feature additionally relies on the Dependency Agent, which runs alongside AMA.

AMA collects:

  • Operating system performance metrics (CPU, memory, disk, network).
  • Windows Event logs and Linux Syslog.
  • Process and network connection data gathered by the Dependency Agent (for the Map feature).

AMA installation can be done via:

  • VM extension in the portal.
  • Azure Policy (for scale deployment).
  • Bicep/ARM/Terraform.

4.3 Data Collection Rules (DCR)​

AMA doesn't collect data autonomously. It needs a Data Collection Rule (DCR) that defines what to collect, how to transform, and where to send.
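The three parts of a DCR (what to collect, where to send, and how the two are wired together) can be sketched as a plain data structure. This is a minimal illustration, assuming the JSON layout commonly shown for VM Insights rules; the resource ID placeholders, the `vmInsightsWorkspace` label, and the helper name `build_vm_insights_dcr` are hypothetical, and the field names should be checked against the current DCR schema before use.

```python
import json

# Hypothetical placeholder; substitute a real workspace resource ID.
WORKSPACE_ID = (
    "/subscriptions/{sub}/resourceGroups/{rg}"
    "/providers/Microsoft.OperationalInsights/workspaces/{workspaceName}"
)


def build_vm_insights_dcr(workspace_id: str, location: str) -> dict:
    """Return a DCR body with one data source, one destination, and one data flow."""
    destination_name = "vmInsightsWorkspace"  # arbitrary label, referenced by the data flow
    return {
        "location": location,
        "properties": {
            # What to collect.
            "dataSources": {
                "performanceCounters": [
                    {
                        "name": "vmInsightsPerfCounters",
                        "streams": ["Microsoft-InsightsMetrics"],
                        "samplingFrequencyInSeconds": 60,
                        "counterSpecifiers": ["\\VmInsights\\DetailedMetrics"],
                    }
                ]
            },
            # Where to send.
            "destinations": {
                "logAnalytics": [
                    {"workspaceResourceId": workspace_id, "name": destination_name}
                ]
            },
            # Which stream goes to which destination. An optional KQL
            # transformation would also be declared on the data flow.
            "dataFlows": [
                {
                    "streams": ["Microsoft-InsightsMetrics"],
                    "destinations": [destination_name],
                }
            ],
        },
    }


dcr = build_vm_insights_dcr(WORKSPACE_ID, "eastus")
print(json.dumps(dcr, indent=2))
```

The point of the structure is that a data flow can only reference a destination declared in the same rule, which is why one reusable DCR can be associated with many VMs.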


4.4 What VM Insights shows​

VM Insights presents three main tabs:

Performance Tab: Displays ready-made performance charts for each VM or group of VMs:

  • CPU Utilization %
  • Available Memory (MB)
  • Bytes Sent/Received per second
  • Disk I/O (reads and writes)
  • Logical Disk Space Used %

This data comes from metrics collected by AMA and can be filtered by time range, subscription, resource group, or individual VM.

Map Tab: Displays a visual dependency map: what processes are running on the VM and which active TCP network connections exist, including source, destination, and port. This is valuable for:

  • Discovering undocumented communications between services.
  • Mapping dependencies before a migration.
  • Diagnosing connectivity failures between applications.

Overview Tab (Get Started): Shows VM Insights enablement status per VM, indicating which have the agent installed and configured.

4.5 Relevant KQL tables for VMs​

When querying VM data in Log Analytics, the main tables are:

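Table | Content
InsightsMetrics | Performance data collected by VM Insights (CPU, memory, disk, network)
Perf | Performance counters collected by agent when a DCR sends them to this table
Event | Windows Event Log events
Syslog | Linux system logs
VMConnection | TCP connections to and from the VM (requires Map feature)
VMProcess | Processes running on the VM (requires Map feature)
Heartbeat | Agent heartbeat signal every minute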

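Example KQL query to find VMs that reported a heartbeat in the last 24 hours but not in the last 5 minutes. The look-back window must be wider than the threshold; otherwise silent VMs are filtered out before they can be detected:

Heartbeat
| where TimeGenerated > ago(1d)
| summarize LastHeartbeat = max(TimeGenerated) by Computer
| where LastHeartbeat < ago(5m)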
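The same "latest heartbeat per computer, then compare against a threshold" logic can be sketched over in-memory records. This is an illustration only: the `stale_computers` helper and the sample VM names are hypothetical, and real data would come from the Heartbeat table rather than a Python list. Note that a VM that stopped reporting can only be found if the input still contains its older heartbeats.

```python
from datetime import datetime, timedelta, timezone


def stale_computers(heartbeats, now, threshold=timedelta(minutes=5)):
    """Return {computer: last_heartbeat} for computers silent for longer than threshold."""
    last_seen = {}
    for computer, timestamp in heartbeats:
        # Equivalent of: summarize max(TimeGenerated) by Computer
        if computer not in last_seen or timestamp > last_seen[computer]:
            last_seen[computer] = timestamp
    # Equivalent of: where LastHeartbeat < ago(5m)
    return {c: ts for c, ts in last_seen.items() if now - ts > threshold}


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
sample = [
    ("vm-web-01", now - timedelta(minutes=2)),
    ("vm-web-01", now - timedelta(minutes=1)),
    ("vm-db-01", now - timedelta(minutes=13)),
    ("vm-db-01", now - timedelta(minutes=12)),
]

print(stale_computers(sample, now))
```

Here only `vm-db-01` is reported, because its most recent heartbeat is 12 minutes old.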

5. Storage Insights​

5.1 What it is​

Storage Insights (part of Azure Monitor for Storage) provides a unified view of Storage Account performance, capacity, and availability. It works for Blob, File, Queue, and Table storage.

5.2 What it monitors​

Storage Insights works with three types of data:

Transaction metrics (collected automatically, without Diagnostic Settings):

  • Availability %
  • Total transactions
  • Average server latency and end-to-end latency (E2E Latency)
  • Error rate (ServerErrors, ClientErrors)

Capacity metrics (collected automatically):

  • Used Capacity
  • Blob Count
  • Container Count

Resource logs (require enabled Diagnostic Settings):

  • StorageRead, StorageWrite, StorageDelete: detailed operations by blob, file, or queue.
  • Includes: source IP, operation, status, response time, object size.

5.3 Storage Insights Structural View​


5.4 Relevant KQL tables for Storage​

Table | Content
StorageBlobLogs | Blob operations (read, write, delete)
StorageFileLogs | Azure Files operations
StorageQueueLogs | Queue operations
StorageTableLogs | Table operations

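Example KQL query to identify the top 10 IPs with most blob read operations. CallerIpAddress includes the source port, so the query strips it before grouping (this assumes IPv4 callers):

StorageBlobLogs
| where OperationName == "GetBlob"
| extend CallerIp = tostring(split(CallerIpAddress, ":")[0])
| summarize Count = count() by CallerIp
| top 10 by Count desc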
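For readers less familiar with KQL, the filter, count, and top-N steps of the query above can be mirrored in a few lines of Python. This is a sketch under stated assumptions: the `top_callers` helper and the sample records are hypothetical, and the port stripping assumes the logged caller address is IPv4 with an optional ":port" suffix.

```python
from collections import Counter


def top_callers(records, operation="GetBlob", n=10):
    """Count matching operations per caller IP and return the n most frequent."""
    counts = Counter()
    for record in records:
        if record["OperationName"] != operation:
            continue
        # Assumption: IPv4 address with an optional ":port" suffix.
        ip = record["CallerIpAddress"].split(":")[0]
        counts[ip] += 1
    return counts.most_common(n)


sample_logs = [
    {"OperationName": "GetBlob", "CallerIpAddress": "203.0.113.5:50001"},
    {"OperationName": "GetBlob", "CallerIpAddress": "203.0.113.5:50002"},
    {"OperationName": "GetBlob", "CallerIpAddress": "198.51.100.7:40100"},
    {"OperationName": "PutBlob", "CallerIpAddress": "198.51.100.7:40101"},
]

print(top_callers(sample_logs))
```

Without the port stripping, the two requests from 203.0.113.5 would be counted as two different callers.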

5.5 Non-obvious Storage Insights behavior​

Storage Insights aggregates data from multiple storage accounts in a single view, allowing performance comparison between accounts. This is valuable in environments with dozens of storage accounts.

Another important point: transaction metrics granularity is 1 minute. For latency analysis in critical operations, this may be sufficient to identify spikes, but not for exact correlation with specific log events.


6. Network Insights​

6.1 What it is​

Network Insights (Azure Monitor for Networks) provides a topological and health view of all Azure network resources in a subscription or defined scope. It's the centralized experience for monitoring VNets, NSGs, Load Balancers, Application Gateways, VPN Gateways, ExpressRoute, Private Endpoints, and much more.

6.2 Network Insights Overview​

Network Insights is organized in tabs:

Overview: Presents a health summary of all network resources grouped by type: how many are healthy, in alert, or critical state.

Connectivity: Allows testing and visualizing connectivity between resources using Connection Monitor. Shows latency, packet loss, and reachability status between source and destination.

Traffic: Integrates data from NSG Flow Logs and Traffic Analytics to visualize traffic patterns. Shows the most active IP pairs, protocols used, and blocked vs allowed flows.

Diagnostic Toolkit: Groups diagnostic tools like IP Flow Verify, Next Hop, Effective Routes, Security Group View, and Packet Capture, all provided by Network Watcher.

Topology: Displays an interactive visual diagram of network topology: VNets, subnets, VMs, Load Balancers, Gateways, and their connections.

6.3 NSG Flow Logs and Traffic Analytics​

NSG Flow Logs is a Network Watcher feature that records information about IP traffic passing through an NSG. Each record includes:

  • Source and destination IP
  • Source and destination port
  • Protocol
  • NSG decision (allow/deny)
  • Direction (inbound/outbound)
  • Volume of bytes and packets (version 2)

Flow Logs are sent to a Storage Account. For interactive analysis, you enable Traffic Analytics, which processes flow logs and sends them to Log Analytics Workspace.
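To make the record fields listed above concrete, here is a minimal parser for a single flow tuple. It assumes the comma-separated layout documented for NSG flow logs (timestamp, source IP, destination IP, source port, destination port, protocol, direction, decision, and for version 2 a flow state plus packet and byte counters); the field order should be verified against the current Network Watcher documentation. The class name, helper names, and sample tuples are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

PROTOCOLS = {"T": "TCP", "U": "UDP"}
DIRECTIONS = {"I": "inbound", "O": "outbound"}
DECISIONS = {"A": "allow", "D": "deny"}


@dataclass
class FlowRecord:
    timestamp: int
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    direction: str
    decision: str
    flow_state: Optional[str]
    packets_src_to_dst: Optional[int]
    bytes_src_to_dst: Optional[int]
    packets_dst_to_src: Optional[int]
    bytes_dst_to_src: Optional[int]


def _optional_int(value: str) -> Optional[int]:
    return int(value) if value else None


def parse_flow_tuple(raw: str) -> FlowRecord:
    parts = raw.split(",")
    if len(parts) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(parts)}")
    # Version 1 tuples stop after the decision; pad so indexing stays uniform.
    parts += [""] * (13 - len(parts))
    return FlowRecord(
        timestamp=int(parts[0]),
        src_ip=parts[1],
        dst_ip=parts[2],
        src_port=int(parts[3]),
        dst_port=int(parts[4]),
        protocol=PROTOCOLS.get(parts[5], parts[5]),
        direction=DIRECTIONS.get(parts[6], parts[6]),
        decision=DECISIONS.get(parts[7], parts[7]),
        flow_state=parts[8] or None,
        packets_src_to_dst=_optional_int(parts[9]),
        bytes_src_to_dst=_optional_int(parts[10]),
        packets_dst_to_src=_optional_int(parts[11]),
        bytes_dst_to_src=_optional_int(parts[12]),
    )


# Version 2 tuple with counters, and a version 1 tuple for a denied inbound flow.
ended = parse_flow_tuple("1700000060,10.0.0.4,203.0.113.10,44931,443,T,O,A,E,25,4409,15,3980")
denied = parse_flow_tuple("1700000000,198.51.100.20,10.0.0.4,51234,3389,T,I,D")

print(ended)
print(denied)
```

The version 1 tuple parses with the counters left as `None`, which illustrates why version 2 is needed for byte and packet volumes.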


Traffic Analytics KQL table:

Table | Content
AzureNetworkAnalytics_CL | Flows processed by Traffic Analytics

Example query to find traffic blocked by NSGs:

AzureNetworkAnalytics_CL
| where SubType_s == "FlowLog" and FlowStatus_s == "D"
| summarize BlockedFlows = count() by NSGList_s, DestPort_d
| sort by BlockedFlows desc

6.4 Connection Monitor​

Connection Monitor is a Network Watcher feature that performs continuous connectivity tests between sources and destinations. It replaces legacy Connection Monitor (classic) and Network Performance Monitor features.

Components:

  • Test Group: set of sources and destinations with success criteria.
  • Test Configuration: protocol (TCP, HTTP, ICMP), port, test frequency.
  • Sources: Azure VMs with the Network Watcher Agent extension, Arc-enabled servers with Azure Monitor Agent.
  • Destinations: IP address, FQDN, URL, or Azure resource.

Connection Monitor produces metrics for:

  • Checks Failed %: percentage of tests that failed.
  • Round Trip Time (ms): measured latency.

This data appears in the Connectivity tab of Network Insights.
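The two metrics are simple aggregates over individual check results. The sketch below shows the arithmetic only; how Connection Monitor aggregates internally is not described here, and the `summarize_checks` helper and sample values are hypothetical.

```python
from statistics import mean


def summarize_checks(results):
    """results: list of (succeeded, rtt_ms) pairs; rtt_ms is None for failed checks."""
    if not results:
        raise ValueError("no check results to summarize")
    failed = sum(1 for succeeded, _ in results if not succeeded)
    rtts = [rtt for succeeded, rtt in results if succeeded and rtt is not None]
    return {
        "checks_failed_percent": 100.0 * failed / len(results),
        "avg_round_trip_ms": mean(rtts) if rtts else None,
    }


checks = [(True, 10.0), (True, 14.0), (False, None), (True, 12.0)]
summary = summarize_checks(checks)
print(summary)
```

One failed check out of four gives 25% failed checks, and the round trip time is averaged only over the checks that succeeded.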

6.5 Network Watcher: Diagnostic Tools​

Network Watcher is the underlying service that provides network diagnostic tools in Azure. It's automatically enabled per region when you create or update a virtual network in that region.

Tool | What it does | When to use
IP Flow Verify | Tests if traffic would be allowed or blocked by NSG | VM can't connect on a port
Next Hop | Shows next routing hop for a destination | Diagnose incorrect routing
Effective Routes | Lists all effective routes for a NIC | Understand VM's actual routing
Security Group View | Lists all effective NSG rules for a NIC | Security auditing
Packet Capture | Captures network packets from a VM | Deep protocol diagnosis
Connection Troubleshoot | Point-in-time connectivity test between source and destination | Quick connectivity check

7. Azure Monitor Alerts​

Alerts are Azure Monitor's automated response layer. They monitor conditions and trigger actions when those conditions are met.

7.1 Alert Components​


7.2 Alert Types​

Type | Data source | Latency | Typical use
Metric Alert | Platform Metrics | Seconds | High CPU, full disk, storage latency
Log Alert | Log Analytics (KQL) | 1 to 5 minutes | Error events, missing heartbeat
Activity Log Alert | Activity Log | Minutes | Resource deletion, configuration change
Smart Detection Alert | Application Insights | Automatic | Application anomalies

7.3 Action Groups​

An Action Group defines actions that will be executed when an alert is triggered. The same Action Group can be reused across multiple Alert Rules.

Action types:

  • Notification: Email, SMS, Azure app push, Voice call.
  • Automated action: Azure Function, Logic App, Webhook, Automation Runbook, ITSM.

Action Groups best practices:

  • Create Action Groups by team or severity, not by resource.
  • A "Critical" Action Group can notify multiple channels simultaneously.
  • Action Groups support Rate Limiting to avoid notification floods.

8. Workbooks​

Workbooks are interactive and parameterizable reports within Azure Monitor. VM Insights, Storage Insights, and Network Insights use Workbooks internally to display their visualizations.

You can:

  • Use ready-made Workbooks (templates) provided by Azure.
  • Customize existing Workbooks.
  • Create Workbooks from scratch combining metrics, KQL logs, text, and parameters.

Workbooks support:

  • Time series charts.
  • KQL result tables.
  • Geographic maps.
  • KPI tiles.
  • Interactive filter parameters (subscription, resource group, time range).

9. Implementation Methods​

9.1 Azure Portal​

When to use: initial configuration, dashboard exploration, ad-hoc analysis, interactive Workbooks.

Path to VM Insights: Azure Monitor > Insights > Virtual Machines

Path to Storage Insights: Azure Monitor > Insights > Storage Accounts

Path to Network Insights: Azure Monitor > Insights > Networks

Limitations: manual configuration for each resource, no repeatability.

9.2 Azure CLI​

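Enable Diagnostic Settings via CLI (here for the blob service of a storage account):

# Storage resource logs are emitted per service, so the setting targets blobServices/default.
az monitor diagnostic-settings create \
  --name "MyDiagSettings" \
  --resource "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Storage/storageAccounts/{storageAccountName}/blobServices/default" \
  --workspace "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.OperationalInsights/workspaces/{workspaceName}" \
  --logs '[{"category": "StorageRead", "enabled": true}, {"category": "StorageWrite", "enabled": true}, {"category": "StorageDelete", "enabled": true}]' \
  --metrics '[{"category": "Transaction", "enabled": true}]'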

Create an Alert Rule via CLI:

# Fires when average CPU over a 5-minute window exceeds 90%, evaluated every minute.
az monitor metrics alert create \
  --name "HighCPUAlert" \
  --resource-group "{rg}" \
  --scopes "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Compute/virtualMachines/{vmName}" \
  --condition "avg Percentage CPU > 90" \
  --window-size 5m \
  --evaluation-frequency 1m \
  --action "/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.Insights/actionGroups/{agName}"
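How the window size and evaluation frequency interact is easier to see with numbers. The sketch below evaluates "average over the last 5 one-minute samples > 90" once per minute. It is a simplification under stated assumptions: the real alert engine also tracks fired and resolved state and handles missing data, and the `evaluate_metric_alert` helper and sample values are hypothetical.

```python
def evaluate_metric_alert(samples, threshold=90.0, window=5):
    """samples: per-minute CPU % values, oldest first.

    Returns one (minute_index, average, condition_met) tuple per evaluation,
    starting once a full window of samples is available.
    """
    evaluations = []
    for end in range(window, len(samples) + 1):
        window_values = samples[end - window:end]
        average = sum(window_values) / window
        evaluations.append((end - 1, average, average > threshold))
    return evaluations


cpu_percent = [50, 60, 95, 96, 97, 98, 99, 40]

for minute, average, met in evaluate_metric_alert(cpu_percent):
    print(f"minute {minute}: avg={average:.1f} condition_met={met}")
```

With these samples the condition is met only at minute 6, when all five values in the window are high. Individual spikes at minutes 2 to 5 do not trigger it, which is the reason for averaging over a window instead of alerting on single samples.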

9.3 Azure Policy for scale enablement​

To ensure all VMs in a subscription or management group have AMA installed and VM Insights enabled, use Azure Policy with DeployIfNotExists effect.

Relevant policies available in the library:

  • Configure Windows virtual machines to run Azure Monitor Agent
  • Configure Linux virtual machines to run Azure Monitor Agent
  • Configure VM Insights data collection rule association

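With DeployIfNotExists, new and updated VMs receive the agent and DCR association automatically. VMs that already existed when the policy was assigned are brought into compliance by creating a remediation task.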

9.4 Bicep / ARM for Diagnostic Settings​

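// Assumes storageAccount and logAnalyticsWorkspace are declared elsewhere in the template.
// Storage resource logs are emitted per service, so the setting is scoped to the blob service.
resource blobService 'Microsoft.Storage/storageAccounts/blobServices@2023-01-01' existing = {
  parent: storageAccount
  name: 'default'
}

resource diagnosticSetting 'Microsoft.Insights/diagnosticSettings@2021-05-01-preview' = {
  name: 'MyDiagSettings'
  scope: blobService
  properties: {
    workspaceId: logAnalyticsWorkspace.id
    logs: [
      {
        category: 'StorageRead'
        enabled: true
      }
      {
        category: 'StorageWrite'
        enabled: true
      }
      {
        category: 'StorageDelete'
        enabled: true
      }
    ]
    metrics: [
      {
        category: 'Transaction'
        enabled: true
      }
    ]
  }
}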

10. Control and Security​

Log Analytics Workspace Access:

  • The workspace has independent RBAC from the resources that send data to it.
  • The Log Analytics Reader role allows querying data but not modifying configurations.
  • The Log Analytics Contributor role allows full workspace management.
  • You can configure table-level RBAC to restrict access to specific tables within the workspace.

Sensitive data in logs:

  • Storage logs may contain client IPs, blob names, and access patterns.
  • NSG Flow Logs contain network traffic information, including internal and external IPs.
  • Define appropriate retention policies (minimum necessary) for sensitive data.

Private network access:

  • Use Azure Monitor Private Link Scope (AMPLS) to ensure that resource data is sent to the workspace via private network, without traffic over the public internet.

11. Decision Making​

When to enable each type of collection​

Situation | What to enable | Reason
Alert when VM CPU exceeds 90% | Platform Metrics + Metric Alert | Metrics have seconds latency, ideal for reactive alerts
Diagnose why a VM went offline | VM Insights + Heartbeat table | Heartbeat records agent availability minute by minute
Audit who deleted a blob in Storage | Diagnostic Settings with StorageDelete logs | Resource logs record operations with identity and timestamp
Identify which IPs are being blocked by NSG | NSG Flow Logs + Traffic Analytics | Flow Logs record each NSG decision
Monitor latency between two services | Connection Monitor | Continuous tests with latency and loss history
Understand VM dependencies before migration | VM Insights Map feature | Shows TCP connections and active processes in real time
Monitor availability of multiple storage accounts | Storage Insights | Aggregated view without creating individual dashboards

When to use metrics vs logs for alerts​

Criteria | Metrics | Logs (KQL)
Required latency | Seconds | Minutes
Condition type | Numeric threshold | Complex condition, correlation
Example | CPU > 90% for 5 min | More than 10 login failures in 1 hour
Evaluation cost | Included | Based on volume of data analyzed
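The log-based example in the table ("more than 10 login failures in 1 hour") is a count over a time window, a condition that a single metric threshold cannot express directly. A minimal sketch of that condition follows; the `log_alert_fires` helper and the sample timestamps are hypothetical, and in Azure the equivalent logic would be a KQL query evaluated by a log alert rule.

```python
from datetime import datetime, timedelta, timezone


def log_alert_fires(failure_times, now, window=timedelta(hours=1), threshold=10):
    """Return True when more than `threshold` failures fall inside the window ending at `now`."""
    recent = [t for t in failure_times if now - window < t <= now]
    return len(recent) > threshold


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Eleven failures spread over the last 55 minutes, plus one old failure outside the window.
failures = [now - timedelta(minutes=5 * i) for i in range(11)]
failures.append(now - timedelta(hours=3))

print(log_alert_fires(failures, now))
```

Eleven failures inside the window exceed the threshold of 10, so the condition is met; the three-hour-old failure is ignored.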

12. Best Practices​

Log Analytics Workspace:

  • Use a centralized workspace per environment (prod, dev) instead of one per resource. Facilitates event correlation and reduces management cost.
  • Configure data retention according to compliance requirements. Default is 30 days; many regulations require 90 or 365 days.
  • Monitor data ingestion cost. Use Log Analytics Workspace Insights (yes, there's an Insights for the workspace itself) to identify tables generating more volume.

VM Insights:

  • Deploy AMA via Azure Policy to ensure automatic coverage of new VMs.
  • Use centralized and reusable DCRs, not one per VM.
  • Monitor the Heartbeat table to detect VMs with stopped agent or offline VM.

Storage Insights:

  • Enable resource logs only for storage accounts with sensitive data or audit requirements. Storage logs cost per volume.
  • For general-purpose storage accounts, automatic transaction metrics already cover most availability and latency monitoring cases.

Network Insights:

  • Enable NSG Flow Logs version 2 (not version 1). Version 2 includes bytes and packets data, essential for Traffic Analytics.
  • Configure Traffic Analytics with 10-minute processing interval for more up-to-date visualizations.
  • Use Connection Monitor to monitor critical connectivity paths (VM to database, VM to external endpoint).

Alerts:

  • Avoid creating alerts directly on individual resources. Prefer alerts on resource groups or subscriptions when possible.
  • Configure alert processing rules (formerly action rules) to suppress notifications during planned maintenance windows.
  • Use consistent alert severities (Sev 0 = critical, Sev 4 = informational) and map to different Action Groups.

13. Common Errors​

Not configuring Diagnostic Settings and expecting logs to arrive automatically: Platform Metrics arrive without configuration, but Resource Logs don't. A related mistake on VMs: guest OS data is collected by AMA through a DCR, so a VM with the agent installed but no associated DCR sends nothing to the workspace.

Confusing Log Analytics Workspace with Storage Account as log destination: You can send logs to Storage Account (for cheap archival) or to Log Analytics (for interactive analysis). Logs in Storage don't appear in Network Insights or VM Insights. Logs need to go to Log Analytics Workspace to be queryable via KQL and displayed in Insights.

Installing the legacy agent (MMA) instead of AMA: The Microsoft Monitoring Agent (MMA), also known as the Log Analytics agent, was retired on August 31, 2024. For new deployments, always use Azure Monitor Agent (AMA) with DCRs.

Forgetting to link Connection Monitor to a Log Analytics Workspace: Without this link, Connection Monitor test data is not stored and doesn't appear in Network Insights.

NSG Flow Logs enabled but Traffic Analytics not configured: Flow Logs stay in Storage Account as JSON files, but don't appear in Network Insights Traffic tab. Traffic Analytics needs to be enabled separately and pointed to a Log Analytics Workspace.

Creating an Alert Rule without Action Group: The alert triggers, is visible in the portal, but no one gets notified. Action Group is mandatory for notifications or automated actions.

Misinterpreting "Available Memory": In VM Insights, "Available Memory" is memory available for new allocations. In Linux, low values don't always indicate a problem, as the kernel uses free memory as disk cache (buffer/cache). The relevant metric in Linux is MemAvailable, not MemFree.
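The difference between MemFree and MemAvailable can be seen by parsing /proc/meminfo-style text. The sketch below uses a made-up sample rather than reading the real file, so it runs anywhere; the `parse_meminfo` helper and the numbers are hypothetical.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style lines into a {name: value_in_kB} dictionary."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


# Illustrative sample: most "used" memory is actually reclaimable page cache.
SAMPLE_MEMINFO = """\
MemTotal:        8000000 kB
MemFree:          400000 kB
MemAvailable:    5200000 kB
Buffers:          300000 kB
Cached:          4600000 kB
"""

info = parse_meminfo(SAMPLE_MEMINFO)
free_percent = 100 * info["MemFree"] / info["MemTotal"]
available_percent = 100 * info["MemAvailable"] / info["MemTotal"]

print(f"MemFree: {free_percent:.1f}%  MemAvailable: {available_percent:.1f}%")
```

In this sample only 5% of memory is free, yet 65% is available to new allocations, so an alert keyed to MemFree would fire on a healthy machine.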


14. Operation and Maintenance​

Check monitoring coverage:

  • In VM Insights (Get Started tab), verify which VMs have the agent installed and configured.
  • Use the query Heartbeat | summarize LastHeartbeat = max(TimeGenerated) by Computer to identify VMs without recent heartbeat.

Check data ingestion:

Usage
| where TimeGenerated > ago(24h)
| summarize TotalGB = sum(Quantity) / 1000 by DataType
| sort by TotalGB desc

Important limits:

Resource | Limit
Maximum retention in Log Analytics | 730 days (with Archive up to 12 years)
Metric alerts per subscription | 5,000
Action Groups per subscription | 2,000
NSG Flow Logs retained in Storage | Defined by you (storage cost)
Minimum metric alert evaluation frequency | 1 minute
Minimum log alert evaluation frequency | 1 minute

15. Integration and Automation​

Azure Monitor + Azure Automation:

  • Alerts can trigger Azure Automation Runbooks for automatic remediation actions.
  • Example: Persistently high CPU triggers a Runbook that automatically increases VM SKU.

Azure Monitor + Logic Apps:

  • Action Groups can call Logic Apps for ITSM flows, ticket opening, Teams or Slack notifications.

Azure Monitor + Grafana:

  • Azure Managed Grafana integrates natively with Azure Monitor as data source.
  • Allows creating Grafana dashboards using Azure Monitor metrics and logs without exporting data.

Azure Monitor + Microsoft Sentinel:

  • Sentinel (SIEM/SOAR) consumes data from Log Analytics Workspace.
  • NSG Flow Logs, Activity Logs and Storage Resource Logs can feed Sentinel for threat detection.

Continuous data export:

  • Use Data Export Rules in Log Analytics to continuously export specific tables to a Storage Account or Event Hub.
  • This enables integration with external data pipelines (Azure Data Explorer, third-party SIEM, etc.).

16. Final Summary​

Essential points:

  • Azure Monitor is the central observability platform, collecting metrics, logs and traces.
  • Azure Monitor Insights are pre-built experiences for VMs, Storage and Networks, consuming Azure Monitor data.
  • Metrics have seconds latency and 93-day retention. Logs have minutes latency and configurable retention.
  • VM Insights requires Azure Monitor Agent (AMA) installed on each VM and a Data Collection Rule (DCR) associated with it.
  • Storage Insights uses automatic metrics for transaction and capacity; detailed logs require Diagnostic Settings.
  • Network Insights integrates NSG Flow Logs, Traffic Analytics, Connection Monitor and Network Watcher.
  • NSG Flow Logs record allow/deny decisions per IP flow. Traffic Analytics processes these logs for visualization in Log Analytics.
  • Alerts are composed of Alert Rule (condition) and Action Group (action). An Action Group can be reused across multiple rules.
  • Workbooks are the interactive reporting technology used internally by all Insights.

Critical differences:

  • Platform Metrics arrive automatically; Resource Logs need Diagnostic Settings.
  • AMA replaces MMA; new deployments should always use AMA with DCR.
  • NSG Flow Logs in Storage don't appear in Network Insights; Traffic Analytics is required for that.
  • Log Analytics is for interactive analysis; Storage Account is for cheap archival.
  • Metric Alert has seconds latency; Log Alert has minutes latency.

What needs to be remembered for AZ-104:

  • VM Insights requires AMA installed and DCR associated.
  • Diagnostic Settings must be explicitly configured for resource logs.
  • Traffic Analytics requires NSG Flow Logs enabled AND a Log Analytics Workspace configured.
  • Connection Monitor uses Network Watcher and needs an agent on each source: the Network Watcher Agent extension on Azure VMs, AMA on Arc-enabled servers.
  • Alerts without Action Group trigger but don't notify anyone.
  • Storage Insights shows data from multiple storage accounts in a unified view.
  • The Heartbeat table is the primary way to check agent availability on VMs.