Troubleshooting Lab: Configure Public IP Addresses
Diagnostic Scenarios
Scenario 1: Root Cause
The infrastructure team reports that a web application hosted on an Azure VM stopped responding externally after overnight maintenance. The VM was deallocated and started again as part of a patching window. The NSG associated with the network interface was not changed and still has the same inbound rules for port 443. The administrator checks the VM in the portal and confirms it is in the Running state. The external DNS record was not modified and still points to the IP address that worked before maintenance.
Output of the command executed after the failure:
az network public-ip show \
--resource-group rg-webapp \
--name pip-webapp \
--query "{ip: ipAddress, method: publicIpAllocationMethod, sku: sku.name}" \
--output table
Ip            Method     Sku
------------  ---------  -----
20.85.110.44  Dynamic    Basic
The administrator confirms that the external DNS still points to 20.85.110.67, which was the address before maintenance. The VM was migrated to different hardware during patching. The storage account linked to the VM has LRS redundancy configured.
What is the root cause of the external access failure?
A) The NSG was reset to default state during hardware reallocation and is blocking port 443.
B) The public IP address with Dynamic allocation was changed after VM deallocation, and the external DNS still points to the old IP.
C) Migration to different hardware generated a new network interface, invalidating the link between the public IP and the VM.
D) The Basic SKU does not support restarts on different hardware and needs to be manually recreated after host migrations.
Scenario 2: Action Decision
The cause of the problem has been identified: the Public IP Address associated with the production Standard Load Balancer is configured with Basic SKU. The team noticed this when trying to enable availability zones on the Load Balancer, an operation that failed with the following message:
Code: PublicIPAddressCannotBeUsedWithStandardLoadBalancer
Message: The public IP address 'pip-lb-prod' with sku 'Basic' cannot be
used with a resource that has sku 'Standard'.
The environment is active production with real user traffic. There is no scheduled maintenance window in the next 6 hours. A second Standard SKU public IP address is available in the same resource group, already created and unassociated. The current Basic IP has the address 40.112.88.203, which is registered in the company's external DNS with a TTL of 3600 seconds.
What is the correct action to take at this time?
A) Update the existing IP SKU from Basic to Standard directly through the Azure portal to avoid address change.
B) Immediately replace the Basic IP with the available Standard IP on the Load Balancer, updating the external DNS afterward.
C) Document the problem, register the available Standard IP as an approved solution, and wait for the next maintenance window to perform the swap with controlled impact.
D) Create a new Standard IP with the same address 40.112.88.203 to avoid DNS updates before performing the swap.
Scenario 3: Root Cause
An administrator receives an alert that a production VM is inaccessible via SSH. He checks the portal and observes that the VM is in Running state. The subnet NSG allows SSH on port 22 from any source. The administrator tries to connect directly through the public IP and receives a timeout. No NSG changes were made in the last 24 hours.
The administrator executes the following command to check the public IP:
az network public-ip show \
--resource-group rg-infra \
--name pip-vm-linux \
--query "{ip: ipAddress, state: provisioningState, sku: sku.name}" \
--output table
Ip    State      Sku
----  ---------  --------
None  Succeeded  Standard
The VM was created three weeks ago and worked normally. The VM's OS disk was expanded from 64 GB to 128 GB yesterday afternoon. The administrator confirms that the public IP pip-vm-linux appears as an existing resource in the resource group and its provisioningState is Succeeded.
What is the root cause of SSH inaccessibility?
A) OS disk expansion caused a forced reboot that corrupted the VM's internal network configuration.
B) Standard SKU blocks SSH by default and an NSG needs to be associated directly with the network interface, not just the subnet.
C) The Public IP Address exists as a resource but is not associated with any network interface or active resource, resulting in a null ipAddress field.
D) The provisioningState Succeeded indicates the IP was successfully deprovisioned and needs to be recreated.
Scenario 4: Diagnostic Sequence
An administrator receives the following report: "We created a new Standard Public IP Address and tried to associate it with an existing VM, but the VM still does not respond externally."
The available investigation steps are:
- Step P: Verify if the public IP is actually associated with the correct VM network interface via az network nic ip-config show
- Step Q: Confirm if there is an NSG associated with the network interface or subnet that allows the desired traffic
- Step R: Check the ipAddress field of the Public IP resource to confirm if an address was assigned
- Step S: Test external connectivity to the assigned IP from a source outside the Azure network
- Step T: Confirm if the VM is in Running state and without health alerts in the portal
Which diagnostic sequence follows the correct logical progression, from control plane to data plane?
A) T -> P -> R -> Q -> S
B) S -> R -> P -> Q -> T
C) R -> T -> S -> P -> Q
D) Q -> P -> T -> R -> S
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
The definitive clue is in the command output: the IP is configured with Method: Dynamic and Sku: Basic. With Dynamic allocation, Azure releases the IP address when the VM is deallocated and assigns a new one when restarted. The external DNS still pointed to the previous address 20.85.110.67, while the current IP became 20.85.110.44.
The information about the storage account with LRS redundancy is intentionally irrelevant and has no relation to public IP addressing behavior.
Distractor A is dangerous because an altered NSG is a reasonable hypothesis in post-maintenance failures, but the statement explicitly says the NSG was not changed. Distractor C is plausible for those unfamiliar with deallocation behavior, but the network interface is not recreated during host migrations. Distractor D is false: Basic SKU has no recreation restriction after hardware migration.
Acting based on distractor A would lead the administrator to modify NSG rules that are correct, wasting time and potentially opening unnecessary security gaps.
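The comparison that exposes this failure (the address DNS serves versus the address Azure currently reports) can be sketched as a small shell helper. This is a minimal illustration, not an official procedure: check_ip_drift is a hypothetical function name, and app.example.com in the comment is a placeholder hostname that does not appear in the scenario.

```shell
# Sketch: detect a stale DNS record after a Dynamic public IP changed.
# check_ip_drift is a hypothetical helper, not an Azure CLI command.
check_ip_drift() {
  dns_ip="$1"     # address the external DNS record resolves to
  azure_ip="$2"   # address Azure currently reports for the public IP
  if [ "$dns_ip" = "$azure_ip" ]; then
    echo "match"
  else
    echo "drift: DNS=$dns_ip Azure=$azure_ip"
  fi
}

# Example wiring (assumes dig and a logged-in Azure CLI; hostname is a placeholder):
#   check_ip_drift "$(dig +short app.example.com | tail -n1)" \
#     "$(az network public-ip show --resource-group rg-webapp \
#        --name pip-webapp --query ipAddress --output tsv)"
```

With the values from the scenario, check_ip_drift 20.85.110.67 20.85.110.44 reports a drift. One common way to prevent a repeat is to switch the address to Static allocation, for example with az network public-ip update --resource-group rg-webapp --name pip-webapp --allocation-method Static, so that deallocation no longer releases it.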
Answer Key: Scenario 2
Answer: C
The critical constraint of the scenario is that the environment is in active production and there is no maintenance window in the next 6 hours. Replacing the IP in production without a window implies service interruption, as the IP address will change and the external DNS has a TTL of 3600 seconds, meaning up to 1 hour of unavailability for clients with the cached record.
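The TTL arithmetic behind that estimate is simple enough to sketch. The helper below is a hypothetical illustration (stale_window_minutes is not a standard tool) and assumes resolvers honor the record's TTL; real clients may cache for shorter or longer periods.

```shell
# Sketch: worst-case minutes that resolvers may keep serving the old address
# after the DNS record is updated, assuming they honor the TTL.
stale_window_minutes() {
  ttl_seconds="$1"
  # Round up so that a partial minute still counts as exposure.
  echo $(( (ttl_seconds + 59) / 60 ))
}
```

For the scenario's TTL, stale_window_minutes 3600 yields 60, which is the one-hour exposure that makes an unplanned swap risky. Lowering the TTL ahead of the maintenance window is a common way to shrink this figure before the address changes.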
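Distractor A is the most dangerous because it looks like a zero-impact fix. Azure does document an upgrade path from Basic to Standard that keeps the address, but it has prerequisites: the IP must use static allocation and generally must be disassociated from the resource using it before the SKU can be changed. Against a live Load Balancer this is not a simple portal toggle, and attempting it outside a maintenance window risks the same interruption the scenario is trying to avoid.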
Distractor B represents the correct action at the technically appropriate time (maintenance window), but ignores the active production constraint stated in the scenario. Distractor D is unfeasible because it's not possible to create a new IP with a specific address: Azure assigns addresses from its pool and doesn't allow reserving an IP previously used by another resource.
The correct action preserves production environment stability and plans resolution with controlled impact.
Answer Key: Scenario 3
Answer: C
The central clue is in the Ip: None field in the command output. A Standard SKU Public IP Address can exist as a valid resource (provisioningState Succeeded) without being associated with any network interface. When not associated, no IP address is assigned to the ipAddress field, making the resource unusable for external communication.
The information about OS disk expansion is intentionally irrelevant. Resizing a disk does not change the VM's network configuration or the association between a public IP and a network interface.
Distractor B contains a true element (Standard SKU is secure by default and requires NSG), but this isn't the root cause here: the problem is earlier, the IP isn't even associated with the VM. Distractor D represents a misreading of the provisioningState field: Succeeded means the resource was created successfully, not that it was removed. Distractor A would attribute the cause to an irrelevant event intentionally included in the scenario.
The main reasoning error of the distractors is focusing on recent events (disk expansion, NSG behavior) instead of checking the current state of the IP resource association.
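Checking the current association state can be made explicit by reading two properties of the public IP resource, ipAddress and ipConfiguration.id, and classifying the result. The sketch below is a hypothetical helper (classify_public_ip is an illustrative name) and assumes that a null value may arrive either as an empty string or as the literal None, depending on how the CLI output was captured.

```shell
# Sketch: classify a public IP from two values read off the resource.
# Assumed inputs (fetched separately, not run here):
#   az network public-ip show --resource-group rg-infra --name pip-vm-linux \
#     --query ipAddress --output tsv
#   az network public-ip show --resource-group rg-infra --name pip-vm-linux \
#     --query ipConfiguration.id --output tsv
classify_public_ip() {
  ip_address="$1"     # value of ipAddress
  ip_config_id="$2"   # value of ipConfiguration.id (empty when nothing is attached)
  if [ -z "$ip_config_id" ] || [ "$ip_config_id" = "None" ]; then
    echo "unassociated"
  elif [ -z "$ip_address" ] || [ "$ip_address" = "None" ]; then
    echo "associated-no-address"
  else
    echo "associated:$ip_address"
  fi
}
```

Checking ipConfiguration.id first keeps the association question separate from the address question, so an unattached resource is reported as unassociated regardless of what the ipAddress field shows.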
Answer Key: Scenario 4
Answer: A
The correct sequence is T -> P -> R -> Q -> S, which follows the progression from control plane to data plane:
| Order | Step | Logic |
|---|---|---|
| 1 | T | Confirm the VM is operational before investigating network |
| 2 | P | Verify if the association between IP and NIC was done correctly |
| 3 | R | Confirm if the public IP has an assigned address |
| 4 | Q | Check if NSG allows the desired traffic |
| 5 | S | Test real connectivity only after validating all prerequisites |
Starting with external connectivity testing (Step S), as proposed by distractor B, is the most common error: testing before validating configuration generates negative results that don't indicate where the problem is.
Distractor D starts with NSG verification (Q), which is premature: if the IP isn't associated with the NIC, the NSG is irrelevant for immediate diagnosis. The correct sequence ensures each verification only makes sense after the previous one confirms the expected state.
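The "stop at the first failed prerequisite" logic of the sequence can be sketched as a short shell function. This is only an illustration: first_failing_step is a hypothetical helper, and the STEP=ok / STEP=fail arguments stand in for the results an operator would gather from each check in the order T, P, R, Q, S.

```shell
# Sketch: walk the check results in the order given and report the first
# step that did not pass. Each argument has the form STEP=ok or STEP=fail.
first_failing_step() {
  for result in "$@"; do
    step="${result%%=*}"     # text before the first "="
    status="${result#*=}"    # text after the first "="
    if [ "$status" != "ok" ]; then
      echo "$step"
      return 0
    fi
  done
  echo "none"
}
```

For example, first_failing_step T=ok P=ok R=fail Q=ok S=fail prints R: the failed connectivity test in S is a consequence, and the missing address found in R is where investigation should focus.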
Troubleshooting Tree: Configure Public IP Addresses
Legend:
- Dark blue: initial symptom, entry point to the tree
- Blue: diagnostic question, answered with yes or no
- Red: identified cause and associated corrective action
- Green: validation or confirmed resolution
To use this tree when facing a real problem, always start at the root node and answer each question based on what is directly observable: VM state, az network command output, DNS records, and connectivity tests. Follow the branches without skipping steps. Each red node indicates both the cause and corrective action, ensuring diagnosis and resolution work together.