Troubleshooting Lab: Configure Azure DNS
Diagnostic Scenarios
Scenario 1: Root Cause
The operations team reports that VMs in a VNet called vnet-app-eastus cannot resolve names of other VMs in the same VNet using the suffix app.internal. The private DNS zone app.internal exists in the subscription and contains correct A records for all relevant VMs. The team reports that the zone was created three days ago and worked normally for two days before stopping.
Initial investigation reveals the following information:
Private zone: app.internal
A records present: vm-web, vm-api, vm-db (all correct)
VNet: vnet-app-eastus (region: East US)
Number of VMs in VNet: 14
Subnet NSG: no rules blocking port 53
When checking the private zone's VNet links, the administrator finds:
Virtual network links:
Name: link-vnet-app
VNet: vnet-app-eastus
Status: Unknown
Auto-registration: Enabled
Registration VNets count: 1
The link was recreated yesterday by the network team, who also changed the VNet address prefix from 10.1.0.0/16 to 10.2.0.0/16 during a maintenance window.
What is the root cause of the name resolution failure?
A) The subnet NSG is blocking DNS queries on port 53, despite not appearing explicitly in the listed rules due to VNet-level rule inheritance.
B) The VNet link has Unknown status because it was recently recreated and is still in the provisioning process; until it reaches Succeeded, resolution via private zone does not work.
C) With auto-registration enabled and 14 VMs in the VNet, the private zone has reached the automatic registration limit, causing silent resolution failure.
D) The VNet address prefix change during maintenance corrupted existing A records in the private zone, which now point to IPs outside the new range.
Scenario 2: Action Decision
The cause of the problem has been identified: the Azure Traffic Manager profile receiving traffic for contoso.com was configured as a target for a CNAME record at the zone apex (@) hosted in Azure DNS. The operation was accepted via a third-party tool that did not validate the apex restriction, resulting in an invalid state in the zone that causes resolution failure for the root domain.
The environment has the following constraints:
- The domain contoso.com is used in production with high access volume
- The team does not have permission to directly modify records via the Azure portal; changes require approval in a change management process with a 4-hour SLA
- There is an alias record configured for www.contoso.com pointing to the same Traffic Manager profile, and this record is working correctly
- The security team requires that any changes to public DNS zones be logged with justification before being applied
What is the correct action to take at this moment?
A) Immediately remove the apex CNAME record via Azure CLI using emergency credentials, without waiting for the change management process, as the production impact justifies the exception.
B) Replace the invalid apex CNAME with an alias record at @ that replicates the working www.contoso.com configuration, and submit the change request immediately through the existing approval process so that the 4-hour SLA is respected.
C) Create a new DNS zone for contoso.com in parallel, migrate all records to it, and update the external registrar, avoiding the need to modify the problematic zone.
D) Wait for automatic resolution of the invalid state, as Azure DNS has self-correction mechanisms that detect CNAME records at apex and convert them to the correct format after a synchronization cycle.
Scenario 3: Root Cause
A developer reports that the URL api.dev.contoso.com does not resolve from their local machine outside the Azure environment. The infrastructure team states that the entry was created correctly. The administrator collects the following information during the investigation:
# Executed on the developer's machine (outside Azure)
$ nslookup api.dev.contoso.com 8.8.8.8
Server: 8.8.8.8
Address: 8.8.8.8#53
** server can't find api.dev.contoso.com: NXDOMAIN
# Executed on a VM inside the vnet-dev VNet in Azure
$ nslookup api.dev.contoso.com
Server: 168.63.129.16
Address: 168.63.129.16#53
Name: api.dev.contoso.com
Address: 10.10.5.20
The administrator checks existing zones in Azure DNS:
Zones found in subscription:
contoso.com (public)
dev.contoso.com (private, linked to vnet-dev)
The public zone contoso.com has the following records:
www CNAME contoso.azurewebsites.net
@ A 20.10.50.100
The team mentions that the TLS certificate for api.dev.contoso.com was renewed last week without issues.
What is the root cause of the failure observed on the developer's machine?
A) The public resolver 8.8.8.8 is blocked by a corporate firewall policy, preventing external queries from reaching the public zone's authoritative DNS.
B) The zone dev.contoso.com was created as a private zone, so its records are not visible to external resolvers; since there are no records for dev.contoso.com in the public zone, the domain does not resolve outside Azure.
C) The absence of an NS record for dev in the public zone contoso.com prevents delegation, but records exist and resolve correctly; the problem is negative caching in the developer's resolver.
D) The A record for api.dev.contoso.com points to a private IP (10.10.5.20), which is not internet-routable, causing resolution to return NXDOMAIN on external resolvers.
Scenario 4: Diagnostic Sequence
A team receives the following report: "We created a new private DNS zone and records that should be resolved within the VNet are not working."
The administrator has the following investigation steps available:
1. Verify that the VM originating the query is using 168.63.129.16 as DNS server
2. Confirm that the specific record exists in the private zone with correct name and type
3. Verify that a VNet link exists between the private zone and the VM's VNet, and that its status is Succeeded
4. Test name resolution from a VM inside the VNet using nslookup or dig
5. Confirm that the private zone was created in the same subscription and that there is no typo in the zone name
What is the correct diagnostic sequence?
A) 1 → 2 → 3 → 5 → 4
B) 4 → 3 → 5 → 2 → 1
C) 5 → 3 → 2 → 1 → 4
D) 2 → 1 → 5 → 4 → 3
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
The decisive clue is the Status: Unknown field of the VNet link. A VNet link to a private DNS zone must reach Succeeded status before the zone can be used for resolution in that VNet. An Unknown status indicates that provisioning has not completed or has failed silently; in either case, resolution through the private zone stops working.
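The rule can be expressed as a small filter. This is a minimal sketch, not an Azure API call: the dictionary keys (`name`, `vnet`, `status`, `auto_registration`) are illustrative names chosen to mirror the link details shown in the scenario, and the helper `usable_links` is hypothetical.

```python
from typing import Iterable


def usable_links(links: Iterable[dict], vnet_name: str) -> list[str]:
    """Return the names of links to `vnet_name` whose status is Succeeded."""
    return [
        link["name"]
        for link in links
        if link["vnet"] == vnet_name and link["status"] == "Succeeded"
    ]


# The link as listed in Scenario 1 (key names are assumptions).
scenario_links = [
    {
        "name": "link-vnet-app",
        "vnet": "vnet-app-eastus",
        "status": "Unknown",
        "auto_registration": True,
    },
]

# No link is usable, so the VNet cannot resolve names in app.internal.
print(usable_links(scenario_links, "vnet-app-eastus"))  # prints []
```

An empty result for the VM's VNet points at the link layer before any record needs to be inspected.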
The information about changing the VNet address prefix is intentional noise in the scenario. Modifying a VNet's address space does not affect DNS records in a private zone; A records contain fixed IPs that were created manually or via auto-registration and are not automatically changed by VNet range modifications.
The most dangerous distractor is D, which ties the prefix change to the DNS problem and steers the administrator toward the A records when the real problem is at the link layer. Acting on D would mean auditing the zone's records without addressing the actual cause. Distractor C is also superficially plausible, but 14 VMs are nowhere near any auto-registration limit: a private DNS zone supports thousands of record sets.
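If distractor D still needs to be ruled out, a manual audit is not required: comparing each A record against the new prefix takes a few lines. The sketch below uses Python's standard `ipaddress` module; the record IPs are hypothetical, since the scenario only states that the records are correct, and `records_outside_prefix` is an illustrative helper name.

```python
import ipaddress


def records_outside_prefix(records: dict[str, str], prefix: str) -> dict[str, str]:
    """Return the A records whose IP does not fall inside `prefix`."""
    network = ipaddress.ip_network(prefix)
    return {
        name: ip
        for name, ip in records.items()
        if ipaddress.ip_address(ip) not in network
    }


# Hypothetical IPs for the three records named in the scenario.
a_records = {"vm-web": "10.2.0.4", "vm-api": "10.2.0.5", "vm-db": "10.2.0.6"}

# An empty result means no record points outside the new address space.
print(records_outside_prefix(a_records, "10.2.0.0/16"))  # prints {}
```

Under these assumed values the check comes back empty, which is consistent with the scenario's statement that the records are correct and leaves the link status as the remaining suspect.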
Answer Key: Scenario 2
Answer: B
The critical constraint in the scenario is the change management process with a 4-hour SLA. The correct action is to immediately initiate the change request with the technical solution already identified (replace the invalid CNAME with an alias record pointing to Traffic Manager, replicating the pattern that already works on www) to respect the SLA without bypassing the process.
Alternative A represents the classic "the end justifies the means" trap: production impact creates real urgency, but using emergency credentials to circumvent change management without approval violates security controls and may have disciplinary and audit consequences more severe than the outage itself.
Alternative C is technically valid in other contexts, but creating a new public zone and updating the external registrar implies unpredictable DNS propagation time (up to 48 hours with some registrars), which would worsen the production impact. Alternative D describes functionality that does not exist; Azure DNS does not have self-correction of invalid records.
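The apex restriction that the third-party tool skipped is simple to state: the zone apex always carries SOA and NS records, and a CNAME cannot coexist with other records at the same name. A minimal sketch of such a pre-flight check follows; `validate_record_set` is a hypothetical helper, not part of any Azure SDK or CLI.

```python
def validate_record_set(name: str, record_type: str) -> None:
    """Raise ValueError if the record set is a CNAME at the zone apex."""
    at_apex = name.strip() in ("@", "")
    if at_apex and record_type.upper() == "CNAME":
        raise ValueError(
            "CNAME is not allowed at the zone apex; "
            "use an alias record set at @ instead"
        )


# A CNAME on a subdomain label is fine.
validate_record_set("www", "CNAME")

# The configuration from Scenario 2 is rejected before it reaches the zone.
try:
    validate_record_set("@", "CNAME")
except ValueError as error:
    print(f"rejected: {error}")
```

Running a check like this before submitting the change request also gives the change record a concrete technical justification.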
Answer Key: Scenario 3
Answer: B
The diagnosis is straightforward when observing the contrast between the two nslookup results: resolution works perfectly within the VNet using Azure's internal resolver (168.63.129.16) and fails completely for external resolvers. This behavior is the exact signature of a private DNS zone: it is visible only to VNets linked to it and is completely invisible to the internet.
The root cause is that dev.contoso.com was created as a private zone, not public. For api.dev.contoso.com to be externally resolvable, it would be necessary to create a public zone dev.contoso.com with appropriate records and add NS delegation records in the public zone contoso.com.
The information about TLS certificate renewal is intentional noise: it is irrelevant to DNS resolution diagnosis and was included to divert attention to the TLS layer.
Distractor D represents an important conceptual error: NXDOMAIN means the name does not exist from the resolver's perspective, not that the returned IP is unreachable. If the external resolver managed to return IP 10.10.5.20, the error would be connectivity, not NXDOMAIN. The NXDOMAIN result itself confirms that the resolver never found the zone.
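The contrast between the two lookups can be checked mechanically. The sketch below parses the two nslookup outputs from the scenario and flags the private-zone signature (an answer inside the VNet, NXDOMAIN outside). The function names are illustrative, and the parser assumes the output layout shown in the scenario, which may differ across nslookup versions.

```python
import ipaddress


def parse_nslookup(output: str) -> dict:
    """Extract the NXDOMAIN flag and answer addresses from nslookup text."""
    answers = []
    seen_name = False
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if line.startswith("Name:"):
            seen_name = True
        elif line.startswith("Address:") and seen_name:
            # Only addresses after a "Name:" line are answers;
            # the first "Address:" line belongs to the resolver itself.
            answers.append(line.split(":", 1)[1].strip())
    return {"nxdomain": "NXDOMAIN" in output, "answers": answers}


def looks_like_private_zone(internal: str, external: str) -> bool:
    """True when the name resolves inside the VNet but not outside."""
    inside = parse_nslookup(internal)
    outside = parse_nslookup(external)
    return bool(inside["answers"]) and outside["nxdomain"]


external_output = """Server: 8.8.8.8
Address: 8.8.8.8#53
** server can't find api.dev.contoso.com: NXDOMAIN"""

internal_output = """Server: 168.63.129.16
Address: 168.63.129.16#53
Name: api.dev.contoso.com
Address: 10.10.5.20"""

print(looks_like_private_zone(internal_output, external_output))  # prints True

# The private IP is a routing property of the answer, separate from NXDOMAIN.
internal_ip = parse_nslookup(internal_output)["answers"][0]
print(ipaddress.ip_address(internal_ip).is_private)  # prints True
```

Keeping the two questions separate (did the name resolve, and is the returned address reachable) is what exposes the error in distractor D.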
Answer Key: Scenario 4
Answer: C
The correct sequence is 5 → 3 → 2 → 1 → 4, which follows progressive diagnostic logic from most structural to most operational:
- Step 5: Confirming the zone exists with the correct name is the absolute prerequisite. A zone with a typo or in the wrong subscription invalidates all following steps.
- Step 3: Checking the VNet link and its status is the second step, as without a functional link with Succeeded status, the zone is inaccessible to the VNet, regardless of records.
- Step 2: Confirming the specific record exists with correct name and type eliminates configuration problems within an already functional zone.
- Step 1: Checking the VM's DNS server confirms it uses Azure's resolver (168.63.129.16), without which the private zone would never be queried.
- Step 4: The practical test with nslookup or dig is the last step, as it validates the result after all structural layers have been verified.
The central error of incorrect sequences is starting with the practical test (step 4) before checking infrastructure, which can create confusion: a negative test does not indicate which layer the problem is in if structural checks were not done first.
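The same ordering can be written as a short-circuiting checklist that reports the first layer that fails. This is a sketch only: the check results are supplied by hand as booleans, and `first_failure` and `DIAGNOSTIC_ORDER` are illustrative names rather than part of any tool.

```python
from typing import Optional

# Steps keep their original numbers but run in the order 5, 3, 2, 1, 4.
DIAGNOSTIC_ORDER = [
    (5, "zone exists with the correct name in the expected subscription"),
    (3, "VNet link exists and its status is Succeeded"),
    (2, "record exists with the correct name and type"),
    (1, "VM uses 168.63.129.16 as its DNS server"),
    (4, "nslookup or dig from a VM inside the VNet resolves the name"),
]


def first_failure(results: dict[int, bool]) -> Optional[tuple[int, str]]:
    """Return the first failing step in diagnostic order, or None if all pass."""
    for step, description in DIAGNOSTIC_ORDER:
        if not results[step]:
            return step, description
    return None


# Example: the zone is fine, but the link is broken, so the test also fails.
observed = {5: True, 3: False, 2: True, 1: True, 4: False}
print(first_failure(observed))
```

With these example results the function reports step 3, not step 4: the failed lookup is a symptom, and the ordering surfaces the structural cause behind it.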
Troubleshooting Tree: Configure Azure DNS
Color legend:
| Color | Node type |
|---|---|
| Dark blue | Initial symptom (entry point) |
| Blue | Diagnostic question |
| Red | Identified cause |
| Green | Recommended action or resolution |
| Orange | Validation or intermediate verification |
To use this tree when facing a real problem, start with the root node by identifying whether the resolution failure occurs inside or outside Azure. The first bifurcation separates two completely distinct diagnostic universes: private zone problems (visible only within VNets) and public zone problems (visible on the internet). Follow the closed questions in each node by answering what you observe in the environment, without skipping steps. Whenever you reach an orange validation node, execute the indicated command before advancing, as it confirms or rules out the current hypothesis. The path ends when you reach a red cause followed by a green action.