Troubleshooting Lab: Create private endpoints
Diagnostic Scenarios
Scenario 1: Root Cause
An operations team reports that an application hosted in the snet-app (10.1.0.0/24) subnet cannot connect to an Azure SQL Database through its private endpoint. According to the team, the environment was provisioned two weeks ago and worked normally until yesterday.
The administrator executes the following commands from a VM in the snet-app subnet:
```
# DNS resolution test
nslookup sqlprod.database.windows.net

Server:  168.63.129.16
Address: 168.63.129.16

Non-authoritative answer:
Name:    sqlprod.privatelink.database.windows.net
Address: 13.91.44.120

# TCP connectivity test
Test-NetConnection -ComputerName sqlprod.database.windows.net -Port 1433

ComputerName     : sqlprod.database.windows.net
RemoteAddress    : 13.91.44.120
RemotePort       : 1433
InterfaceAlias   : Ethernet
SourceAddress    : 10.1.0.5
TcpTestSucceeded : False
```
The administrator confirms that the private endpoint pe-sqlprod exists, has Succeeded status, and has private IP 10.2.1.8 allocated in the snet-data subnet. The Azure SQL logical server has its firewall enabled and public network access denied. The Network Security Group of the snet-app subnet allows outbound traffic to any destination on port 1433.
What is the root cause of the connectivity failure?
A) The NSG of the snet-app subnet is blocking outbound traffic to the snet-data subnet.
B) The Private DNS Zone privatelink.database.windows.net is not linked to the virtual network containing the snet-app subnet, causing DNS to resolve to the service's public IP.
C) The SQL Database firewall is blocking connections originating from IP 10.1.0.5.
D) The private endpoint was provisioned in a different subnet from the application, which prevents connectivity between the two subnets.
Scenario 2: Root Cause
A company provisioned an Azure Private Link Service to expose an internal service to external partners. The service sits behind an internal Standard Load Balancer in the snet-provider (10.0.1.0/24) subnet. Partners created private endpoints in their own virtual networks, and administrators confirm that the connection status shows as Approved in the Azure portal.
Despite this, partner A reports that HTTP requests sent to the private endpoint IP (192.168.10.4) time out after 30 seconds. Partner B, in another subscription, reports the same behavior. The provider team verifies that the health probe of the internal Load Balancer shows the backends as healthy.
The provider administrator queries the subnet configuration and gets the following output:
```json
{
  "name": "snet-provider",
  "addressPrefix": "10.0.1.0/24",
  "privateLinkServiceNetworkPolicies": "Enabled",
  "privateEndpointNetworkPolicies": "Disabled",
  "networkSecurityGroup": null
}
```
The network team reports that the snet-provider subnet was created three months ago and that no configuration has changed since the Private Link Service was created.
What is the root cause of the observed behavior?
A) The absence of a Network Security Group in the snet-provider subnet prevents the Private Link Service from forwarding traffic from partners.
B) The Approved connection status indicates only administrative approval; traffic only flows after manual confirmation in the partner portal.
C) The privateLinkServiceNetworkPolicies property is enabled in the provider subnet, which blocks the correct functioning of the Private Link Service.
D) The two partners are in different subscriptions, and the Private Link Service requires peering between subscriptions to forward traffic.
Scenario 3: Action Decision
The cause has been identified: the Private DNS Zone privatelink.blob.core.windows.net is linked only to the vnet-hub virtual network, but the application that needs private access runs in vnet-spoke-prod. The two virtual networks are peered and the peering is functional. The environment is a high-availability production system, and the current impact is that file uploads fail silently: the application resolves the public IP, and the Storage firewall rejects the connection.
The responsible architect has permission to modify DNS and virtual network configurations, but cannot restart the application or make changes that require maintenance windows. The security team requires that any change be reversible in less than five minutes if it causes additional impact.
What is the correct action to take at this moment?
A) Create a new Private DNS Zone privatelink.blob.core.windows.net exclusive for vnet-spoke-prod and migrate the existing A records to the new zone.
B) Add a Virtual Network Link from the existing Private DNS Zone privatelink.blob.core.windows.net to the vnet-spoke-prod virtual network.
C) Configure a custom DNS server in vnet-spoke-prod with a forwarder pointing to IP 168.63.129.16 and restart the DNS service of the spoke VMs.
D) Enable the DNS proxy attribute in the Azure Firewall of vnet-hub and create a DNS rule to forward queries from vnet-spoke-prod to the private zone.
Scenario 4: Diagnostic Sequence
An administrator receives the following alert at 2:32 PM:
"Reports application cannot read data from Azure Storage Account
stprodreports. Error reported: connection timeout."
The environment has a private endpoint pe-storage-reports configured for the blob sub-resource. The application runs on VMs in the snet-reports (10.3.0.0/24) subnet. Recently, public access to Storage was enabled for an external audit, and the security team requested it be disabled immediately after the audit ended.
The available investigation steps are:
1. Verify if the private endpoint pe-storage-reports has Succeeded status and if the private IP is allocated.
2. Execute nslookup stprodreports.blob.core.windows.net from a VM in snet-reports to verify which IP is being resolved.
3. Verify if the Storage Account firewall is configured to deny public access and if selected virtual networks or private endpoints are allowed.
4. Query the NSG logs associated with the snet-reports subnet to verify if outbound traffic on port 443 is being blocked.
5. Confirm if the Private DNS Zone privatelink.blob.core.windows.net has an A record pointing to the endpoint's private IP.
What is the correct progressive diagnostic sequence for this scenario?
A) 1 → 3 → 2 → 5 → 4
B) 2 → 5 → 1 → 3 → 4
C) 3 → 1 → 4 → 2 → 5
D) 4 → 2 → 1 → 5 → 3
Answer Key and Explanations
Answer Key: Scenario 1
Answer: B
Explanation:
- The nslookup output is the decisive clue: DNS is returning IP 13.91.44.120, which is a public address. This means name resolution is not being answered by the Private DNS Zone, which would return the private IP 10.2.1.8. The "was working and stopped" symptom may be the result of the DNS zone being accidentally unlinked or of a change to the DNS server configured on the VNet.
- The irrelevant information in this scenario is the fact that the snet-app NSG allows outbound traffic on port 1433. The statement is true, but it is not the cause of the problem: the traffic never heads toward the private endpoint in the first place, because DNS has already directed the connection to the public IP.
- Alternative A is a classic distractor: the administrator sees "connection failure" and immediately suspects the NSG, without first checking where the traffic is being directed. Alternative D represents a conceptual mistake: private endpoints can reside in any subnet of the same VNet or of peered VNets; the endpoint subnet doesn't need to be the same as the application's. The most dangerous distractor is C: acting on the SQL firewall without verifying the real cause would delay resolution and could expose the environment to unnecessary reconfigurations.
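The DNS check described above can be expressed as a small classification step. What follows is a minimal Python sketch, not part of the original lab: the function name `classify_resolution` and its return labels are hypothetical, and it relies only on the standard `ipaddress` module to compare the address returned by nslookup with the endpoint's private IP.

```python
import ipaddress


def classify_resolution(resolved_ip: str, endpoint_private_ip: str) -> str:
    """Classify the address a client resolved for a private-endpoint FQDN.

    Hypothetical helper for illustration: it only compares two addresses
    and does not query DNS or Azure.
    """
    resolved = ipaddress.ip_address(resolved_ip)
    expected = ipaddress.ip_address(endpoint_private_ip)

    if resolved == expected:
        # DNS is fine; the next suspects are the network path and the resource.
        return "private-endpoint"
    if resolved.is_private:
        # A private address, but not the one allocated to the endpoint.
        return "unexpected-private"
    # A public address: the Private DNS Zone is not answering the query.
    return "public"


# Values taken from Scenario 1.
print(classify_resolution("13.91.44.120", "10.2.1.8"))
print(classify_resolution("10.2.1.8", "10.2.1.8"))
```

With the Scenario 1 values, the first call falls into the public branch, which is why the DNS zone link is examined before the NSG or the SQL firewall.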
Answer Key: Scenario 2
Answer: C
Explanation:
- The privateLinkServiceNetworkPolicies property controls whether network policies (system routes and NSGs) are applied to traffic directed to the Private Link Service within the subnet. When this property is Enabled, Azure applies policies that interfere with the internal routing of the Private Link Service, preventing consumer traffic from reaching the Load Balancer correctly. For the Private Link Service to function, this property must be set to Disabled in the provider subnet.
- The irrelevant information is the fact that "no configuration has been changed since creation". This statement may lead the reader to rule out environment configuration as the cause, but the problem lies precisely in how the subnet was originally created: the property was incorrect from the beginning.
- Alternative A can be dismissed from the statement itself: the absence of an NSG doesn't prevent the Private Link Service from working. Alternative B represents a misunderstanding of the approval flow: Approved status means the network connection is established and traffic can flow, not that additional confirmation is needed. Alternative D is conceptually wrong: the Private Link Service exists precisely to allow consumption across different subscriptions without peering.
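The subnet check from this explanation can be sketched in a few lines of Python. This is an illustrative example only: the helper name `pls_policy_findings` is hypothetical, and the JSON is the output shown in Scenario 2, pasted in as a string rather than fetched from Azure.

```python
import json

# Subnet configuration as shown in Scenario 2.
SUBNET_JSON = """
{
  "name": "snet-provider",
  "addressPrefix": "10.0.1.0/24",
  "privateLinkServiceNetworkPolicies": "Enabled",
  "privateEndpointNetworkPolicies": "Disabled",
  "networkSecurityGroup": null
}
"""


def pls_policy_findings(subnet: dict) -> list:
    """Return human-readable findings for a Private Link Service provider subnet.

    Hypothetical helper: it inspects only the one property discussed above.
    """
    findings = []
    policy = subnet.get("privateLinkServiceNetworkPolicies")
    if policy != "Disabled":
        findings.append(
            f"{subnet.get('name', '<unnamed>')}: "
            f"privateLinkServiceNetworkPolicies is {policy!r}, expected 'Disabled'"
        )
    return findings


for finding in pls_policy_findings(json.loads(SUBNET_JSON)):
    print(finding)
```

Note that the missing NSG is deliberately not flagged, in line with the reasoning that dismisses alternative A.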
Answer Key: Scenario 3
Answer: B
Explanation:
- Adding a Virtual Network Link to the existing Private DNS Zone is the correct action because it resolves the problem directly at its cause: the zone is not linked to vnet-spoke-prod. This operation is non-destructive, doesn't require resource restarts, typically propagates in under a minute, and is fully reversible by simply removing the link, so it meets all the stated restrictions.
- Alternative A is technically possible but incorrect in this context: a second zone with the same name duplicates the A records, which must then be kept in sync with the original zone, and any drift between the two produces inconsistent resolution across VNets. Migrating A records is also an unnecessarily complex operation for a problem that a link resolves.
- Alternative C introduces an operational dependency that violates the restriction of not restarting services: configuring custom DNS normally requires VMs to renew their network configuration or restart the DNS client. Alternative D is a more complex architectural solution that requires Azure Firewall as a DNS proxy, which goes well beyond the scope of the necessary correction and doesn't meet the quick reversibility criterion.
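To make the reversibility argument concrete, the sketch below builds the Azure CLI command that adds the link together with the command that removes it. It is an assumption-laden illustration: the resource group `rg-dns` and link name `link-spoke-prod` are hypothetical, the function `vnet_link_commands` is not part of any library, and the flag names reflect the `az network private-dns link vnet` command group as commonly documented, so they should be confirmed with `--help` before use. The code only constructs the commands; it does not run them.

```python
import shlex


def vnet_link_commands(resource_group, zone_name, link_name, virtual_network):
    """Return (apply, rollback) Azure CLI commands as argument lists.

    Hypothetical helper. virtual_network may need to be a full resource ID
    when the VNet lives in a different resource group than the zone.
    """
    apply_cmd = [
        "az", "network", "private-dns", "link", "vnet", "create",
        "--resource-group", resource_group,
        "--zone-name", zone_name,
        "--name", link_name,
        "--virtual-network", virtual_network,
        # Resolution only: no auto-registration of VM records in this zone.
        "--registration-enabled", "false",
    ]
    rollback_cmd = [
        "az", "network", "private-dns", "link", "vnet", "delete",
        "--resource-group", resource_group,
        "--zone-name", zone_name,
        "--name", link_name,
        "--yes",
    ]
    return apply_cmd, rollback_cmd


apply_cmd, rollback_cmd = vnet_link_commands(
    "rg-dns",                              # hypothetical resource group
    "privatelink.blob.core.windows.net",   # zone from Scenario 3
    "link-spoke-prod",                     # hypothetical link name
    "vnet-spoke-prod",                     # VNet from Scenario 3
)
print(shlex.join(apply_cmd))
print(shlex.join(rollback_cmd))
```

Preparing the rollback command alongside the change is one way to satisfy the five-minute reversibility requirement without touching the application.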
Answer Key: Scenario 4
Answer: B
Explanation:
- The correct sequence is 2 → 5 → 1 → 3 → 4 because it follows the logic of progressive diagnosis from symptom to cause, layer by layer.
- Step 2 (nslookup) should be first: it immediately determines whether traffic is being routed to the private IP or public IP. This information divides the problem space into two completely distinct paths. If DNS returns public IP, the problem is in name resolution; if it returns private IP, the problem is in network connectivity.
- Step 5 comes next: if DNS returned private IP, verify if the A record in the Private DNS Zone is correct, confirming that the zone is integrated.
- Step 1 validates if the private endpoint itself is healthy and allocated.
- Step 3 verifies the Storage firewall configuration, which is especially relevant given the recent audit context (public access was enabled and then disabled, and may have been left in an inconsistent state).
- Step 4 (NSG logs) is last because NSG is rarely the cause in well-configured private endpoint scenarios, and its analysis is more time-consuming. Starting with it would be a waste of time given the more likely causes.
- Alternative C starts with step 3 (firewall), which is the most attractive distractor given the audit context, but it checks the firewall before even knowing which IP the traffic is being sent to, which is a diagnostic methodology error.
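The ordering above can be sketched as a list of checks evaluated in sequence, stopping at the first one that fails. This is a minimal Python illustration: the function `first_failing_check` is hypothetical, and the `observations` values are invented stand-ins for what an administrator would observe, since the scenario itself does not state the final cause.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[], bool]]


def first_failing_check(checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the name of the first that fails."""
    for name, passed in checks:
        if not passed():
            return name
    return None


# Hypothetical observations, for illustration only.
observations = {
    "dns_returns_private_ip": True,
    "a_record_matches_endpoint": True,
    "endpoint_succeeded": True,
    "firewall_allows_private_endpoint": False,
    "nsg_allows_443": True,
}

# Order follows the answer key: 2, 5, 1, 3, 4.
checks: List[Check] = [
    ("step 2: nslookup returns the private IP",
     lambda: observations["dns_returns_private_ip"]),
    ("step 5: A record points to the endpoint IP",
     lambda: observations["a_record_matches_endpoint"]),
    ("step 1: private endpoint is Succeeded",
     lambda: observations["endpoint_succeeded"]),
    ("step 3: storage firewall allows the private endpoint",
     lambda: observations["firewall_allows_private_endpoint"]),
    ("step 4: NSG allows outbound 443",
     lambda: observations["nsg_allows_443"]),
]

print(first_failing_check(checks))
```

Because evaluation stops at the first failure, the slower NSG log analysis is only reached when every cheaper check has already passed.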
Troubleshooting Tree: Create private endpoints
Color Legend:
| Color | Node Type |
|---|---|
| Dark blue (navy) | Initial symptom, investigation entry point |
| Blue | Objective diagnostic question, verifiable in practice |
| Red | Identified cause requiring correction |
| Green | Recommended action or confirmed resolution state |
| Orange | Intermediate validation or state verification |
To use this tree when facing a real problem, always start with the root node (connectivity failure) and answer each question based on what you observe in the environment, without assuming the cause. The first bifurcation, which checks if DNS resolves to private IP, is the most important: it immediately separates DNS layer problems from network layer and resource configuration problems. Follow the path corresponding to what you observed until you reach a cause or action node. If the action taken doesn't resolve the problem, return to the previous node and follow the next possible branch, treating the tree as a progressive checklist and not as a mandatory linear sequence.