Troubleshooting Lab: Configure access to private endpoints

Diagnostic Scenarios

Scenario 1 — Root Cause

A VM hosted in VNet-Prod (East US region) is trying to access an Azure SQL Database via hostname. The Private Endpoint was created in the snet-endpoints subnet of the same VNet three days ago and shows Succeeded status in the portal. Public network access to the SQL Database has been disabled. The VM uses Azure's default DNS server (168.63.129.16).

The responsible engineer runs the following diagnostic from the VM:

nslookup meu-servidor.database.windows.net

Server:  168.63.129.16
Address: 168.63.129.16

Non-authoritative answer:
Name:    meu-servidor.privatelink.database.windows.net
Address: 20.42.65.90

Then tests connectivity:

Test-NetConnection -ComputerName 20.42.65.90 -Port 1433

TcpTestSucceeded : False

The engineer checks the portal and confirms that the private IP allocated to the Private Endpoint is 10.1.2.5. The VNet has no User Defined Routes configured. The snet-endpoints subnet keeps the Private Endpoint network policy at its default value.

What is the root cause of the connectivity failure?

A. Azure DNS cannot resolve Private Endpoint names when public access to the resource is disabled, so it returns an incorrect public IP.

B. The Private DNS Zone is not linked to the VNet, causing DNS to return the SQL Database's public IP instead of the endpoint's private IP.

C. A Network Security Group applied to the snet-endpoints subnet is blocking inbound traffic on port 1433.

D. The network policy for Private Endpoints on the subnet is preventing traffic from reaching the endpoint's private IP.

Scenario 2 — Action Decision

The cause of the problem has been identified: the Private DNS Zone privatelink.database.windows.net exists in the subscription and contains the correct A record pointing to 10.1.2.5, but it has no Virtual Network Link configured.

This is a critical production environment. The scheduled maintenance window is at 11 PM, and it is now 2 PM. The security team has stated that any change to private DNS zones requires prior approval from a Change Advisory Board (CAB), but an emergency approval process with a 2-hour SLA is available.

Access to the SQL Database via public endpoint is completely disabled and cannot be re-enabled without higher-level approval. Production applications are failing.

What is the correct action to take at this moment?

A. Recreate the Private Endpoint with DNS integration option enabled, which will automatically create and link the zone, eliminating the need for CAB approval.

B. Trigger the CAB emergency approval process to add the Virtual Network Link to the existing Private DNS Zone, as this is the surgical fix with lowest risk.

C. Wait for the 11 PM maintenance window and perform the DNS zone linking within the standard change process.

D. Configure a custom DNS server in the VNet pointing directly to IP 10.1.2.5 as a static entry, bypassing the Private DNS Zone dependency.

Scenario 3 — Root Cause

A data team configured a Private Endpoint for a storage account (conta-datalake) pointing to the dfs sub-resource (Azure Data Lake Storage Gen2). The Private DNS Zone privatelink.dfs.core.windows.net was created and correctly linked to the VNet. Applications using the DFS endpoint work without issues.

Weeks later, a backup application tries to access the same data via Blob protocol:

nslookup conta-datalake.blob.core.windows.net

Server:  168.63.129.16
Address: 168.63.129.16

Non-authoritative answer:
Name:    conta-datalake.privatelink.blob.core.windows.net
Address: 52.239.184.16

The backup application fails when trying to write files. The team verifies that public access to the storage account is disabled. The subnet where the application is hosted has internet outbound access via NAT Gateway, which was recently added to allow package updates on VMs.

What is the root cause of the backup application failure?

A. The NAT Gateway added to the subnet is intercepting traffic destined for the Private Endpoint and routing it to the internet, preventing private access.

B. There is no Private Endpoint created for the blob sub-resource of the same storage account, so DNS resolves to the public IP, which is inaccessible.

C. The Private DNS Zone privatelink.dfs.core.windows.net is conflicting with the blob subdomain resolution, causing DNS response failure.

D. Disabled public access prevents any application from accessing the blob sub-resource even if a Private Endpoint is created later.

Scenario 4 — Diagnostic Sequence

A VM in a VNet connected via VNet Peering to another VNet (where the Private Endpoint is provisioned) cannot reach the service exposed by the endpoint. The peering is in Connected state on both sides. The VM can ping other VMs in the remote VNet without issues.

The available investigation steps are:

1. Verify if the Private DNS Zone is linked to the VM's VNet (not just the endpoint's VNet)
2. Confirm that peering is active and in Connected state
3. Run nslookup from the VM and verify if the name resolves to private or public IP
4. Check if there's an NSG on the VM's subnet blocking traffic on the service port
5. Confirm the private IP allocated to the Private Endpoint in the Azure portal

What is the correct investigation sequence for this scenario?

A. 2 → 1 → 3 → 5 → 4

B. 3 → 1 → 5 → 4 → 2

C. 2 → 3 → 1 → 5 → 4

D. 1 → 3 → 5 → 2 → 4

Answer Key and Explanations

Answer Key — Scenario 1

Answer: B

The decisive clue is the nslookup output. The query returned 20.42.65.90, a public IP, while the portal shows that the Private Endpoint holds 10.1.2.5. The name is redirected to meu-servidor.privatelink.database.windows.net, but no Private DNS Zone containing the private A record is linked to the VM's VNet, so Azure DNS (168.63.129.16) resolves the name publicly and returns the SQL Database's public IP.

The failed Test-NetConnection is a consequence, not a separate fault: the engineer tested the public IP, and a TCP failure there is expected because public network access is disabled. The private IP 10.1.2.5 was never tested, so nothing in the scenario points to a network-layer problem.

Alternative A is technically incorrect: Azure DNS resolves Private Endpoint names regardless of the resource's public access setting. Alternative C has no supporting evidence: the scenario mentions no NSG, and with the Private Endpoint network policy at its default value NSG rules are not applied to Private Endpoint traffic. Alternative D misreads the network policy: it controls whether NSGs and UDRs apply to Private Endpoints and does not block traffic by itself.

The most dangerous error would be reading the TCP failure as a network problem and hunting for NSGs or routes without first noticing that DNS returned a public IP, which would delay the real diagnosis.
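The comparison at the heart of this scenario, checking the address returned by nslookup against the private IP shown in the portal, can be sketched in Python with the standard ipaddress module. This is a minimal illustration, not an Azure tool: the function name classify_resolution and its return labels are hypothetical, and the two IPs are the ones given in the scenario.

```python
import ipaddress


def classify_resolution(resolved_ip: str, endpoint_ip: str) -> str:
    """Classify a DNS answer relative to the Private Endpoint's known private IP.

    The labels returned here are illustrative, not Azure terminology.
    """
    resolved = ipaddress.ip_address(resolved_ip)
    expected = ipaddress.ip_address(endpoint_ip)
    if resolved == expected:
        # DNS is fine; any remaining failure is at the network layer.
        return "private-endpoint"
    if resolved.is_private:
        # A private address, but not the one the portal shows for the endpoint.
        return "unexpected-private"
    # A public address: the name is not being resolved through a linked private zone.
    return "public"


# Values from Scenario 1: nslookup answer vs. the IP shown in the portal.
print(classify_resolution("20.42.65.90", "10.1.2.5"))
print(classify_resolution("10.1.2.5", "10.1.2.5"))
```

A "public" result means DNS should be investigated before any TCP test, since a connection attempt against the public IP says nothing about the private path.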

Answer Key — Scenario 2

Answer: B

The cause is identified and precise: the existing Private DNS Zone has no Virtual Network Link. The surgical fix is to add this link, an atomic and very low-risk operation that doesn't change DNS records, doesn't recreate resources, and doesn't affect other zones or VNets.

The emergency approval process with its 2-hour SLA exists for exactly this kind of situation: production is impacted, the cause is identified, and the fix is low-risk. Triggering this process is the only response that respects all scenario constraints.

Alternative A seems reasonable but violates a critical constraint: recreating the Private Endpoint means deleting and recreating a production resource that is in use, a higher-impact and higher-risk operation than simply adding a DNS link. Endpoint recreation would also need CAB approval.

Alternative C ignores production impact: applications are failing now, and waiting 9 hours without triggering the emergency process is operational negligence.

Alternative D is technically dangerous: configuring a static DNS entry for the endpoint IP bypasses the entire resolution infrastructure, makes the environment fragile to future endpoint IP changes, and doesn't follow any Microsoft-recommended practices.

Answer Key — Scenario 3

Answer: B

The root cause is straightforward: a Private Endpoint created for the dfs sub-resource doesn't cover the blob sub-resource. Each storage account sub-resource has its own FQDN, its own Private Endpoint, and its own Private DNS Zone. The nslookup returns the public IP for blob.core.windows.net because there's no private A record for this subdomain in any zone linked to the VNet.
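The one-endpoint-per-sub-resource rule can be sketched as a small Python lookup. The zone names below are the public Azure cloud names for storage sub-resources; the helper names subresource_of and missing_endpoint are hypothetical, and the set of existing endpoints is supplied by hand rather than read from Azure.

```python
# Storage sub-resource -> Private DNS Zone name (public Azure cloud).
STORAGE_PRIVATELINK_ZONES = {
    "blob": "privatelink.blob.core.windows.net",
    "dfs": "privatelink.dfs.core.windows.net",
    "file": "privatelink.file.core.windows.net",
    "queue": "privatelink.queue.core.windows.net",
    "table": "privatelink.table.core.windows.net",
    "web": "privatelink.web.core.windows.net",
}


def subresource_of(fqdn: str) -> str:
    """Extract the sub-resource label from a storage FQDN.

    Works for both 'acct.blob.core.windows.net' and
    'acct.privatelink.blob.core.windows.net'.
    """
    labels = fqdn.lower().rstrip(".").split(".")
    if len(labels) < 5 or labels[-3:] != ["core", "windows", "net"]:
        raise ValueError(f"not a storage FQDN: {fqdn}")
    return labels[-4]


def missing_endpoint(fqdn: str, endpoints_created: set):
    """Return (sub-resource, zone) if no Private Endpoint covers this FQDN, else None."""
    sub = subresource_of(fqdn)
    if sub in endpoints_created:
        return None
    return sub, STORAGE_PRIVATELINK_ZONES.get(sub)


# Scenario 3: only the dfs endpoint exists, but the backup tool uses blob.
print(missing_endpoint("conta-datalake.blob.core.windows.net", {"dfs"}))
print(missing_endpoint("conta-datalake.dfs.core.windows.net", {"dfs"}))
```

For the blob FQDN the sketch reports that both a blob Private Endpoint and the privatelink.blob.core.windows.net zone are missing, which matches the diagnosis above.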

The NAT Gateway detail is a deliberate distractor. A NAT Gateway handles outbound traffic to the internet, whereas Private Endpoints are reached over internal VNet routes and their traffic never passes through the NAT Gateway. The detail tests whether the reader will tie the failure to a recent infrastructure change, which is a classic diagnostic error: temporal correlation without real causality.

Alternative C is technically impossible: private DNS zones for different subdomains (dfs vs blob) are independent and don't interfere with each other. Alternative D is incorrect: disabling public access doesn't prevent creating new Private Endpoints nor block already existing endpoints.

Answer Key — Scenario 4

Answer: C

The correct sequence is 2 → 3 → 1 → 5 → 4, which follows the logic of progressive hypothesis elimination from simplest to most specific.

The first step is confirming that peering is active (step 2), because without layer 3 connectivity no other diagnosis makes sense. The scenario already states that peering is Connected, but in a real diagnosis this validation is always the starting point before investigating upper layers.

With base connectivity confirmed, step 3 (run nslookup) immediately reveals whether the problem is DNS resolution or TCP connectivity. This distinction defines the next path: if DNS resolves to public IP, the problem is zone linking (step 1). If it resolves to private IP but TCP fails, the problem is at the network layer (step 4).

Step 5 (confirm private IP in portal) serves as reference to validate if the IP returned by DNS is correct before investigating NSGs.

Alternative B starts with nslookup without first confirming basic connectivity, which can generate false diagnoses. Alternative A confirms peering but jumps directly to the DNS zone before even observing DNS behavior with nslookup. Alternative D starts with the DNS zone without any evidence that the problem is there.
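The branching described in this answer can be sketched as a Python function that returns the next step given what is known so far. It is an illustration of the 2 → 3 → 1 → 5 → 4 logic only: the function name next_step and its return strings are hypothetical, the IPs in the usage lines are borrowed from Scenario 1, and the branch for a private answer that differs from the portal IP is an assumption not spelled out in the scenario.

```python
import ipaddress
from typing import Optional


def next_step(peering_connected: bool,
              resolved_ip: Optional[str] = None,
              portal_endpoint_ip: Optional[str] = None) -> str:
    """Return the next investigation step for the peered-VNet scenario."""
    # Step 2: without a Connected peering, nothing above layer 3 is worth testing.
    if not peering_connected:
        return "step 2: fix peering"
    # Step 3: observe what DNS actually returns from the VM.
    if resolved_ip is None:
        return "step 3: run nslookup from the VM"
    # A public answer points to DNS, so check the zone link on the VM's VNet.
    if not ipaddress.ip_address(resolved_ip).is_private:
        return "step 1: check the Private DNS Zone link on the VM's VNet"
    # Step 5: fetch the reference IP before trusting the DNS answer.
    if portal_endpoint_ip is None:
        return "step 5: confirm the endpoint's private IP in the portal"
    # Assumption: a private answer that differs from the portal IP is a wrong A record.
    if ipaddress.ip_address(resolved_ip) != ipaddress.ip_address(portal_endpoint_ip):
        return "step 5: DNS answer does not match the endpoint IP"
    # DNS is correct, so the remaining suspect is the network layer.
    return "step 4: check NSGs on the VM's subnet"


print(next_step(True))
print(next_step(True, "20.42.65.90"))
print(next_step(True, "10.1.2.5", "10.1.2.5"))
```

Each call supplies only the evidence gathered so far, so the function never recommends checking NSGs before the DNS answer has been observed and validated.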

Troubleshooting Tree: Configure access to private endpoints

Color Legend:

Color	Node Type
Dark blue	Initial symptom (entry point)
Blue	Diagnostic question (yes/no decision)
Red	Identified cause
Green	Recommended action or resolution
Orange	Intermediate verification or validation

To use this tree when facing a real problem, start at the root node describing the observed symptom and answer each diagnostic question based on what you can verify directly in the environment. The first branching point, whether the name resolves to a private or public IP, is the most important: it immediately separates DNS problems from network connectivity problems, which require completely different investigation paths. Follow the branches until you reach an identified cause node (red) and execute the corresponding action (green). Intermediate validation nodes (orange) indicate where you need to collect evidence before advancing to the next question.
