Troubleshooting Lab: Choose an Azure Load Balancer SKU and tier

Diagnostic Scenarios

Scenario 1: Root Cause

The operations team reports that after a maintenance window last Friday, connections to a set of backend VMs stopped being established. The Load Balancer was recently recreated. The responsible engineer confirms that the load balancing rules, health probe, and backend pool are correctly configured and that the VMs are responding locally.

The connectivity check returns the following output:

PS> Test-NetConnection -ComputerName 20.x.x.x -Port 443
ComputerName : 20.x.x.x
RemoteAddress : 20.x.x.x
RemotePort : 443
InterfaceAlias : Ethernet
SourceAddress : 10.0.1.5
TcpTestSucceeded : False

Additional information collected by the engineer:

  • The recreated Load Balancer is Standard SKU
  • The associated public IP is Standard type
  • The VMs are in the subnet-backend subnet (10.0.1.0/24)
  • The VMs' Availability Set was not changed during maintenance
  • No Network Security Group was associated with the subnet or NICs after the Load Balancer recreation

What is the root cause of the connectivity failure?

A) The Standard public IP is not correctly associated with the load balancing rule.

B) The Standard SKU blocks all inbound traffic by default and no NSG was configured to allow the flow.

C) The Availability Set was deconfigured during recreation, removing the VMs from the backend pool.

D) The HTTP health probe is silently failing, preventing VMs from being marked as healthy.


Scenario 2: Action Decision

The cause has been identified: the existing Load Balancer is Basic SKU and the team needs to migrate to Standard SKU to meet a new contractual SLA requirement. The environment is in production with active traffic. The migration has been approved for immediate execution.

Known constraints:

  • The public IP associated with the current Load Balancer is Basic type
  • The backend pool VMs are in a subnet with no associated NSG
  • There is no scheduled maintenance window; any interruption must be minimized
  • The security team has not yet reviewed the necessary traffic rules

What is the correct action to take at this time?

A) Recreate the Load Balancer as Standard immediately, since SKU migration is the approved priority.

B) Suspend the migration until the NSG is created and reviewed by the security team, and until the public IP is updated to Standard, to avoid total traffic blocking after the change.

C) Migrate only the public IP to Standard first and then recreate the Load Balancer, ignoring the NSG for now.

D) Create a second Standard Load Balancer in parallel and change DNS to the new IP, without terminating the Basic Load Balancer.


Scenario 3: Root Cause

An architect reports that a Standard Load Balancer configured with the Global tier is not distributing traffic among regional backends as expected. Traffic from clients in Europe reaches only the East US region and bypasses the West Europe region, which has healthy backends with lower latency.

Portal logs show:

Backend Pool (Global LB):
- Regional LB East US - Status: Healthy
- Regional LB West Europe - Status: Healthy

Incoming Requests (last 1h):
- Routed to East US: 100%
- Routed to West Europe: 0%

Additional information:

  • Both regional Load Balancers are Standard SKU
  • The West Europe region Load Balancer was added to the backend pool 20 minutes ago
  • The Global Load Balancer public IP is Standard type
  • The monitoring server time zone is configured as UTC+0
  • The Global Load Balancer load balancing rules point to the correct port

What is the most likely root cause for the observed behavior?

A) The Global Load Balancer public IP should be Basic type to support geographic routing.

B) The West Europe regional Load Balancer was recently added to the backend pool and traffic has not yet converged to reflect its availability in global routing.

C) The Global Load Balancer load balancing rules are configured for session affinity, pinning existing clients to the East US endpoint.

D) The West Europe regional Load Balancer is the wrong SKU; only Basic Load Balancers can be used as backends for a Global Load Balancer.


Scenario 4: Diagnostic Sequence

An engineer receives the following report: "The Load Balancer stopped responding after a configuration change. We don't know what was changed."

The following investigation steps are available, but out of order:

  1. Check if there's an NSG associated with the subnet or NIC that is blocking traffic
  2. Query the Load Balancer's Activity Log to identify which operation was executed and by whom
  3. Confirm if the health probe is marking the backend VMs as Healthy
  4. Validate if the load balancing rules are associated with the correct frontend IP
  5. Verify if the Load Balancer SKU is compatible with the associated public IP SKU

What is the correct diagnostic sequence for this scenario?

A) 3 → 1 → 4 → 2 → 5

B) 2 → 5 → 1 → 3 → 4

C) 1 → 3 → 2 → 5 → 4

D) 5 → 4 → 3 → 1 → 2


Answer Key and Explanations

Answer Key: Scenario 1

Answer: B

The Azure Load Balancer Standard SKU is closed to inbound traffic by default. Inbound flows are blocked until a Network Security Group that explicitly allows them is associated with the subnet or NIC of the backend instances. The problem statement confirms that no NSG was configured after the Load Balancer recreation, which is the determining clue.

The information about the Availability Set is irrelevant: it was not changed and does not interfere with the Standard SKU security behavior. The health probe may be failing as a consequence of the NSG blocking, but it's not the root cause; treating the health probe as the cause would confuse a secondary symptom with the problem's origin. The Standard public IP association is correct according to the problem statement, making A incorrect. The most dangerous distractor is D, as an operator could waste time adjusting the health probe without ever resolving the actual blocking issue.
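The default posture described above can be expressed as a small decision function. This is a toy model only, not Azure's actual rule evaluation: the `Backend` and `NsgRule` types and the `inbound_allowed` function are hypothetical names introduced for illustration, and real NSGs evaluate prioritized rules plus built-in defaults that this sketch ignores.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NsgRule:
    # Hypothetical, simplified rule: one destination port, allow or deny.
    port: int
    allow: bool


@dataclass
class Backend:
    lb_sku: str                                 # "Basic" or "Standard"
    nsg_rules: Optional[List[NsgRule]] = None   # None means no NSG is associated


def inbound_allowed(backend: Backend, port: int) -> bool:
    """Toy model of the default inbound posture per Load Balancer SKU."""
    if backend.nsg_rules is None:
        # No NSG: Basic is open by default, Standard is closed by default.
        return backend.lb_sku == "Basic"
    # With an NSG, the first rule matching the port decides; no match means deny.
    for rule in backend.nsg_rules:
        if rule.port == port:
            return rule.allow
    return False


# Scenario 1 as described: Standard SKU, no NSG anywhere.
scenario_1 = Backend(lb_sku="Standard", nsg_rules=None)
# Same backend after an NSG allowing port 443 is associated.
fixed = Backend(lb_sku="Standard", nsg_rules=[NsgRule(port=443, allow=True)])

print(inbound_allowed(scenario_1, 443))  # False
print(inbound_allowed(fixed, 443))       # True
```

In this model the health probe never appears, which mirrors the explanation: the probe can fail as a consequence of the block, but the missing allow rule alone decides the outcome.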

Answer Key: Scenario 2

Answer: B

The critical constraint in this scenario is twofold: the absence of an NSG and the Basic public IP. After a migration to Standard SKU without an NSG, traffic would be blocked immediately by the Standard SKU's default security behavior. Additionally, a Standard Load Balancer cannot be associated with a Basic public IP. Executing the migration without resolving both dependencies would result in complete service interruption in production.

Alternative A ignores both critical constraints. Alternative C solves only one of them (the IP) and leaves the environment exposed to post-migration traffic blocking. Alternative D could be a valid strategy in a planned migration, but it does not address the NSG and IP problems before the switch, so it is not the correct action at this point. The correct sequence is to prepare the NSG and the public IP before making any Load Balancer changes.
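The two prerequisites can be written as a preflight check that must return an empty list before the change proceeds. This is a minimal sketch under the assumptions of this scenario; `migration_blockers` and its parameters are hypothetical names, and a real migration would have more checks than these two.

```python
from typing import List


def migration_blockers(public_ip_sku: str, nsg_reviewed: bool) -> List[str]:
    """Return the unmet prerequisites for moving to a Standard Load Balancer."""
    blockers = []
    if public_ip_sku != "Standard":
        # A Standard Load Balancer cannot use a Basic public IP.
        blockers.append("public IP must be Standard SKU")
    if not nsg_reviewed:
        # Without an NSG allowing the flows, Standard blocks inbound traffic.
        blockers.append("NSG allowing the required traffic must be created and reviewed")
    return blockers


# Scenario 2 as described: Basic public IP, no reviewed NSG.
print(migration_blockers(public_ip_sku="Basic", nsg_reviewed=False))

# After preparation, nothing is left to block the change.
print(migration_blockers(public_ip_sku="Standard", nsg_reviewed=True))
```

Alternative C corresponds to calling the check with a Standard IP but `nsg_reviewed=False`: one blocker remains, so the migration still should not start.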

Answer Key: Scenario 3

Answer: B

The Global tier of Azure Load Balancer uses anycast-based routing and BGP route propagation. When a new regional backend is added to the pool, there is a convergence period during which traffic may not be immediately distributed to the new endpoint. The fact that the West Europe Load Balancer was added only 20 minutes ago is the central clue in the problem statement.

The information about the monitoring server time zone is irrelevant and was deliberately included to test the ability to filter out unrelated details. A Standard public IP is the correct requirement for the Global tier, making A incorrect. Alternative D reverses reality: only Standard (not Basic) Load Balancers can be backends for a Global Load Balancer. Alternative C would describe a real problem in other contexts, but no session affinity configuration was mentioned in the problem statement.

Answer Key: Scenario 4

Answer: B

The correct sequence is: 2 → 5 → 1 → 3 → 4.

The correct starting point is the Activity Log (step 2), since the problem statement explicitly mentions an unknown configuration change. Identifying what was changed directs all subsequent diagnosis. Next, checking SKU compatibility between the Load Balancer and public IP (step 5) addresses a structural cause that can prevent the Load Balancer from working at all. Then, investigating NSG blocking (step 1) and health probe status (step 3) rules out network-layer and backend-availability causes. Finally, validating the load balancing rules and frontend IP (step 4) is the last refinement.

The other sequences make the mistake of starting with investigation of secondary symptoms without first understanding the change context, leading to circular diagnosis and unnecessary time consumption in production environments.
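The ordering logic above amounts to sorting the five steps by how fundamental each check is. In the sketch below the step numbers and their meaning come from the scenario, while the numeric ranks and the `diagnostic_order` helper are illustrative assumptions that encode the reasoning (change context first, then structural, network, backend, and rule-level checks).

```python
# Step number -> (description, rank). Lower rank means "check earlier".
# The ranks are an assumption that mirrors the explanation, not an Azure concept.
STEPS = {
    1: ("Check for an NSG on the subnet or NIC blocking traffic", 2),  # network layer
    2: ("Query the Activity Log for the change", 0),                   # change context
    3: ("Confirm the health probe marks backends Healthy", 3),         # backend availability
    4: ("Validate rules against the correct frontend IP", 4),          # final refinement
    5: ("Verify Load Balancer and public IP SKU compatibility", 1),    # structural cause
}


def diagnostic_order(steps):
    """Return step numbers sorted by rank, most fundamental check first."""
    ranked = sorted(steps.items(), key=lambda item: item[1][1])
    return [number for number, _ in ranked]


print(diagnostic_order(STEPS))  # [2, 5, 1, 3, 4]
```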


Troubleshooting Tree: Choose an Azure Load Balancer SKU and tier

Color Legend:

Color | Node Type
Dark blue (navy) | Initial symptom: investigation entry point
Blue | Diagnostic question: objective and observable verification
Red | Identified cause: confirmed problem origin
Green | Recommended action or resolution
Orange | Intermediate validation: state awaiting confirmation

To use this tree when facing a real problem, start with the root node describing the observed symptom and follow the branches by answering each diagnostic question based on what you can verify directly in the portal or via command. Each answer eliminates an entire branch of hypotheses. The goal is to reach a red node (cause) or green node (action) through the shortest possible path, without skipping steps that could reveal prior structural causes, such as SKU incompatibility or NSG absence.
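The traversal described above can be sketched as a walk over yes/no question nodes. The tree below is a hypothetical two-question fragment built only from the structural checks named in the text (SKU compatibility, NSG presence); it is not the full tree from the diagram, and `Node` and `walk` are names introduced for illustration. The walk starts at the first question rather than at a symptom node to keep the sketch short.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    kind: str                      # "question", "cause", "action" or "validation"
    text: str
    yes: Optional["Node"] = None   # branch taken when the answer is True
    no: Optional["Node"] = None    # branch taken when the answer is False


def walk(root: Node, answers: Dict[str, bool]) -> List[Node]:
    """Follow the answers from the root until a non-question node is reached."""
    path = [root]
    node = root
    while node.kind == "question":
        node = node.yes if answers[node.text] else node.no
        path.append(node)
    return path


# Hypothetical fragment: structural checks come before the probe check.
fix_sku = Node("action", "Align the public IP SKU with the Load Balancer SKU")
add_nsg = Node("action", "Associate an NSG that allows the required traffic")
check_probe = Node("validation", "Confirm health probe status")
q_nsg = Node("question", "Is an NSG associated with the subnet or NIC?",
             yes=check_probe, no=add_nsg)
q_sku = Node("question", "Do the Load Balancer and public IP SKUs match?",
             yes=q_nsg, no=fix_sku)

# Answers matching Scenario 1: SKUs match, but no NSG is associated.
answers = {q_sku.text: True, q_nsg.text: False}
for step in walk(q_sku, answers):
    print(f"[{step.kind}] {step.text}")
```

Each answer discards the subtree on the other branch, which is the "eliminates an entire branch of hypotheses" behavior the paragraph describes.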