
Troubleshooting Lab: Select a Virtual WAN SKU

Diagnostic Scenarios

Scenario 1: Root Cause

A network team reports that two branches connected via site-to-site VPN to the same virtual hub can communicate with resources in spoke VNets, but cannot exchange traffic directly between themselves. The environment was provisioned three weeks ago and worked as expected during initial testing with a single branch.

Information collected by the team:

Virtual WAN SKU: Basic
Virtual hub: West Europe
Active connections:
- Branch-SP --> Hub (site-to-site, status: Connected)
- Branch-RJ --> Hub (site-to-site, status: Connected)
- VNet-Prod --> Hub (peering, status: Connected)

Latency Branch-SP --> VNet-Prod: 18ms (OK)
Latency Branch-RJ --> VNet-Prod: 22ms (OK)
Latency Branch-SP --> Branch-RJ: timeout

The team also observes that the bandwidth configured per tunnel is below the maximum supported limit and that the branch routes appear correctly in the hub's route table.

What is the root cause of the communication failure between branches?

A) Branch routes are not being redistributed correctly; BGP needs to be configured between VPN gateways.

B) The Basic SKU does not support transitive routing between branches; branch-to-branch traffic is not forwarded by the hub in this SKU.

C) The bandwidth per tunnel is undersized, causing packet drops only on the branch-to-branch path as it is the longest path.

D) Site-to-site connections are on overlapping subnets, which prevents routing between branches even with an active hub.


Scenario 2: Action Decision

The cause has been identified: the production Virtual WAN runs on the Basic SKU, and the organization now requires User VPN (Point-to-Site) connectivity for remote workers while keeping the existing site-to-site connections up without interruption.

The environment has the following constraints:

Environment:
Current Virtual WAN SKU: Basic
Active site-to-site connections: 7 branches
Required availability SLA: 99.9%
Available maintenance window: next 14 days
Available permissions: Network Contributor (no Owner)

The responsible architect presents four action options. Which one is correct?

A) Create a new P2S gateway in the current hub with Basic SKU, as the portal allows resource creation even without native support for this connection type.

B) Upgrade the Virtual WAN to Standard SKU within the maintenance window, as the upgrade is supported without needing to recreate the resource and without mandatory impact on existing connections.

C) Create a second Virtual WAN with Standard SKU in parallel, manually migrate connections and decommission the Basic WAN after complete migration.

D) Request permission elevation to Owner before any action, as SKU upgrade requires Owner permission on the subscription.


Scenario 3: Root Cause

An architect reports that after creating a new virtual hub in an additional region within an existing Standard Virtual WAN, traffic between VNets connected to the new hub and VNets connected to the original hub is not flowing. Routing worked normally before adding the new hub.

Logs collected via Azure Monitor:

Hub-EastUS (original):
VNet-A peering: Connected
VNet-B peering: Connected
Routing status: Active

Hub-WestEurope (new):
VNet-C peering: Connected
Routing status: Active

Connectivity test:
VNet-A --> VNet-C: Failed (no route)
VNet-B --> VNet-C: Failed (no route)
VNet-A --> VNet-B: OK
VNet-C --> local resources: OK

Additional information: the team confirms both hubs are in the same Virtual WAN instance, Standard SKU is active, and no NSG blocks inter-hub traffic. The West Europe hub was created two days ago and the team has not yet performed any hub-to-hub connectivity configuration.

What is the root cause of the failure?

A) NSGs in spoke VNets are blocking inter-hub traffic; the team's verbal confirmation does not replace validation via Network Watcher.

B) Standard SKU allows multiple hubs, but transitive routing between distinct hubs is not automatic; explicit connectivity between hubs within the Virtual WAN must be established.

C) The West Europe hub is in incomplete provisioning state; "Active" status in the portal may be displayed before complete deployment.

D) Spoke VNets connected to different hubs cannot exchange traffic in a Standard Virtual WAN; each hub operates as an isolated routing domain by design.


Scenario 4: Collateral Impact

An organization operates a Virtual WAN with Basic SKU and decides to upgrade to Standard SKU to enable ExpressRoute connectivity. The upgrade is executed successfully during the maintenance window and the ExpressRoute connection is provisioned correctly shortly after.

Three days after the upgrade, the operations team opens a ticket: for budgetary reasons, the board has requested a rollback to the Basic SKU to reduce costs.

The responsible engineer confirms the upgrade was completed and the new ExpressRoute connections are in production.

What is the most relevant direct consequence of this rollback request?

A) The downgrade can be executed, but requires prior removal of ExpressRoute connections and manual reconfiguration of all existing site-to-site connections.

B) Standard to Basic downgrade is not supported by the platform; to return to Basic SKU would require recreating the Virtual WAN from scratch, with impact on all active connections.

C) The downgrade is supported, but results in temporary suspension of site-to-site connections while the platform performs internal hub resizing.

D) The downgrade can be requested via Azure Support, which evaluates case-by-case if the environment meets prerequisites for SKU reversal.


Answer Key and Explanations

Answer Key: Scenario 1

Answer: B

The definitive clue in the scenario is the Basic SKU combined with a failure that occurs exclusively on the branch-to-branch path, while communication with VNets works normally. This pattern is the exact signature of the Basic SKU limitation: the hub forwards traffic from a branch to a VNet, but does not route traffic transitively between two branches connected to the same hub.
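The signature described above can be written as a small check. This is a minimal sketch, assuming the connectivity tests have already been collected by hand; the `PathResult` type and the `matches_basic_branch_signature` function are hypothetical names introduced for illustration and do not call any Azure API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PathResult:
    source_kind: str  # "branch" or "vnet"
    dest_kind: str    # "branch" or "vnet"
    reachable: bool


def matches_basic_branch_signature(sku: str, results: List[PathResult]) -> bool:
    """True when, on Basic SKU, every branch-to-VNet path works
    and every branch-to-branch path fails."""
    if sku != "Basic":
        return False
    branch_to_branch = [
        r for r in results
        if r.source_kind == "branch" and r.dest_kind == "branch"
    ]
    branch_to_vnet = [
        r for r in results
        if {r.source_kind, r.dest_kind} == {"branch", "vnet"}
    ]
    # Both kinds of path must have been tested for the pattern to be meaningful.
    if not branch_to_branch or not branch_to_vnet:
        return False
    return (all(not r.reachable for r in branch_to_branch)
            and all(r.reachable for r in branch_to_vnet))


# Test results transcribed from Scenario 1.
scenario_1 = [
    PathResult("branch", "vnet", True),     # Branch-SP --> VNet-Prod: 18ms
    PathResult("branch", "vnet", True),     # Branch-RJ --> VNet-Prod: 22ms
    PathResult("branch", "branch", False),  # Branch-SP --> Branch-RJ: timeout
]
print(matches_basic_branch_signature("Basic", scenario_1))  # True
```

The check deliberately ignores latency and bandwidth, since the scenario treats both as irrelevant to the diagnosis.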

The note that per-tunnel bandwidth is below the maximum limit is irrelevant data inserted intentionally. It is unrelated to the observed problem: the timeout occurs consistently, not intermittently, which rules out drops due to congestion.

Alternative A is the most dangerous distractor: the branch routes already appear correctly in the hub's route table, which eliminates route redistribution (and therefore BGP) as a cause. Alternative D is ruled out because overlapping subnets would also prevent communication with VNets, which is not occurring here.

Acting based on alternative A would lead the team to configure BGP unnecessarily, without solving the problem and consuming maintenance time.


Answer Key: Scenario 2

Answer: B

The upgrade from Basic to Standard is supported by the platform without needing to recreate the Virtual WAN and without mandatory impact on existing site-to-site connections. Network Contributor permission is sufficient to execute the upgrade; Owner is not a requirement.

Alternative A is technically impossible: Basic SKU does not support User VPN, and the portal blocks creation of the resource rather than merely hiding the option.

Alternative C is valid in scenarios where an in-place upgrade is not supported, but here it is supported; creating a parallel WAN would introduce unnecessary complexity and risk, and would put the SLA at risk during migration of the 7 active connections.

Alternative D is the distractor that exploits uncertainty about permissions. Network Contributor covers network operations, including Virtual WAN SKU upgrades. Waiting for permission elevation would delay the solution without technical justification.
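The decision in this scenario comes down to comparing required capabilities with what the current SKU offers. The following is a minimal sketch, assuming a capability table built only from the statements in this lab (Basic: site-to-site only; Standard: also User VPN, ExpressRoute, and branch-to-branch routing). The table and the function names are hypothetical and are not an authoritative or complete Azure feature matrix.

```python
# Capability sets as stated in this lab's scenarios; an assumption for
# illustration, not an authoritative or complete Azure feature matrix.
SKU_CAPABILITIES = {
    "Basic": {"site-to-site"},
    "Standard": {"site-to-site", "user-vpn", "expressroute", "branch-to-branch"},
}


def missing_capabilities(current_sku, required):
    """Return the required capabilities the current SKU lacks, sorted by name."""
    return sorted(set(required) - SKU_CAPABILITIES[current_sku])


def upgrade_needed(current_sku, required):
    """True when at least one required capability is missing on the current SKU."""
    return bool(missing_capabilities(current_sku, required))


# Scenario 2: keep site-to-site, add User VPN (Point-to-Site).
scenario_2 = {"site-to-site", "user-vpn"}
print(missing_capabilities("Basic", scenario_2))  # ['user-vpn']
print(upgrade_needed("Basic", scenario_2))        # True
```

Keeping the table as plain data makes it easy to correct against current Azure documentation before relying on it.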


Answer Key: Scenario 3

Answer: B

In Standard Virtual WAN, the existence of multiple hubs in the same instance does not imply automatic connectivity between them. Inter-hub connection must be explicitly established in the portal or via API, configuring peering between hubs within the Virtual WAN.

The clue in the scenario is the statement that the team has not yet performed any hub-to-hub connectivity configuration. This data, combined with the failure pattern (intra-hub traffic works, inter-hub traffic fails), points directly to missing configuration, not resource failure.

Alternative A is the distractor that exploits the tendency to distrust verbal statements and seek additional validation. However, the failure pattern (all inter-hub paths fail, no intra-hub paths fail) makes NSG an unlikely hypothesis as root cause.
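The failure pattern used in this reasoning can be made explicit by grouping each test by whether its endpoints share a hub. This is a minimal sketch using the VNet-to-hub mapping and the VNet-to-VNet test results from Scenario 3; the `summarize_by_scope` function is a hypothetical helper, not part of any Azure tooling.

```python
# VNet-to-hub attachment, transcribed from the Scenario 3 logs.
HUB_OF = {
    "VNet-A": "Hub-EastUS",
    "VNet-B": "Hub-EastUS",
    "VNet-C": "Hub-WestEurope",
}

# (source, destination, succeeded), transcribed from the connectivity test.
TESTS = [
    ("VNet-A", "VNet-C", False),
    ("VNet-B", "VNet-C", False),
    ("VNet-A", "VNet-B", True),
]


def summarize_by_scope(tests, hub_of):
    """Count successes and failures separately for intra-hub and inter-hub paths."""
    summary = {
        "intra-hub": {"ok": 0, "failed": 0},
        "inter-hub": {"ok": 0, "failed": 0},
    }
    for src, dst, ok in tests:
        scope = "intra-hub" if hub_of[src] == hub_of[dst] else "inter-hub"
        summary[scope]["ok" if ok else "failed"] += 1
    return summary


print(summarize_by_scope(TESTS, HUB_OF))
# {'intra-hub': {'ok': 1, 'failed': 0}, 'inter-hub': {'ok': 0, 'failed': 2}}
```

A clean split like this one (every inter-hub path fails, no intra-hub path fails) is what makes a per-VNet cause such as an NSG rule less likely than a hub-level one.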

Alternative D represents a serious conceptual misconception: hubs in the same Standard Virtual WAN can exchange traffic, but require explicit configuration. Believing they are isolated by design would lead to the incorrect conclusion that the architecture needs redesign.


Answer Key: Scenario 4

Answer: B

Downgrade from Standard to Basic is not supported by the Azure platform. This is a design restriction, not a limitation that can be configured away or reversed through support. The only way to return to the Basic SKU would be to recreate the Virtual WAN entirely, which would mean rebuilding the existing site-to-site connections and giving up the newly provisioned ExpressRoute connection, which the Basic SKU does not support.
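The asymmetry between upgrade and downgrade can be captured as a one-way transition table. This is a minimal sketch, assuming only the two SKUs and the rules stated in this lab; `plan_sku_change` is a hypothetical helper for pre-change planning and does not perform any operation against Azure.

```python
# In-place SKU changes this lab describes as supported: Basic to Standard only.
SUPPORTED_IN_PLACE_CHANGES = {("Basic", "Standard")}


def plan_sku_change(current, target):
    """Describe what moving from `current` to `target` would involve."""
    if current == target:
        return "no change needed"
    if (current, target) in SUPPORTED_IN_PLACE_CHANGES:
        return "in-place upgrade"
    return "recreate the Virtual WAN and reconfigure all connections"


print(plan_sku_change("Basic", "Standard"))
print(plan_sku_change("Standard", "Basic"))
```

Running a check like this before approving an upgrade surfaces the one-way nature of the change while the decision is still reversible.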

Alternative A is the most dangerous distractor because it describes a process that seems reasonable ("remove connections and downgrade"), but the premise is false: downgrade does not exist as a supported operation, with or without prior connection removal.

Alternative C uses language of temporary degradation to make downgrade plausible. This is a common distractor pattern: suggesting the action exists but has an acceptable cost.

Alternative D appeals to the perception that Microsoft support can enable undocumented operations. In practice, SKU restrictions like this are platform-level and are not changed by support requests.

The real consequence of not understanding this limitation before the upgrade is exactly the scenario described: a later budgetary decision that cannot be honored without severe operational impact.


Troubleshooting Tree: Select a Virtual WAN SKU


Color legend:

Color: Node type
Dark blue: Initial symptom (entry point)
Blue: Diagnostic question (decision)
Red: Identified cause
Green: Recommended action or resolution
Orange: Validation or intermediate verification

To use this tree when facing a real problem, identify the observed symptom and start from the root node. At each decision node, answer the question based on what is directly verifiable in the portal or via commands: current SKU, type of failing resource, presence of multiple hubs, and existence of inter-hub configuration. Follow the branch corresponding to your answer until reaching an identified cause or recommended action node. Never skip steps: a branch-to-branch symptom can look the same as an inter-hub routing failure, and the distinction between them determines whether the solution is an SKU upgrade or peering configuration.
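The walk described above can be sketched as an ordered series of checks over the four verifiable facts. This is only an illustration under stated assumptions: it is not a reproduction of the tree itself, the `Facts` type and `diagnose` function are hypothetical, and the two branches shown simply follow this lab's answer key for Scenarios 1 and 3.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Facts:
    sku: str                    # "Basic" or "Standard"
    failing_path: str           # e.g. "branch-to-branch" or "inter-hub"
    multiple_hubs: bool         # more than one hub in the Virtual WAN
    inter_hub_configured: bool  # hub-to-hub connectivity already set up


def diagnose(facts: Facts) -> Tuple[str, str]:
    """Return (identified cause, recommended action), checking the SKU first."""
    # Decision 1: a branch-to-branch failure on Basic is treated as a SKU limit.
    if facts.sku == "Basic" and facts.failing_path == "branch-to-branch":
        return ("Basic SKU limitation", "upgrade to Standard SKU")
    # Decision 2: an inter-hub failure on Standard with no hub-to-hub setup.
    if (facts.sku == "Standard" and facts.multiple_hubs
            and facts.failing_path == "inter-hub"
            and not facts.inter_hub_configured):
        return ("missing inter-hub configuration",
                "configure connectivity between hubs")
    # Anything else is outside this sketch.
    return ("not covered by this sketch", "collect more data before acting")


print(diagnose(Facts("Basic", "branch-to-branch", False, False)))
print(diagnose(Facts("Standard", "inter-hub", True, False)))
```

Checking the SKU before anything else mirrors the advice not to skip steps: two symptoms that look alike are separated by the first fact that differs.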