
Troubleshooting Lab: Deploy resources by using an Azure Resource Manager template or a Bicep file

Diagnostic Scenarios​

Scenario 1: Root Cause

An infrastructure team uses Azure CLI to deploy a template (main.bicep) that creates a storage account in a development subscription. The same template was deployed successfully the previous week in the same subscription. This time, the command returns an error immediately, and no resource group is created.

The environment uses a corporate naming policy applied via Azure Policy with deny effect. The storage account follows the pattern stgdeveastus001. The responsible engineer has the Contributor role in the subscription. The target region is eastus. The installed Azure CLI version is 2.57.0.

Executed command:

az deployment sub create \
--location eastus \
--template-file main.bicep \
--parameters @params.dev.json

Output:

ERROR: {"code":"InvalidDeploymentParameterValue","message":"The value '...' provided for the deployment parameter 'location' is not valid."}

The params.dev.json file contains:

{
  "location": {
    "value": "East US"
  }
}

What is the root cause of the error?

a) The Contributor role does not have permission to execute deployments at subscription scope (sub).

b) The location parameter value contains a space and uppercase letters instead of the canonical identifier expected by the template.

c) The Azure Policy with deny effect is blocking the storage account deployment due to naming violation.

d) The Azure CLI version is outdated and does not support Bicep deployments at subscription scope.


Scenario 2: Root Cause

An engineer attempts to deploy a complete solution using an ARM template with nested templates. The deployment starts successfully, all resources appear as Running in the portal for a few minutes, and then the deployment fails with Conflict status.

The template creates, in the following declared order: a VNet, a subnet, an NSG, and a NIC associated with the subnet. The subscription has no locks. The engineer has the Owner role. The deployment logs in the portal show:

Resource Microsoft.Network/networkInterfaces/nic-prod-001
OperationId: a3f2...
StatusCode: Conflict
Message: Another operation on this or dependent resource is in progress.

In the last 30 days, there have been no changes to the subscription policies. The diagnostic storage account referenced in the template exists and is accessible. The template does not use the dependsOn property on any resource.

What is the root cause of the failure?

a) The deployment failed because the diagnostic storage account was being accessed simultaneously by another process.

b) The Owner role is not sufficient to create network resources in the resource group scope when the deployment is initiated at subscription scope.

c) The NIC is trying to associate with the subnet before the subnet and the VNet have completed provisioning, since no dependsOn is defined between them.

d) The Conflict status indicates that a NIC with the same name already exists in the subscription, causing resource collision.


Scenario 3: Action Decision

During a scheduled 30-minute production maintenance window, an engineer identifies that a Bicep template deployment failed on resource number 7 of 12. The cause was confirmed: the sku parameter of the App Service Plan was passed as F1 (free tier), which is not allowed in this subscription by an Azure Policy with deny effect. Resources 1 through 6 were created successfully and are in use. The maintenance window ends in 12 minutes.

The engineer has access to the parameter file and the Git repository. The CI/CD pipeline that triggered the deployment is accessible. The environment does not allow direct manual changes to resources (enforced via deployIfNotExists policy with active auditing). There is no pending change request approval for this deployment.

What is the correct action to take at this moment?

a) Fix the sku parameter value directly in the Azure portal and re-run the deployment through the portal to take advantage of the remaining time in the window.

b) Update the parameter file with the S1 value, commit it, open a change request, and wait for approval before re-executing.

c) Fix the sku parameter in the local parameter file and re-run the deployment via CLI within the window, without changing the repository.

d) Log the incident, end the window without completing the deployment, and schedule a new window after fixing the parameter in the repository and revalidating the template.


Scenario 4: Collateral Impact

An engineer solves a recurring deployment problem by setting "mode": "Complete" on the deployment of an ARM template to a resource group that contains 14 production resources. The template declares only the 9 resources that needed to be updated. The deployment executes successfully and the 9 resources are updated as expected.

What secondary consequence can this action cause?

a) Deployment in Complete mode requires all resources to be in Succeeded state before starting, which may cause failure if any resource is in maintenance state.

b) The 5 resources present in the resource group but absent from the template will be deleted by Azure Resource Manager at the end of the deployment.

c) Azure Resource Manager will create copies of the 5 undeclared resources in the template as orphaned resources in another resource group.

d) Complete mode prevents future deployments in Incremental mode from being executed in the same resource group until the lock is manually removed.


Answer Key and Explanations​

Answer Key: Scenario 1

Answer: b

The InvalidDeploymentParameterValue error indicates that the value provided for the location parameter is invalid. The decisive clue is in the params.dev.json file: the value "East US" contains a space and uppercase letters, while ARM and Bicep expect the canonical region identifier, which is "eastus" (all lowercase, no spaces). This format discrepancy is enough to fail parameter validation before any resource is created.
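The mismatch between the two formats can be illustrated with a minimal Python sketch. The helper names `to_canonical_location` and `is_canonical` are hypothetical, and the conversion rule (lowercase, whitespace removed) is an assumption that fits names such as "East US"; the authoritative list of identifiers comes from `az account list-locations`.

```python
def to_canonical_location(display_name: str) -> str:
    """Turn a display name such as "East US" into an identifier such as "eastus".

    Assumption: the identifier is the display name lowercased with all
    whitespace removed. Check the real list before relying on this rule.
    """
    return "".join(display_name.split()).lower()


def is_canonical(value: str) -> bool:
    """Return True when the value is already lowercase with no whitespace."""
    return value == to_canonical_location(value)


# The value from params.dev.json versus the identifier the scenario expects.
print(is_canonical("East US"))           # False
print(to_canonical_location("East US"))  # eastus
print(is_canonical("eastus"))            # True
```

A check like this could run as a pre-deployment lint step on the parameter file, so the mismatch surfaces before the CLI is invoked.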

The information about Azure Policy with deny effect is irrelevant in this scenario: the failure occurs during the parameter validation phase, before the policy evaluates any resource. If the policy were the cause, the error code would be RequestDisallowedByPolicy, not InvalidDeploymentParameterValue.

The most dangerous distractor is alternative c. In environments with active Azure Policy, it's tempting to attribute any failure to the policy, especially when the statement explicitly mentions a naming policy. The correct reasoning is to always observe the error code and message before assuming the cause.

The Contributor role is sufficient for deployments at subscription scope and CLI version 2.57.0 supports Bicep normally. Both distractors represent the error of focusing on environment details instead of carefully reading the error message.


Answer Key: Scenario 2

Answer: c

The log indicates that creation of the NIC was attempted while the subnet (or VNet) was still provisioning. By default, Azure Resource Manager deploys resources in parallel when no dependencies are declared between them. Without dependsOn, the NIC starts its creation before the VNet and subnet reach the Succeeded state, which produces the concurrent operation conflict on dependent resources.

The clue in the statement is direct: the template does not use the dependsOn property on any resource, and the error message literally says Another operation on this or dependent resource is in progress.
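The effect of missing dependencies can be sketched in Python. This is a simplified model under stated assumptions, not ARM's actual scheduler: the function `deployment_waves` and the short resource names are hypothetical, and each "wave" stands for resources that may start at the same time.

```python
from typing import Dict, List


def deployment_waves(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group resources into waves; resources in one wave can start in parallel.

    depends_on maps each resource name to the names it must wait for.
    """
    remaining = {name: set(deps) for name, deps in depends_on.items()}
    done: set = set()
    waves: List[List[str]] = []
    while remaining:
        # A resource is ready once everything it depends on has finished.
        ready = sorted(name for name, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("circular or unresolved dependency")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves


# Scenario 2 as written: no dependsOn anywhere, so everything starts together.
without_deps = {"vnet": [], "subnet": [], "nsg": [], "nic": []}

# Hypothetical fix: the subnet waits for the VNet, the NIC waits for the subnet.
with_deps = {"vnet": [], "subnet": ["vnet"], "nsg": [], "nic": ["subnet"]}

print(deployment_waves(without_deps))  # [['nic', 'nsg', 'subnet', 'vnet']]
print(deployment_waves(with_deps))     # [['nsg', 'vnet'], ['subnet'], ['nic']]
```

In the first case the NIC sits in the same wave as the subnet it needs, which is the situation the Conflict message describes. In the second, the NIC only starts after the subnet has finished.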

The information about the diagnostic storage account is irrelevant: it exists, is accessible, and does not appear in the reported error chain.

The most dangerous distractor is alternative d. The Conflict message might suggest name collision, but HTTP code 409 (Conflict) in the context of Azure network operations frequently indicates concurrent operation conflict, not resource duplication. Acting on this distractor would lead the engineer to search for and delete a non-existent resource, wasting time.


Answer Key: Scenario 3

Answer: d

The combination of scenario constraints is decisive: the environment does not allow direct manual changes (deployIfNotExists policy with active auditing), there is no approved change request, and the remaining time is insufficient for a complete cycle of correction, commit, approval, and safe re-execution.

The correct action is to end the window in a controlled manner, log the incident, and schedule a new window after correcting and revalidating the template in the official repository.

Alternative a violates the direct manual changes restriction. Alternative c produces a state of divergence between what was executed and what is versioned in the repository, which is especially dangerous in environments with active auditing. Alternative b would be correct in a context without a maintenance window about to end, but opening a change request with 12 minutes remaining and waiting for approval is not viable and may result in an even riskier partial deployment.

The most dangerous distractor is c: fixing locally and re-executing seems efficient, but creates divergence between the real infrastructure state and the repository, breaking the infrastructure as code principle and making future audits difficult.


Answer Key: Scenario 4

Answer: b

Complete mode instructs Azure Resource Manager to reconcile the resource group state with what is declared in the template. Resources present in the resource group but absent from the template are deleted. Since the template covers only 9 of the 14 existing resources, the remaining 5 will be removed at the end of the successful deployment.

This is one of the most destructive and silent collateral impacts of ARM, because the deployment reports success, not error. No alert is issued about the deleted resources during execution.
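The reconciliation in this scenario amounts to a set difference, sketched below in Python. The resource names and the function `resources_deleted_in_complete_mode` are hypothetical placeholders, and the model deliberately ignores any resource-type-specific exceptions. Azure's what-if operation can preview such deletions before a real deployment.

```python
def resources_deleted_in_complete_mode(existing: set, declared: set) -> set:
    """Return the resources in the group that the template does not declare.

    Simplified model: in Complete mode, these are the deletion candidates.
    """
    return existing - declared


# 14 resources currently in the resource group (placeholder names).
existing = {f"resource-{i:02d}" for i in range(1, 15)}

# The template declares only 9 of them.
declared = {f"resource-{i:02d}" for i in range(1, 10)}

deleted = resources_deleted_in_complete_mode(existing, declared)
print(len(deleted))     # 5
print(sorted(deleted))  # ['resource-10', 'resource-11', 'resource-12', 'resource-13', 'resource-14']
```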

The other distractors describe behaviors that do not exist in ARM: Complete mode does not check the state of existing resources before starting, does not create orphaned resources in other resource groups, and does not prevent future incremental deployments.

The most dangerous distractor is a, as it describes a behavior that seems plausible for a mode that "completes" the state. In practice, Complete mode does not check pre-conditions of existing resources.


Troubleshooting Tree: Deploy with ARM Templates and Bicep​


Color legend:

  • Dark blue: initial symptom (entry point)
  • Blue: diagnostic question (decision node)
  • Orange: intermediate verification or state to validate
  • Red: identified cause
  • Green: recommended action or resolution

When facing a real problem, start from the root node and answer each question based on what is directly observable: the moment the error occurs, the returned code, and the behavior of resources during and after deployment. Each branch eliminates an entire class of causes. Never jump to a cause without going through the intermediate questions, because superficially similar symptoms, such as a Conflict status and a policy denial, have completely different origins and fixes.
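The first branch of that reasoning, reading the returned code before guessing a cause, can be sketched in Python. The mapping `CAUSE_BY_CODE` and the function `classify` are hypothetical and cover only the codes that appear in this lab; the descriptions paraphrase the answer key and are not an official catalog of Azure error codes.

```python
import json

# Hypothetical mapping, limited to the codes discussed in the scenarios above.
CAUSE_BY_CODE = {
    "InvalidDeploymentParameterValue": "parameter value rejected during validation",
    "RequestDisallowedByPolicy": "request denied by Azure Policy",
    "Conflict": "often a concurrent operation on the same or a dependent resource",
}


def classify(error_json: str) -> str:
    """Map the 'code' field of an error payload to a class of causes."""
    code = json.loads(error_json).get("code", "")
    return CAUSE_BY_CODE.get(code, "unknown: read the full message")


# Error payload from Scenario 1; the elided value '...' is kept as shown.
scenario_1_error = (
    '{"code":"InvalidDeploymentParameterValue",'
    '"message":"The value \'...\' provided for the deployment parameter '
    '\'location\' is not valid."}'
)

print(classify(scenario_1_error))  # parameter value rejected during validation
```

A lookup like this only narrows the class of causes. The message text and the timing of the failure still need to be checked before acting.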