
Theoretical Foundation: Deploy and Configure Azure Virtual Machine Scale Sets


1. Initial Intuition​

Imagine you manage a call center. During weekday business hours, you need 50 agents. On weekends, 10 are sufficient. During seasonal promotions, you might need 200. Hiring 200 full-time employees to cover the peak would be absurdly expensive. The real solution is to have a pool of temporary workers you call as demand requires.

In Azure, a Virtual Machine Scale Set (VMSS) is exactly that pool: a group of identical VMs that can grow or shrink automatically based on demand, creating and deleting instances as needed, without manual intervention.

While creating individual VMs and configuring them manually works for small and static environments, VMSS is the solution for workloads that vary over time and need auto scaling, centralized configuration management, and distributed high availability.


2. Context​

Why VMSS exists as a separate concept from individual VMs​

Managing 50 individual VMs means: 50 manual image updates, 50 reboots to apply patches, 50 Load Balancer rules to configure, and the entire process repeated each time demand changes.

VMSS solves this with a single object that manages all instances as a unit. You define the model (image, size, network configuration) once, and VMSS ensures all instances comply with that model.

What depends on VMSS​

  • Load Balancer or Application Gateway: to distribute traffic across instances
  • Auto scaling rules: to define when to create or destroy instances
  • Azure Monitor: as a source of metrics for scaling decisions
  • Managed Disks: each instance has its own disks
  • VNet and subnets: where instances are provisioned

3. Concept Construction​

3.1 Orchestration Modes​

VMSS has two orchestration modes with different philosophies: Uniform, where every instance is identical and created from a single model, and Flexible, which allows heterogeneous instances and behaves more like a modern Availability Set.

For AZ-104 and most practical auto scaling scenarios, the focus is on Uniform Orchestration.

3.2 Main components of a Uniform VMSS​

Instance Model: defines how each VM will be created. Includes size, OS image, network configuration, extensions, and initialization scripts. All instances are identical to this model.

Capacity: the current number of instances. Can be fixed (manual scaling) or variable (auto scaling).

Upgrade Policy: defines how new model versions are applied to existing instances.

Scaling Rules: metric-based conditions that determine when to create or destroy instances.

3.3 Upgrade Policies​

When you update the VMSS model (new OS image, new configuration), the Upgrade Policy defines how this update propagates:

| Policy | Behavior | Use case |
| --- | --- | --- |
| Automatic | Azure updates instances as soon as the model changes, in no guaranteed order; several or all instances may be down at once | Dev/test environments, workloads tolerant of restarts |
| Rolling | Updates a percentage of instances at a time, with health checks between batches | Production where you can't bring everything down at once |
| Manual | Instances are not updated automatically; you update them instance by instance | Maximum control, special scenarios |

Rolling Upgrade is the recommended mode for production because it combines automation with safety: instances are updated one batch at a time, and each batch is health-checked before the next one starts.
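The batch-by-batch behavior can be sketched in a few lines of Python. This is a simplified illustration, not the Azure implementation: the function names, the `upgrade` and `is_healthy` callbacks, and the rule of stopping at the first unhealthy batch are assumptions made for the example.

```python
import math
from typing import Callable, List


def plan_batches(instance_ids: List[str], max_batch_percent: int) -> List[List[str]]:
    """Split instances into batches no larger than max_batch_percent of the set."""
    if not 1 <= max_batch_percent <= 100:
        raise ValueError("max_batch_percent must be between 1 and 100")
    # Always move at least one instance per batch so small sets still progress.
    batch_size = max(1, math.floor(len(instance_ids) * max_batch_percent / 100))
    return [instance_ids[i:i + batch_size]
            for i in range(0, len(instance_ids), batch_size)]


def rolling_upgrade(instance_ids: List[str],
                    max_batch_percent: int,
                    upgrade: Callable[[str], None],
                    is_healthy: Callable[[str], bool]) -> bool:
    """Upgrade batch by batch; stop at the first batch that ends up unhealthy."""
    for batch in plan_batches(instance_ids, max_batch_percent):
        for instance_id in batch:
            upgrade(instance_id)
        if not all(is_healthy(instance_id) for instance_id in batch):
            return False  # remaining instances keep the old model
    return True
```

With 10 instances and a 20% batch size, the plan is five batches of two, so at most two instances are being replaced at any moment.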

3.4 Auto Scaling: how it works​

Auto scaling in VMSS has two components: scale-out (add instances) and scale-in (remove instances), each with its own independent rules.

A scaling rule defines:

  • Metric: what to measure (CPU, memory, queue length, etc.)
  • Threshold: the value that triggers the action
  • Operator: greater than, less than, equal to
  • Action: add or remove N instances (or N%)
  • Cooldown: waiting period after an action before evaluating again
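The fields above map naturally onto a small data structure. The following Python sketch shows how one rule could be evaluated, including the cooldown and the min/max clamp. The class name, field names, and the `evaluate` function are hypothetical and only model the idea; they are not part of any Azure SDK.

```python
import operator
from dataclasses import dataclass
from typing import Optional

OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


@dataclass
class ScaleRule:
    metric: str        # e.g. "Percentage CPU"
    op: str            # one of ">", "<", "=="
    threshold: float   # value that triggers the action
    change: int        # +N to scale out, -N to scale in
    cooldown_s: int    # seconds to wait after an action


def evaluate(rule: ScaleRule, metric_value: float, current: int,
             minimum: int, maximum: int,
             now_s: float, last_action_s: Optional[float]) -> int:
    """Return the new instance count, or the current one if the rule does not fire."""
    if last_action_s is not None and now_s - last_action_s < rule.cooldown_s:
        return current  # still inside the cooldown window
    if not OPERATORS[rule.op](metric_value, rule.threshold):
        return current  # condition not met
    # Never go below the minimum or above the maximum capacity.
    return max(minimum, min(maximum, current + rule.change))
```

For example, a rule `ScaleRule("Percentage CPU", ">", 75, +2, 300)` takes a set of 3 instances to 5 when CPU is at 90%, and then ignores further readings for the next 300 seconds.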

3.5 Scale-in Policy​

When VMSS needs to remove instances, the Scale-in Policy defines which instances are chosen for removal:

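| Policy | Behavior |
| --- | --- |
| Default | Balances instances across zones (and fault domains), then removes the instance with the highest instance ID |
| OldestVM | Balances instances across zones, then removes the oldest instance |
| NewestVM | Balances instances across zones, then removes the newest instance |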

OldestVM is useful to force gradual updates: old instances (with outdated models) are removed first when scale-in occurs.
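As a rough illustration of how an age-based policy picks its victim, here is a Python sketch covering OldestVM and NewestVM. The `Instance` type, the `pick_for_removal` function, and the optional `balance_zones` step are assumptions made for this example, not Azure's actual selection code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Instance:
    instance_id: int
    zone: str
    created_at: float  # creation time as a Unix timestamp


def pick_for_removal(instances: List[Instance], policy: str,
                     balance_zones: bool = True) -> Instance:
    """Choose one instance to remove under the OldestVM or NewestVM policy."""
    if not instances:
        raise ValueError("no instances to remove")
    candidates = instances
    if balance_zones:
        # Restrict the choice to the most populated zone(s) to keep zones even.
        per_zone = Counter(i.zone for i in instances)
        fullest = max(per_zone.values())
        candidates = [i for i in instances if per_zone[i.zone] == fullest]
    if policy == "OldestVM":
        return min(candidates, key=lambda i: i.created_at)
    if policy == "NewestVM":
        return max(candidates, key=lambda i: i.created_at)
    raise ValueError(f"unsupported policy: {policy}")
```

Passing `balance_zones=False` models a purely age-based choice, which makes it easy to compare both interpretations on the same fleet.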

3.6 Instance Protection​

You can mark individual instances for protection against scale-in or against model updates:

  • Protect from scale-in: instance is never automatically removed
  • Protect from scale-set actions: instance doesn't receive model updates or other scale-set operations such as reimage; this setting also protects it from scale-in

Useful for instances running long-duration processes that shouldn't be interrupted.

3.7 Overprovisioning​

When VMSS creates new instances (scale-out), it can create more than requested to ensure the target number is reached even if some fail during initialization. Azure deletes the extra instances after confirming the target number was reached healthily.

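By default, overprovisioning creates roughly 20% more instances than requested. This speeds up scaling, and according to Microsoft's documentation the extra instances are not billed and do not count toward quota. The trade-off is that the application briefly sees instances that appear and are then deleted within minutes.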
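The arithmetic is simple enough to sketch. The snippet below uses the approximate 20% figure quoted above; the rounding rule (round the extra up) and the function names are assumptions for illustration only.

```python
import math


def overprovisioned_count(requested: int, extra_percent: int = 20) -> int:
    """Instances to start so that `requested` healthy ones are likely to remain."""
    return requested + math.ceil(requested * extra_percent / 100)


def wasted_init_minutes(requested: int, init_minutes: float,
                        extra_percent: int = 20) -> float:
    """Initialization time spent on extra instances that are deleted afterwards."""
    extra = overprovisioned_count(requested, extra_percent) - requested
    return extra * init_minutes
```

Under these assumptions, a request for 10 instances starts 12, and a request for 50 instances with a 10-minute startup spends 100 instance-minutes initializing machines that are then discarded.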


4. Structural View​

Complete VMSS architecture​

[Diagram: complete VMSS architecture]

VMSS instance lifecycle​

[Diagram: VMSS instance lifecycle]

5. Practical Operation​

Health Probes and Application Health Extension​

For Rolling Upgrades and intelligent auto scaling to work, VMSS needs to know if each instance is healthy. There are two mechanisms:

Load Balancer Health Probe: the LB periodically checks if the instance is responding on the configured port. Simple but only checks network connectivity.

Application Health Extension: an extension installed on the VM that checks application health via HTTP/HTTPS and reports to VMSS. Checks actual application health, not just connectivity.

# The extension reports: {"ApplicationHealthState": "Healthy"} or {"ApplicationHealthState": "Unhealthy"}
# Configured endpoint: http://localhost:8080/health
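For the extension to have something to probe, the application needs to expose a health endpoint. Below is a minimal sketch using only the Python standard library, assuming the `/health` path and port 8080 from the example above. `app_is_healthy` is a hypothetical placeholder for real dependency checks, and the mapping of HTTP 200 to "healthy" and anything else to "unhealthy" is stated here as an assumption about how the probe interprets responses.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def app_is_healthy() -> bool:
    """Hypothetical check; replace with real dependency checks (DB, queue, cache)."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        healthy = app_is_healthy()
        body = json.dumps({"status": "ok" if healthy else "failing"}).encode()
        # Assumption: the probe treats 200 as healthy and other codes as unhealthy.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep probe traffic out of the log


def make_server(port: int = 8080) -> HTTPServer:
    """Bind to localhost only, since the extension probes from inside the VM."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)
```

Calling `make_server().serve_forever()` starts the endpoint. In a real service this handler would usually live inside the application's own web framework rather than in a separate process, so that it reflects the state of the application itself.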

Non-obvious behaviors​

Uniform VMSS doesn't support changing the OS disk size of an individual instance. You change the model, and instances are recreated with the new size on the next update. It's not possible to change just one instance.

Instances have numeric IDs, not conventional names. Instances in a VMSS named vmss-web will be vmss-web_0, vmss-web_1, vmss-web_2. IDs are not sequential after deletions: if vmss-web_1 is deleted and then a new instance is created, it might receive ID vmss-web_3, not vmss-web_1.

Cooldown prevents oscillation but can delay response to sudden spikes. A 5-minute cooldown means that after a scale-out, VMSS won't scale again for 5 minutes, even if load continues growing. For sudden spikes, configure shorter cooldowns with more aggressive rules.

Scale-in of stateful instances can cause data loss. In stateful workloads (user sessions, ongoing processes), removing an instance in the middle of an operation can lose work in progress. For stateful workloads, drain connections before the instance is removed, or enable terminate notifications so the instance is warned before deletion and has time to finish its work.

Overprovisioning can be problematic with slow initializations. If the application takes 10 minutes to initialize and overprovisioning creates 10 extra instances that will be deleted, this wastes 10 × 10 minutes of initialization. For applications with slow initialization, consider disabling overprovisioning.


6. Implementation Methods​

Azure Portal​

When to use: initial creation, exploring options, configuring autoscaling via visual interface

Create VMSS via portal:

  1. Portal > Virtual machine scale sets > + Create
  2. Define: subscription, RG, name, region, availability zones
  3. Orchestration mode: Uniform or Flexible
  4. Select image and size
  5. Configure admin credentials
  6. Disks tab: OS disk type
  7. Networking tab: VNet, subnet, LB
  8. Scaling tab: initial capacity, min/max, upgrade policy
  9. Health tab: enable health monitoring
  10. Review + Create

To configure Auto Scaling via portal after creation:

  • VMSS > Scaling > Custom autoscale > add scale-out and scale-in rules

Azure CLI​

# Create a basic VMSS (autoscale is configured separately below)
az vmss create \
--resource-group "rg-producao" \
--name "vmss-web" \
--image "Ubuntu2204" \
--vm-sku "Standard_D2s_v5" \
--instance-count 3 \
--admin-username "azureadmin" \
--ssh-key-values ~/.ssh/id_rsa.pub \
--zones 1 2 3 \
--upgrade-policy-mode Rolling \
--load-balancer "lb-web" \
--backend-pool-name "bepool-web" \
--location "brazilsouth"

# Check created instances
az vmss list-instances \
--resource-group "rg-producao" \
--name "vmss-web" \
--output table

# Create the autoscale setting (capacity limits); the CPU rules are added below
az monitor autoscale create \
--resource-group "rg-producao" \
--resource "vmss-web" \
--resource-type "Microsoft.Compute/virtualMachineScaleSets" \
--name "autoscale-vmss-web" \
--min-count 2 \
--max-count 10 \
--count 3

# Add scale-out rule
az monitor autoscale rule create \
--resource-group "rg-producao" \
--autoscale-name "autoscale-vmss-web" \
--scale out 2 \
--condition "Percentage CPU > 75 avg 5m"

# Add scale-in rule
az monitor autoscale rule create \
--resource-group "rg-producao" \
--autoscale-name "autoscale-vmss-web" \
--scale in 1 \
--condition "Percentage CPU < 25 avg 10m"

# Manually scale to 5 instances
az vmss scale \
--resource-group "rg-producao" \
--name "vmss-web" \
--new-capacity 5

# Update the model (ex: new image)
az vmss update \
--resource-group "rg-producao" \
--name "vmss-web" \
--set virtualMachineProfile.storageProfile.imageReference.version="latest"

# With Manual upgrade policy: update specific instances
az vmss update-instances \
--resource-group "rg-producao" \
--name "vmss-web" \
--instance-ids 1 2

# With Manual upgrade policy: update all instances
az vmss update-instances \
--resource-group "rg-producao" \
--name "vmss-web" \
--instance-ids "*"

# Reimage an instance (recreate from scratch from the model)
az vmss reimage \
--resource-group "rg-producao" \
--name "vmss-web" \
--instance-id 3

# Reimage all instances
az vmss reimage \
--resource-group "rg-producao" \
--name "vmss-web"

# Configure scale-in policy to remove oldest VMs first
az vmss update \
--resource-group "rg-producao" \
--name "vmss-web" \
--scale-in-policy OldestVM

# Apply scale-in protection to a specific instance
az vmss update \
--resource-group "rg-producao" \
--name "vmss-web" \
--instance-id 2 \
--protect-from-scale-in true

# View instance health status
az vmss get-instance-view \
--resource-group "rg-producao" \
--name "vmss-web" \
--instance-id 0 \
--query "vmHealth"

# View autoscale configuration
az monitor autoscale show \
--resource-group "rg-producao" \
--name "autoscale-vmss-web" \
--output json

# Configure schedule scaling (ex: scale at 8am and reduce at 6pm)
az monitor autoscale profile create \
--autoscale-name "autoscale-vmss-web" \
--resource-group "rg-producao" \
--name "horario-comercial" \
--min-count 5 \
--max-count 10 \
--count 5 \
--start "2026-01-01 08:00" \
--end "2027-12-31 18:00" \
--recurrence week mo tu we th fr \
--timezone "E. South America Standard Time"

Azure PowerShell​

# Create VMSS configuration
$vmssConfig = New-AzVmssConfig `
-Location "brazilsouth" `
-SkuCapacity 3 `
-SkuName "Standard_D2s_v5" `
-UpgradePolicyMode Rolling `
-Zone @("1", "2", "3")

# Configure image
$vmssConfig = Set-AzVmssStorageProfile `
-VirtualMachineScaleSet $vmssConfig `
-ImageReferencePublisher "Canonical" `
-ImageReferenceOffer "0001-com-ubuntu-server-jammy" `
-ImageReferenceSku "22_04-lts-gen2" `
-ImageReferenceVersion "latest" `
-OsDiskCreateOption FromImage `
-ManagedDisk @{storageAccountType = "Premium_LRS"}

# Configure OS
$vmssConfig = Set-AzVmssOsProfile `
-VirtualMachineScaleSet $vmssConfig `
-ComputerNamePrefix "web" `
-AdminUsername "azureadmin" `
-AdminPassword "<secure-password>"

# Configure network
$ipConfig = New-AzVmssIpConfig `
-Name "ipconfig" `
-SubnetId $subnetId `
-LoadBalancerBackendAddressPoolsId $lbBackendPoolId

$vmssConfig = Add-AzVmssNetworkInterfaceConfiguration `
-VirtualMachineScaleSet $vmssConfig `
-Name "nicconfig" `
-Primary $true `
-IpConfiguration $ipConfig

# Create VMSS
$vmss = New-AzVmss `
-ResourceGroupName "rg-producao" `
-VMScaleSetName "vmss-web" `
-VirtualMachineScaleSet $vmssConfig

# Configure Autoscale
$scaleOutRule = New-AzAutoscaleRule `
-MetricName "Percentage CPU" `
-MetricResourceId $vmss.Id `
-TimeGrain ([TimeSpan]::FromMinutes(1)) `
-Statistic Average `
-TimeWindow ([TimeSpan]::FromMinutes(5)) `
-TimeAggregationOperator Average `
-Operator GreaterThan `
-Threshold 75 `
-ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
-ScaleActionDirection Increase `
-ScaleActionValue 2 `
-ScaleActionType ChangeCount

$scaleInRule = New-AzAutoscaleRule `
-MetricName "Percentage CPU" `
-MetricResourceId $vmss.Id `
-TimeGrain ([TimeSpan]::FromMinutes(1)) `
-Statistic Average `
-TimeWindow ([TimeSpan]::FromMinutes(10)) `
-TimeAggregationOperator Average `
-Operator LessThan `
-Threshold 25 `
-ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
-ScaleActionDirection Decrease `
-ScaleActionValue 1 `
-ScaleActionType ChangeCount

# Named $autoscaleProfile because $profile would shadow PowerShell's built-in $PROFILE variable
$autoscaleProfile = New-AzAutoscaleProfile `
-DefaultCapacity 3 `
-MaximumCapacity 10 `
-MinimumCapacity 2 `
-Rule @($scaleOutRule, $scaleInRule) `
-Name "default-profile"

Add-AzAutoscaleSetting `
-ResourceGroupName "rg-producao" `
-Location "brazilsouth" `
-Name "autoscale-vmss-web" `
-TargetResourceId $vmss.Id `
-AutoscaleProfile $autoscaleProfile

Bicep​

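// VMSS with autoscale
// Parameters referenced by the resources below
param adminUsername string
@secure()
param adminPassword string
param sshPublicKey string
param subnetId string
param lbBackendPoolId string
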
resource vmss 'Microsoft.Compute/virtualMachineScaleSets@2023-03-01' = {
name: 'vmss-web'
location: 'brazilsouth'
zones: ['1', '2', '3']
sku: {
name: 'Standard_D2s_v5'
capacity: 3
}
properties: {
upgradePolicy: {
mode: 'Rolling'
rollingUpgradePolicy: {
maxBatchInstancePercent: 20
maxUnhealthyInstancePercent: 20
maxUnhealthyUpgradedInstancePercent: 20
pauseTimeBetweenBatches: 'PT30S'
}
}
automaticRepairsPolicy: {
enabled: true
gracePeriod: 'PT10M'
}
virtualMachineProfile: {
osProfile: {
computerNamePrefix: 'web'
adminUsername: adminUsername
adminPassword: adminPassword
linuxConfiguration: {
disablePasswordAuthentication: true
ssh: {
publicKeys: [
{
path: '/home/${adminUsername}/.ssh/authorized_keys'
keyData: sshPublicKey
}
]
}
}
}
storageProfile: {
imageReference: {
publisher: 'Canonical'
offer: '0001-com-ubuntu-server-jammy'
sku: '22_04-lts-gen2'
version: 'latest'
}
osDisk: {
createOption: 'FromImage'
managedDisk: {
storageAccountType: 'Premium_LRS'
}
}
}
networkProfile: {
networkInterfaceConfigurations: [
{
name: 'nicconfig'
properties: {
primary: true
ipConfigurations: [
{
name: 'ipconfig'
properties: {
subnet: {
id: subnetId
}
loadBalancerBackendAddressPools: [
{
id: lbBackendPoolId
}
]
}
}
]
}
}
]
}
extensionProfile: {
extensions: [
{
name: 'ApplicationHealthExtension'
properties: {
publisher: 'Microsoft.ManagedServices'
type: 'ApplicationHealthLinux'
typeHandlerVersion: '1.0'
settings: {
protocol: 'http'
port: 8080
requestPath: '/health'
}
}
}
]
}
}
scaleInPolicy: {
rules: ['OldestVM']
}
overprovision: true
}
}

// Auto Scale settings
resource autoscale 'Microsoft.Insights/autoscaleSettings@2022-10-01' = {
name: 'autoscale-vmss-web'
location: 'brazilsouth'
properties: {
enabled: true
targetResourceUri: vmss.id
profiles: [
{
name: 'default'
capacity: {
default: '3'
minimum: '2'
maximum: '10'
}
rules: [
{
metricTrigger: {
metricName: 'Percentage CPU'
metricResourceUri: vmss.id
timeGrain: 'PT1M'
statistic: 'Average'
timeWindow: 'PT5M'
timeAggregation: 'Average'
operator: 'GreaterThan'
threshold: 75
}
scaleAction: {
direction: 'Increase'
type: 'ChangeCount'
value: '2'
cooldown: 'PT5M'
}
}
{
metricTrigger: {
metricName: 'Percentage CPU'
metricResourceUri: vmss.id
timeGrain: 'PT1M'
statistic: 'Average'
timeWindow: 'PT10M'
timeAggregation: 'Average'
operator: 'LessThan'
threshold: 25
}
scaleAction: {
direction: 'Decrease'
type: 'ChangeCount'
value: '1'
cooldown: 'PT5M'
}
}
]
}
]
}
}

7. Control and Security​

Managed Identity for VMSS​

# Assign System-Assigned Managed Identity to VMSS
az vmss identity assign \
--resource-group "rg-producao" \
--name "vmss-web" \
--identities [system]

# Grant permission to VMSS to access Key Vault
VMSS_IDENTITY=$(az vmss show \
--resource-group "rg-producao" \
--name "vmss-web" \
--query "identity.principalId" --output tsv)

az keyvault set-policy \
--name "kv-producao" \
--object-id "$VMSS_IDENTITY" \
--secret-permissions get list

VMSS with Azure Policy​

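# List VMSS without an explicit scale-in policy (Azure Resource Graph query).
# Autoscale settings are a separate resource (Microsoft.Insights/autoscaleSettings) and are not covered by this query.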
az graph query -q "
Resources
| where type == 'microsoft.compute/virtualmachinescalesets'
| where isnull(properties.scaleInPolicy)
| project name, resourceGroup, location"

# Check instances with outdated model (need update)
az vmss list-instances \
--resource-group "rg-producao" \
--name "vmss-web" \
--query '[?latestModelApplied == `false`].{ID: instanceId, Name: osProfile.computerName}' \
--output table

Automatic Repairs​

The VMSS can be configured to automatically repair unhealthy instances: if an instance reports unhealthy status for a configured period (grace period), the VMSS deletes and recreates that instance automatically.

# Enable automatic repairs
az vmss update \
--resource-group "rg-producao" \
--name "vmss-web" \
--enable-automatic-repairs true \
--automatic-repairs-grace-period "PT10M"

8. Decision Making​

VMSS vs. individual VMs​

| Situation | Choice | Reason |
| --- | --- | --- |
| Web app with variable traffic | VMSS | Automatic scaling adjusts capacity |
| Primary stateful database | Individual VMs | VMSS is not ideal for state; use an Availability Set or Availability Zones |
| Batch processing | VMSS with queue-based scaling | Scales out to process the queue, scales back to zero afterwards |
| Dev environments with 2-3 VMs | Individual VMs | VMSS overhead unnecessary for small volumes |
| High availability API | VMSS with zones | Automatic distribution + scaling |
| Application needing heterogeneous VMs | VMSS Flexible | Allows different configurations per instance |

Upgrade Policy by scenario​

| Scenario | Policy | Reason |
| --- | --- | --- |
| Production API that can't have total downtime | Rolling | Updates in batches with health checks |
| Dev/test environment | Automatic | Convenience, tolerant of interruptions |
| Critical application with controlled deploy | Manual | Maximum control over when and which instances |
| Black Friday (freeze changes during peak) | Manual | No automatic updates during peak |

Auto Scale configuration​

| Workload | Scale-out threshold | Scale-in threshold | Cooldown |
| --- | --- | --- | --- |
| REST API (CPU bound) | CPU > 75% for 5 min | CPU < 25% for 10 min | 5 min |
| Queue worker (queue depth) | Queue > 100 messages | Queue < 10 messages | 3 min |
| App with predictable spikes | Schedule scaling for specific times | Schedule | N/A |
| ML inference (GPU) | GPU util > 80% | GPU util < 20% | 10 min |

9. Best Practices​

Always configure min and max capacity. Without sensible limits, a VMSS can scale far beyond what you intended and generate unexpected costs. Define minCount as the minimum number of instances needed for HA, and maxCount as the largest capacity you are willing to pay for.

Use at least 2 rules: one for scale-out and one for scale-in. A VMSS with only a scale-out rule grows to its maximum and never shrinks. One with only a scale-in rule shrinks to its minimum and never recovers under load. Always configure both.

Configure scale-in conservatively (lower threshold, longer window). Aggressive scale-in can cause flapping: the system scales out, then in, then out again in quick cycles. Use a low threshold (CPU < 25%) and a longer time window (10+ minutes) for scale-in.
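A quick way to reason about flapping is to project what the average CPU would be after a scale-in, assuming the total load stays the same and is spread evenly over the remaining instances. The sketch below encodes that back-of-the-envelope check; the function names and the even-distribution assumption are illustrative only.

```python
def projected_cpu_after_scale_in(avg_cpu: float, instances: int,
                                 removed: int = 1) -> float:
    """Average CPU if the same total load is spread over fewer instances."""
    remaining = instances - removed
    if remaining <= 0:
        raise ValueError("cannot remove every instance")
    return avg_cpu * instances / remaining


def scale_in_would_flap(avg_cpu: float, instances: int,
                        scale_out_threshold: float, removed: int = 1) -> bool:
    """True if removing instances would push CPU above the scale-out threshold."""
    projected = projected_cpu_after_scale_in(avg_cpu, instances, removed)
    return projected > scale_out_threshold
```

With 3 instances at 24% CPU, removing one projects to 36%, comfortably below a 75% scale-out threshold. With 4 instances at 60%, removing one projects to 80%, which would immediately trigger a scale-out again.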

Use Application Health Extension instead of just LB Health Probe. The extension checks if the application is actually working, not just if the VM is responding on the port. Rolling upgrades and automatic repairs are much more reliable with application health.

For workloads with session state, use an external session store or sticky sessions. If users have active sessions and the VMSS scales in, a session may live on an instance that is about to be removed. An external store such as Azure Cache for Redis keeps sessions available after the instance is gone; sticky sessions on the LB only keep a user on the same instance while that instance exists.

Test scaling behavior before production. Run load tests that trigger controlled scale-out and scale-in. Verify that health checks work, that the LB distributes traffic correctly, and that there are no errors during transition.

Use cloud-init or the Custom Script Extension for initial configuration. Don't embed sensitive settings in the VMSS model. Use cloud-init (Linux) or the Custom Script Extension to configure applications on first boot, pulling secrets from Key Vault.


10. Common Errors​

| Error | Why it happens | How to avoid |
| --- | --- | --- |
| VMSS oscillating (flapping) between scale-out and scale-in | Cooldown too short or thresholds too close | Increase cooldown and separate thresholds |
| Instances with outdated model in production | Manual upgrade policy and the update was never applied | Use Rolling for production with health checks |
| Scale-out doesn't happen even with high CPU | Autoscale created but not linked to VMSS correctly | Check targetResourceUri in autoscale setting |
| Instances removed with active sessions | Scale-in without connection draining | Use external session store or sticky sessions |
| Overprovisioning with slow initialization wasting startup work | App with 20+ minute initialization with overprovisioning enabled | Disable overprovision for apps with slow init |
| maxCount too low preventing spike response | maxCount defined arbitrarily | Calculate maxCount based on capacity tests |
| VMSS without health extension with rolling upgrade | Rolling upgrades without verified health can bring down all instances | Always enable Application Health Extension with rolling upgrades |
| Scaling by CPU without considering other bottlenecks | Application has database bottleneck, not CPU | Identify the real bottleneck; consider custom metrics |

The most critical error​

Configuring a scale-out rule without the corresponding scale-in rule, or configuring scale-in with too high a threshold (e.g., a rule that removes instances when CPU < 80%). In the first case, the VMSS will grow to maxCount and stay there forever. In the second, it will reduce instances before the load actually decreases, causing immediate performance degradation.
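Both mistakes are easy to catch mechanically before deployment. The following Python sketch is a hypothetical lint for CPU-based rule sets; the `Rule` type and `lint_rules` function are invented for this example and do not correspond to any Azure tool.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    direction: str    # "out" or "in"
    threshold: float  # CPU percentage that triggers the rule


def lint_rules(rules: List[Rule]) -> List[str]:
    """Return human-readable problems found in a set of CPU-based rules."""
    problems = []
    outs = [r.threshold for r in rules if r.direction == "out"]
    ins = [r.threshold for r in rules if r.direction == "in"]
    if outs and not ins:
        problems.append("scale-out without scale-in: capacity will stay at maximum")
    if ins and not outs:
        problems.append("scale-in without scale-out: capacity will stay at minimum")
    # A scale-in threshold at or above the scale-out threshold removes
    # instances while the set is still under load.
    if outs and ins and max(ins) >= min(outs):
        problems.append("scale-in threshold overlaps scale-out threshold: expect flapping")
    return problems
```

A rule set of scale-out at 75% and scale-in at 25% passes cleanly, while scale-out at 75% combined with scale-in at 80% is flagged.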


11. Operation and Maintenance​

Monitor VMSS state​

# Status of all instances
az vmss list-instances \
--resource-group "rg-producao" \
--name "vmss-web" \
--query "[].{
ID: instanceId,
Name: osProfile.computerName,
ModelUpdated: latestModelApplied,
State: provisioningState
}" \
--output table

# View scaling history
az monitor activity-log list \
--resource-group "rg-producao" \
--query "[?operationName.value=='Microsoft.Compute/virtualMachineScaleSets/write'].{
Time: eventTimestamp,
Operation: operationName.value,
Status: status.value
}" \
--output table

# View current autoscale configuration
az monitor autoscale show \
--resource-group "rg-producao" \
--name "autoscale-vmss-web" \
--query "{Min: profiles[0].capacity.minimum, Max: profiles[0].capacity.maximum, Default: profiles[0].capacity.default}" \
--output json

# Check for instances whose provisioning did not succeed (application health is reported separately in the instance view)
az vmss list-instances \
--resource-group "rg-producao" \
--name "vmss-web" \
--query "[?provisioningState != 'Succeeded'].{ID: instanceId, State: provisioningState}" \
--output table

Important limits​

| Limit | Value |
| --- | --- |
| Instances per VMSS (Uniform) | 1,000 with platform or Compute Gallery images; 600 with a custom managed image |
| Instances per VMSS (Flexible) | 1,000 |
| Autoscale rules per profile | 10 |
| Autoscale profiles per VMSS | 20 |
| Fault Domains (with zones) | 1-5 (configurable) |
| Instances that can be updated simultaneously (Rolling) | Configurable, default 20% |

12. Integration and Automation​

VMSS with Azure Service Bus for event-driven scaling​

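# Configure autoscale with a metric from another resource (Service Bus queue depth)
# SERVICEBUS_NAMESPACE_ID is an assumed variable holding the namespace resource ID:
# the ActiveMessages metric is emitted by Service Bus, not by the VMSS
az monitor autoscale rule create \
--resource-group "rg-producao" \
--autoscale-name "autoscale-vmss-workers" \
--resource "$SERVICEBUS_NAMESPACE_ID" \
--condition "ActiveMessages > 100 avg 5m where EntityName == fila-trabalho" \
--scale out 3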

Deploy pipeline with zero-downtime via Rolling Upgrade​

# Azure DevOps Pipeline: update VMSS image with zero-downtime
steps:
  - task: AzureCLI@2
    displayName: 'Update VMSS Image'
    inputs:
      azureSubscription: 'prod-subscription'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Update image reference in model
        az vmss update \
          --resource-group rg-producao \
          --name vmss-web \
          --set virtualMachineProfile.storageProfile.imageReference.version=$(IMAGE_VERSION)

        # With the Rolling policy the model change usually starts the upgrade on its own;
        # this explicit start only matters if no upgrade is already in progress
        az vmss rolling-upgrade start \
          --resource-group rg-producao \
          --name vmss-web

        # Monitor progress until the upgrade completes or fails
        while true; do
          STATUS=$(az vmss rolling-upgrade get-latest \
            --resource-group rg-producao \
            --name vmss-web \
            --query "runningStatus.code" -o tsv)
          echo "Status: $STATUS"
          # Expected status codes: RollingForward, Completed, Cancelled, Faulted
          if [ "$STATUS" == "Completed" ]; then
            echo "Rolling upgrade completed successfully"
            break
          elif [ "$STATUS" == "Cancelled" ] || [ "$STATUS" == "Faulted" ]; then
            echo "Rolling upgrade failed: $STATUS"
            exit 1
          fi
          sleep 30
        done

13. Final Summary​

Essential points:

  • VMSS is a group of identical VMs managed as a unit, with automatic scaling based on metrics
  • Uniform mode is for identical instances with automatic scaling; Flexible mode allows heterogeneous instances
  • Upgrade Policy defines how model changes propagate: Automatic (immediate), Rolling (in batches with health checks), Manual (you control)
  • Auto Scaling requires at least one scale-out AND one scale-in rule; cooldown prevents flapping
  • Scale-in Policy defines which instance is removed: Default (zone balance, then highest instance ID), OldestVM, NewestVM
  • Application Health Extension is necessary for Rolling Upgrades and Automatic Repairs to work correctly
  • Overprovisioning creates extra instances temporarily to ensure the target number is reached

Critical differences:

  • VMSS vs. individual VMs: VMSS is for elastic workloads that vary over time; individual VMs are for static or stateful workloads
  • Uniform vs. Flexible: Uniform = identical instances with native rolling upgrade; Flexible = heterogeneous instances, closer to a modern Availability Set
  • Scale-out cooldown vs. Scale-in cooldown: are configured separately; scale-in generally needs a longer cooldown to avoid flapping
  • Update vs. Reimage: Update applies model changes to the existing instance; Reimage recreates the instance from scratch based on the model

What needs to be remembered for AZ-104:

  • Command to scale manually: az vmss scale --new-capacity <N>
  • Command to view instances: az vmss list-instances
  • Command to update instances (Manual policy): az vmss update-instances --instance-ids "*"
  • Autoscale is a separate resource from VMSS: Microsoft.Insights/autoscaleSettings
  • VMSS with zones requires that managed disks are also zone-aware
  • Limit of 1,000 instances per VMSS with Managed Disks
  • Rolling Upgrade without Application Health Extension can result in updating unhealthy instances, potentially bringing down the entire tier
  • VMSS instances have names in the format {vmssname}_{id} where IDs are not necessarily sequential