Theoretical Foundation: Deploy and Configure Azure Virtual Machine Scale Sets

1. Initial Intuition

Imagine you manage a call center. During weekday business hours, you need 50 agents. On weekends, 10 are sufficient. During seasonal promotions, you might need 200. Hiring 200 full-time employees to cover the peak would be absurdly expensive. The real solution is to have a pool of temporary workers you call as demand requires.

In Azure, a Virtual Machine Scale Set (VMSS) is exactly that pool: a group of identical VMs that can grow or shrink automatically based on demand, creating and deleting instances as needed, without manual intervention.

While creating individual VMs and configuring them manually works for small and static environments, VMSS is the solution for workloads that vary over time and need auto scaling, centralized configuration management, and distributed high availability.

2. Context

Why VMSS exists as a separate concept from individual VMs

Managing 50 individual VMs means: 50 manual image updates, 50 reboots to apply patches, 50 Load Balancer rules to configure, and the entire process repeated each time demand changes.

VMSS solves this with a single object that manages all instances as a unit. You define the model (image, size, network configuration) once, and VMSS ensures all instances comply with that model.

What VMSS depends on

Load Balancer or Application Gateway: to distribute traffic across instances
Auto scaling rules: to define when to create or destroy instances
Azure Monitor: as a source of metrics for scaling decisions
Managed Disks: each instance has its own disks
VNet and subnets: where instances are provisioned

3. Concept Construction

3.1 Orchestration Modes

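VMSS has two orchestration modes with different philosophies. Uniform mode manages identical instances created from a single model and supports native rolling upgrades. Flexible mode allows heterogeneous instances (different configurations per instance) and behaves more like a modern Availability Set.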

For AZ-104 and most practical auto scaling scenarios, the focus is on Uniform Orchestration.

3.2 Main components of a Uniform VMSS

Instance Model: defines how each VM will be created. Includes size, OS image, network configuration, extensions, and initialization scripts. All instances are identical to this model.

Capacity: the current number of instances. Can be fixed (manual scaling) or variable (auto scaling).

Upgrade Policy: defines how new model versions are applied to existing instances.

Scaling Rules: metric-based conditions that determine when to create or destroy instances.

3.3 Upgrade Policies

When you update the VMSS model (new OS image, new configuration), the Upgrade Policy defines how this update propagates:

Policy	Behavior	Use case
Automatic	Azure updates instances as soon as the model changes, in no guaranteed order and possibly all at once	Dev/test environments, workloads tolerant of restarts
Rolling	Updates a percentage of instances at a time, with health checks between batches	Production where you can't bring everything down at once
Manual	Instances are not updated; you manually update instance by instance	Maximum control, special scenarios

Rolling Upgrade is the recommended mode for production because it combines automation with safety: each batch must pass its health checks before the next batch starts.
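
The batching idea behind a rolling upgrade can be sketched in a few lines of Python. This is a simplified model, not Azure's implementation: the function names `plan_batches` and `rolling_upgrade` are hypothetical, the batch size mirrors the `maxBatchInstancePercent` setting, and the sketch stops at the first unhealthy batch, whereas the real service applies percentage-based unhealthy thresholds.

```python
def plan_batches(instance_ids, max_batch_percent=20):
    """Split instances into batches of at most max_batch_percent of the set."""
    if not 0 < max_batch_percent <= 100:
        raise ValueError("max_batch_percent must be between 1 and 100")
    # Integer arithmetic; always at least one instance per batch so that
    # small scale sets still make progress.
    batch_size = max(1, len(instance_ids) * max_batch_percent // 100)
    return [instance_ids[i:i + batch_size]
            for i in range(0, len(instance_ids), batch_size)]


def rolling_upgrade(instance_ids, upgrade, is_healthy, max_batch_percent=20):
    """Upgrade batch by batch; stop as soon as a batch contains an unhealthy instance."""
    upgraded = []
    for batch in plan_batches(instance_ids, max_batch_percent):
        for instance_id in batch:
            upgrade(instance_id)
        upgraded.extend(batch)
        if not all(is_healthy(instance_id) for instance_id in batch):
            return {"status": "Faulted", "upgraded": upgraded}
    return {"status": "Completed", "upgraded": upgraded}
```

With 10 instances and a 20% batch size, the plan contains five batches of two instances, so at most two instances are out of service at any moment.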

3.4 Auto Scaling: how it works

Auto scaling in VMSS has two components: scale-out (add instances) and scale-in (remove instances), each with its own independent rules.

A scaling rule defines:

Metric: what to measure (CPU, memory, queue length, etc.)
Threshold: the value that triggers the action
Operator: greater than, less than, equal to
Action: add or remove N instances (or N%)
Cooldown: waiting period after an action before evaluating again
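
The five rule elements above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the Azure autoscale engine: `ScaleRule` and `evaluate` are hypothetical names, metrics are passed in as a plain dictionary, and the first matching rule wins (the real service combines multiple rules with its own logic).

```python
import operator
from dataclasses import dataclass

OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


@dataclass
class ScaleRule:
    metric: str       # what to measure, e.g. "cpu"
    op: str           # ">", "<" or "=="
    threshold: float  # value that triggers the action
    change: int       # +N for scale-out, -N for scale-in
    cooldown_s: int   # seconds to wait after any scale action


def evaluate(rules, metrics, capacity, now_s, last_action_s, min_count, max_count):
    """Evaluate the rules once and return (new_capacity, last_action_s)."""
    for rule in rules:
        # Still inside the cooldown window of the previous action: skip.
        if last_action_s is not None and now_s - last_action_s < rule.cooldown_s:
            continue
        if OPERATORS[rule.op](metrics[rule.metric], rule.threshold):
            # Clamp to the configured minimum and maximum capacity.
            new_capacity = max(min_count, min(max_count, capacity + rule.change))
            if new_capacity != capacity:
                return new_capacity, now_s
    return capacity, last_action_s
```

For example, with a scale-out rule `ScaleRule("cpu", ">", 75, +2, 300)` and a capacity of 3, a CPU reading of 80 moves the capacity to 5, and a second high reading two minutes later is ignored because the cooldown has not yet elapsed.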

3.5 Scale-in Policy

When VMSS needs to remove instances, the Scale-in Policy defines which instances are chosen for removal:

Policy	Behavior
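Default	Balances instances across availability zones (and, best effort, fault domains), then removes the instance with the highest instance ID
OldestVM	Balances across availability zones first (zonal deployments), then removes the oldest instance
NewestVM	Balances across availability zones first (zonal deployments), then removes the newest instance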

OldestVM is useful to force gradual updates: old instances (with outdated models) are removed first when scale-in occurs.
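
The selection logic can be sketched as follows. This is an illustrative model that assumes the behavior described in Azure's scale-in policy documentation: zone balancing is applied first, then Default picks the highest instance ID, OldestVM the earliest creation time, and NewestVM the latest. The `Instance` type and `pick_for_removal` function are hypothetical names, and fault-domain balancing is left out.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: int
    zone: str
    created_at: int  # creation time as a Unix timestamp


def pick_for_removal(instances, policy="Default"):
    """Choose the next instance to remove under the given scale-in policy."""
    if not instances:
        raise ValueError("no instances to remove")
    # Step 1: restrict candidates to the zone(s) holding the most instances,
    # so that removal moves the scale set toward zone balance.
    zone_counts = Counter(instance.zone for instance in instances)
    largest = max(zone_counts.values())
    candidates = [i for i in instances if zone_counts[i.zone] == largest]
    # Step 2: apply the policy-specific tie-breaker.
    if policy == "Default":
        return max(candidates, key=lambda i: i.instance_id)
    if policy == "OldestVM":
        return min(candidates, key=lambda i: i.created_at)
    if policy == "NewestVM":
        return max(candidates, key=lambda i: i.created_at)
    raise ValueError(f"unknown policy: {policy}")
```

Note that under this model OldestVM may skip the globally oldest instance when it sits in a zone that already has fewer instances than the others.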

3.6 Instance Protection

You can mark individual instances for protection against scale-in or against model updates:

Protect from scale-in: instance is never automatically removed
Protect from scale-set actions: instance is excluded from scale-in and also from scale-set-initiated actions such as model updates and reimaging

Useful for instances running long-duration processes that shouldn't be interrupted.

3.7 Overprovisioning

When VMSS creates new instances (scale-out), it can create more than requested to ensure the target number is reached even if some fail during initialization. Azure deletes the extra instances after confirming the target number was reached healthily.

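By default, overprovisioning creates ~20% more. This speeds up scaling and improves the provisioning success rate. Azure does not bill for the extra instances and they do not count toward quota, but they still run their initialization before being deleted within minutes.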
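
As a quick arithmetic illustration, the sketch below computes how many instances are created and later deleted for a given request. The 20% figure is taken from the text above; rounding up to a whole instance is an assumption made for the example, and `overprovision_plan` is a hypothetical helper name.

```python
def overprovision_plan(requested, extra_percent=20):
    """Return how many instances are created and how many are deleted afterwards."""
    # Ceiling division in integer arithmetic (assumed rounding rule).
    extra = (requested * extra_percent + 99) // 100
    return {"requested": requested, "created": requested + extra, "deleted_after": extra}
```

A request for 10 instances therefore starts 12 and deletes 2 once 10 are healthy; a request for 50 starts 60.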

4. Structural View

Complete VMSS architecture

[Figure: complete VMSS architecture]

VMSS instance lifecycle

[Figure: VMSS instance lifecycle]

5. Practical Operation

Health Probes and Application Health Extension

For Rolling Upgrades and intelligent auto scaling to work, VMSS needs to know if each instance is healthy. There are two mechanisms:

Load Balancer Health Probe: the LB periodically checks if the instance is responding on the configured port. Simple but only checks network connectivity.

Application Health Extension: an extension installed on the VM that checks application health via HTTP/HTTPS and reports to VMSS. Checks actual application health, not just connectivity.

# The extension reports: {"ApplicationHealthState": "Healthy"} or {"ApplicationHealthState": "Unhealthy"}
# Configured endpoint: http://localhost:8080/health
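
On the application side, the endpoint that the extension probes can be very small. The following Python sketch uses only the standard library and serves `/health` on port 8080, matching the endpoint shown above. `check_dependencies` is a hypothetical placeholder for real application checks, and returning HTTP 503 when unhealthy is a design choice for this example rather than a requirement stated in the text.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_dependencies():
    """Hypothetical application check; replace with real checks (database, cache, ...)."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":
            self.send_error(404)
            return
        healthy = check_dependencies()
        state = "Healthy" if healthy else "Unhealthy"
        body = json.dumps({"ApplicationHealthState": state}).encode()
        # 200 signals a healthy application; any other status reads as unhealthy.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep frequent probe requests out of the log


def make_server(port=8080):
    """Bind to localhost only, since the extension probes from inside the VM."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)

# To run the endpoint: make_server().serve_forever()
```

Keeping the check cheap matters because the endpoint is polled continuously on every instance.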

Non-obvious behaviors

Uniform VMSS doesn't support changing the OS disk size of an individual instance. You change the model, and instances are recreated with the new size on the next update. It's not possible to change just one instance.

Instances have numeric IDs, not conventional names. Instances in a VMSS named vmss-web will be vmss-web_0, vmss-web_1, vmss-web_2. IDs are not sequential after deletions: if vmss-web_1 is deleted and then a new instance is created, it might receive ID vmss-web_3, not vmss-web_1.

Cooldown prevents oscillation but can delay response to sudden spikes. A 5-minute cooldown means that after a scale-out, VMSS won't scale again for 5 minutes, even if load continues growing. For sudden spikes, configure shorter cooldowns with more aggressive rules.

Scale-in of stateful instances can cause data loss. In stateful workloads (user sessions, ongoing processes), removing an instance in the middle of an operation can lose work in progress. For these workloads, drain connections before removal and enable terminate notifications, which give the instance a configurable window to finish or hand off its work before it is deleted.
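
One way for an instance to learn that it is about to be removed is the terminate notification that VMSS can deliver through the Scheduled Events endpoint of the Azure Instance Metadata Service. The sketch below is a minimal illustration: the function names are hypothetical, the endpoint is reachable only from inside an Azure VM, and the API version shown is an assumption to verify against the current documentation.

```python
import json
import urllib.request

# Scheduled Events endpoint of the Instance Metadata Service (in-VM only).
SCHEDULED_EVENTS_URL = (
    "http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"
)


def fetch_scheduled_events(timeout_s=5):
    """Fetch the current Scheduled Events document from inside the VM."""
    request = urllib.request.Request(SCHEDULED_EVENTS_URL, headers={"Metadata": "true"})
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)


def pending_terminations(document, instance_name):
    """Return the IDs of Terminate events that target this instance."""
    return [
        event["EventId"]
        for event in document.get("Events", [])
        if event.get("EventType") == "Terminate"
        and instance_name in event.get("Resources", [])
    ]
```

An agent on the instance could poll `fetch_scheduled_events()` periodically and, when `pending_terminations` returns a non-empty list, stop accepting new work and flush in-progress state before the instance is deleted.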

Overprovisioning can be problematic with slow initializations. If the application takes 10 minutes to initialize and overprovisioning creates 10 extra instances that will be deleted, this wastes 10 × 10 minutes of initialization. For applications with slow initialization, consider disabling overprovisioning.

6. Implementation Methods

Azure Portal

When to use: initial creation, exploring options, configuring autoscaling via visual interface

Create VMSS via portal:

Portal > Virtual machine scale sets > + Create
Define: subscription, RG, name, region, availability zones
Orchestration mode: Uniform or Flexible
Select image and size
Configure admin credentials
Disks tab: OS disk type
Networking tab: VNet, subnet, LB
Scaling tab: initial capacity, min/max, upgrade policy
Health tab: enable health monitoring
Review + Create

To configure Auto Scaling via portal after creation:

VMSS > Scaling > Custom autoscale > add scale-out and scale-in rules

Azure CLI

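# Create a basic Uniform VMSS (autoscale is configured separately below)
# Recent CLI versions default to Flexible orchestration, so Uniform is requested explicitly.
# The Rolling upgrade policy needs a health probe or the Application Health Extension.
az vmss create \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --orchestration-mode Uniform \
  --image "Ubuntu2204" \
  --vm-sku "Standard_D2s_v5" \
  --instance-count 3 \
  --admin-username "azureadmin" \
  --ssh-key-values ~/.ssh/id_rsa.pub \
  --zones 1 2 3 \
  --upgrade-policy-mode Rolling \
  --load-balancer "lb-web" \
  --backend-pool-name "bepool-web" \
  --location "brazilsouth"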

# Check created instances
az vmss list-instances \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --output table

# Create the autoscale setting (min 2, max 10, default 3 instances); rules are added below
az monitor autoscale create \
  --resource-group "rg-producao" \
  --resource "vmss-web" \
  --resource-type "Microsoft.Compute/virtualMachineScaleSets" \
  --name "autoscale-vmss-web" \
  --min-count 2 \
  --max-count 10 \
  --count 3

# Add scale-out rule
az monitor autoscale rule create \
  --resource-group "rg-producao" \
  --autoscale-name "autoscale-vmss-web" \
  --scale out 2 \
  --condition "Percentage CPU > 75 avg 5m"

# Add scale-in rule
az monitor autoscale rule create \
  --resource-group "rg-producao" \
  --autoscale-name "autoscale-vmss-web" \
  --scale in 1 \
  --condition "Percentage CPU < 25 avg 10m"

# Manually scale to 5 instances
az vmss scale \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --new-capacity 5

# Update the model (ex: new image)
az vmss update \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --set virtualMachineProfile.storageProfile.imageReference.version="latest"

# With Manual upgrade policy: update specific instances
az vmss update-instances \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --instance-ids 1 2

# With Manual upgrade policy: update all instances
az vmss update-instances \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --instance-ids "*"

# Reimage an instance (recreate from scratch from the model)
# Recent CLI versions use --instance-ids; older versions used --instance-id
az vmss reimage \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --instance-ids 3

# Reimage all instances
az vmss reimage \
  --resource-group "rg-producao" \
  --name "vmss-web"

# Configure scale-in policy to remove oldest VMs first
az vmss update \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --scale-in-policy OldestVM

# Apply scale-in protection to a specific instance
az vmss update \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --instance-id 2 \
  --protect-from-scale-in true

# View instance health status
az vmss get-instance-view \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --instance-id 0 \
  --query "vmHealth"

# View autoscale configuration
az monitor autoscale show \
  --resource-group "rg-producao" \
  --name "autoscale-vmss-web" \
  --output json

# Configure scheduled scaling (e.g., 5-10 instances on weekdays from 08:00 to 18:00)
# For a recurring profile, --start and --end are times of day, not dates
az monitor autoscale profile create \
  --autoscale-name "autoscale-vmss-web" \
  --resource-group "rg-producao" \
  --name "horario-comercial" \
  --min-count 5 \
  --max-count 10 \
  --count 5 \
  --start 08:00 \
  --end 18:00 \
  --recurrence week mon tue wed thu fri \
  --timezone "E. South America Standard Time"

Azure PowerShell

# Create VMSS configuration
$vmssConfig = New-AzVmssConfig `
  -Location "brazilsouth" `
  -SkuCapacity 3 `
  -SkuName "Standard_D2s_v5" `
  -UpgradePolicyMode Rolling `
  -Zone @("1", "2", "3")

# Configure image (-ManagedDisk takes the storage account type as a string)
$vmssConfig = Set-AzVmssStorageProfile `
  -VirtualMachineScaleSet $vmssConfig `
  -ImageReferencePublisher "Canonical" `
  -ImageReferenceOffer "0001-com-ubuntu-server-jammy" `
  -ImageReferenceSku "22_04-lts-gen2" `
  -ImageReferenceVersion "latest" `
  -OsDiskCreateOption FromImage `
  -ManagedDisk "Premium_LRS"

# Configure OS
$vmssConfig = Set-AzVmssOsProfile `
  -VirtualMachineScaleSet $vmssConfig `
  -ComputerNamePrefix "web" `
  -AdminUsername "azureadmin" `
  -AdminPassword "<secure-password>"

# Configure network
$ipConfig = New-AzVmssIpConfig `
  -Name "ipconfig" `
  -SubnetId $subnetId `
  -LoadBalancerBackendAddressPoolsId $lbBackendPoolId

$vmssConfig = Add-AzVmssNetworkInterfaceConfiguration `
  -VirtualMachineScaleSet $vmssConfig `
  -Name "nicconfig" `
  -Primary $true `
  -IpConfiguration $ipConfig

# Create VMSS
$vmss = New-AzVmss `
  -ResourceGroupName "rg-producao" `
  -VMScaleSetName "vmss-web" `
  -VirtualMachineScaleSet $vmssConfig

# Configure Autoscale
# Note: these cmdlets belong to older Az.Monitor releases; newer releases replace them
# with differently named cmdlets, so check the installed module version first.
$scaleOutRule = New-AzAutoscaleRule `
  -MetricName "Percentage CPU" `
  -MetricResourceId $vmss.Id `
  -TimeGrain ([TimeSpan]::FromMinutes(1)) `
  -MetricStatistic Average `
  -TimeWindow ([TimeSpan]::FromMinutes(5)) `
  -TimeAggregationOperator Average `
  -Operator GreaterThan `
  -Threshold 75 `
  -ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
  -ScaleActionDirection Increase `
  -ScaleActionValue 2 `
  -ScaleActionScaleType ChangeCount

$scaleInRule = New-AzAutoscaleRule `
  -MetricName "Percentage CPU" `
  -MetricResourceId $vmss.Id `
  -TimeGrain ([TimeSpan]::FromMinutes(1)) `
  -MetricStatistic Average `
  -TimeWindow ([TimeSpan]::FromMinutes(10)) `
  -TimeAggregationOperator Average `
  -Operator LessThan `
  -Threshold 25 `
  -ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
  -ScaleActionDirection Decrease `
  -ScaleActionValue 1 `
  -ScaleActionScaleType ChangeCount

# $profile is a built-in PowerShell variable, so a different name is used here
$autoscaleProfile = New-AzAutoscaleProfile `
  -DefaultCapacity 3 `
  -MaximumCapacity 10 `
  -MinimumCapacity 2 `
  -Rule @($scaleOutRule, $scaleInRule) `
  -Name "default-profile"

Add-AzAutoscaleSetting `
  -ResourceGroupName "rg-producao" `
  -Location "brazilsouth" `
  -Name "autoscale-vmss-web" `
  -TargetResourceId $vmss.Id `
  -AutoscaleProfile $autoscaleProfile

Bicep

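// VMSS with autoscale
// Parameters referenced by the resources below; values are supplied at deployment time
param adminUsername string
param sshPublicKey string
param subnetId string
param lbBackendPoolId string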
resource vmss 'Microsoft.Compute/virtualMachineScaleSets@2023-03-01' = {
  name: 'vmss-web'
  location: 'brazilsouth'
  zones: ['1', '2', '3']
  sku: {
    name: 'Standard_D2s_v5'
    capacity: 3
  }
  properties: {
    upgradePolicy: {
      mode: 'Rolling'
      rollingUpgradePolicy: {
        maxBatchInstancePercent: 20
        maxUnhealthyInstancePercent: 20
        maxUnhealthyUpgradedInstancePercent: 20
        pauseTimeBetweenBatches: 'PT30S'
      }
    }
    automaticRepairsPolicy: {
      enabled: true
      gracePeriod: 'PT10M'
    }
    virtualMachineProfile: {
      osProfile: {
        computerNamePrefix: 'web'
        adminUsername: adminUsername
        linuxConfiguration: {
          // SSH keys only; no admin password is set
          disablePasswordAuthentication: true
          ssh: {
            publicKeys: [
              {
                path: '/home/${adminUsername}/.ssh/authorized_keys'
                keyData: sshPublicKey
              }
            ]
          }
        }
      }
      storageProfile: {
        imageReference: {
          publisher: 'Canonical'
          offer: '0001-com-ubuntu-server-jammy'
          sku: '22_04-lts-gen2'
          version: 'latest'
        }
        osDisk: {
          createOption: 'FromImage'
          managedDisk: {
            storageAccountType: 'Premium_LRS'
          }
        }
      }
      networkProfile: {
        networkInterfaceConfigurations: [
          {
            name: 'nicconfig'
            properties: {
              primary: true
              ipConfigurations: [
                {
                  name: 'ipconfig'
                  properties: {
                    subnet: {
                      id: subnetId
                    }
                    loadBalancerBackendAddressPools: [
                      {
                        id: lbBackendPoolId
                      }
                    ]
                  }
                }
              ]
            }
          }
        ]
      }
      extensionProfile: {
        extensions: [
          {
            name: 'ApplicationHealthExtension'
            properties: {
              publisher: 'Microsoft.ManagedServices'
              type: 'ApplicationHealthLinux'
              typeHandlerVersion: '1.0'
              settings: {
                protocol: 'http'
                port: 8080
                requestPath: '/health'
              }
            }
          }
        ]
      }
    }
    scaleInPolicy: {
      rules: ['OldestVM']
    }
    overprovision: true
  }
}

// Auto Scale settings
resource autoscale 'Microsoft.Insights/autoscaleSettings@2022-10-01' = {
  name: 'autoscale-vmss-web'
  location: 'brazilsouth'
  properties: {
    enabled: true
    targetResourceUri: vmss.id
    profiles: [
      {
        name: 'default'
        capacity: {
          default: '3'
          minimum: '2'
          maximum: '10'
        }
        rules: [
          {
            metricTrigger: {
              metricName: 'Percentage CPU'
              metricResourceUri: vmss.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 75
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '2'
              cooldown: 'PT5M'
            }
          }
          {
            metricTrigger: {
              metricName: 'Percentage CPU'
              metricResourceUri: vmss.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT10M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 25
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT5M'
            }
          }
        ]
      }
    ]
  }
}

7. Control and Security

Managed Identity for VMSS

# Assign System-Assigned Managed Identity to VMSS
# The value is quoted so the shell does not treat the brackets as a glob pattern
az vmss identity assign \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --identities '[system]'

# Grant permission to VMSS to access Key Vault
VMSS_IDENTITY=$(az vmss show \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --query "identity.principalId" --output tsv)

az keyvault set-policy \
  --name "kv-producao" \
  --object-id "$VMSS_IDENTITY" \
  --secret-permissions get list

VMSS with Azure Policy

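# List VMSS without an explicit scale-in policy (requires the resource-graph CLI extension)
# Autoscale settings are separate resources, so this query does not detect missing autoscale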
az graph query -q "
Resources
| where type == 'microsoft.compute/virtualmachinescalesets'
| where isnull(properties.scaleInPolicy)
| project name, resourceGroup, location"

# Check instances with outdated model (need update)
# Single quotes keep the shell from interpreting the JMESPath `false` literal
az vmss list-instances \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --query '[?latestModelApplied==`false`].{ID: instanceId, Name: osProfile.computerName}' \
  --output table

Automatic Repairs

The VMSS can be configured to automatically repair unhealthy instances: if an instance reports unhealthy status for a configured period (grace period), the VMSS deletes and recreates that instance automatically.

# Enable automatic repairs (the CLI expects the grace period in minutes)
az vmss update \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --enable-automatic-repairs true \
  --automatic-repairs-grace-period 10

8. Decision Making

VMSS vs. individual VMs

Situation	Choice	Reason
Web app with variable traffic	VMSS	Automatic scaling adjusts capacity
Primary stateful database	Individual VMs	VMSS is not ideal for state; use an Availability Set or Availability Zones
Batch processing	VMSS with queue-based scaling	Scales out to process the queue, reduces to zero afterwards
Dev environments with 2-3 VMs	Individual VMs	VMSS overhead unnecessary for small volumes
High availability API	VMSS with zones	Automatic distribution + scaling
Application needing heterogeneous VMs	VMSS Flexible	Allows different configurations per instance

Upgrade Policy by scenario

Scenario	Policy	Reason
Production API that can't have total downtime	Rolling	Updates in batches with health checks
Dev/test environment	Automatic	Convenience, tolerant of interruptions
Critical application with controlled deploy	Manual	Maximum control over when and which instances
Black Friday (freeze changes during peak)	Manual	No automatic updates during peak

Auto Scale configuration

Workload	Scale-out threshold	Scale-in threshold	Cooldown
REST API (CPU bound)	CPU > 75% for 5min	CPU < 25% for 10min	5min
Queue worker (queue depth)	Queue > 100 messages	Queue < 10 messages	3min
App with predictable spikes	Schedule scaling for specific times	Schedule	N/A
ML inference (GPU)	GPU util > 80%	GPU util < 20%	10min

9. Best Practices

Always configure min and max capacity deliberately. Limits that are too loose let the VMSS grow further than the budget allows, generating unexpected costs. Define minCount from the minimum number of instances needed for HA and maxCount from the largest capacity whose cost you can accept.

Use at least 2 rules: one for scale-out and one for scale-in. A VMSS with only a scale-out rule will grow to maxCount and never shrink. One with only a scale-in rule will shrink to minCount and never grow back when load returns. Always configure both.

Configure scale-in conservatively (lower threshold, longer window). Aggressive scale-in can cause flapping: the system scales out, then in, then out again in quick cycles. Use a low threshold (CPU < 25%) and a longer time window (10+ minutes) for scale-in.
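
A simple way to reason about flapping is to project the load after a scale-in and compare it with the scale-out threshold. The sketch below assumes that total load stays constant and spreads evenly over the remaining instances; `scale_in_would_flap` is a hypothetical helper, not an Azure API.

```python
def scale_in_would_flap(avg_cpu, instances, remove, scale_out_threshold):
    """Return True if removing instances would push CPU back over the scale-out threshold."""
    remaining = instances - remove
    if remaining <= 0:
        return True  # nothing left to carry the load
    # Same total work divided among fewer instances.
    projected_cpu = avg_cpu * instances / remaining
    return projected_cpu >= scale_out_threshold
```

With 3 instances at 24% average CPU, removing one projects 36%, well below a 75% scale-out threshold. With 4 instances at 60%, removing one projects 80%, which would immediately trigger scale-out again.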

Use Application Health Extension instead of just LB Health Probe. The extension checks if the application is actually working, not just if the VM is responding on the port. Rolling upgrades and automatic repairs are much more reliable with application health.

For workloads with session state, use sticky sessions or an external session store. If users have active sessions and the VMSS scales in, a session may live on an instance that is about to be removed. Use Azure Cache for Redis as an external session store or configure sticky sessions on the LB.

Test scaling behavior before production. Run load tests that trigger controlled scale-out and scale-in. Verify that health checks work, that the LB distributes traffic correctly, and that there are no errors during transition.

Use Custom Script Extension or cloud-init model for initial configuration. Don't include sensitive configurations in the VMSS model. Use cloud-init (Linux) or Custom Script Extension to configure applications on first boot, pulling configurations from Key Vault.

10. Common Errors

Error	Why it happens	How to avoid
VMSS oscillating (flapping) between scale-out and scale-in	Cooldown too short or thresholds too close	Increase cooldown and separate thresholds
Instances with outdated model in production	Manual upgrade policy with forgotten application	Use Rolling for production with health checks
Scale-out doesn't happen even with high CPU	Autoscale created but not linked to VMSS correctly	Check targetResourceUri in autoscale setting
Instances removed with active sessions	Scale-in without connection draining	Use external session store or sticky sessions
Overprovisioning with slow initialization wasting startup work	App with 20+ minute initialization with active overprovision	Disable overprovision for apps with slow init
maxCount too low preventing spike response	maxCount defined arbitrarily	Calculate maxCount based on capacity tests
VMSS without health extension with rolling upgrade	Rolling upgrades without verified health can bring down all instances	Always enable Application Health Extension with rolling upgrades
Scaling by CPU without considering other bottlenecks	Application has database bottleneck, not CPU	Identify the real bottleneck; consider custom metrics

The most critical error

Configuring a scale-out rule without the corresponding scale-in, or configuring scale-in with too high threshold (e.g., CPU < 80% reduces instances). In the first case, the VMSS will grow to maxCount and stay there forever. In the second, it will reduce instances before the load actually decreases, causing immediate performance degradation.

11. Operation and Maintenance

Monitor VMSS state

# Status of all instances
az vmss list-instances \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --query "[].{
    ID: instanceId,
    Name: osProfile.computerName,
    ModelUpdated: latestModelApplied,
    State: provisioningState
  }" \
  --output table

# View scaling history
az monitor activity-log list \
  --resource-group "rg-producao" \
  --query "[?operationName.value=='Microsoft.Compute/virtualMachineScaleSets/write'].{
    Time: eventTimestamp,
    Operation: operationName.value,
    Status: status.value
  }" \
  --output table

# View current autoscale configuration
az monitor autoscale show \
  --resource-group "rg-producao" \
  --name "autoscale-vmss-web" \
  --query "{Min: profiles[0].capacity.minimum, Max: profiles[0].capacity.maximum, Default: profiles[0].capacity.default}" \
  --output json

# Check for instances whose provisioning has not succeeded
az vmss list-instances \
  --resource-group "rg-producao" \
  --name "vmss-web" \
  --query "[?provisioningState != 'Succeeded'].{ID: instanceId, State: provisioningState}" \
  --output table

Important limits

Limit	Value
Instances per VMSS (Uniform)	1,000 with platform images; 600 with custom images
Instances per VMSS (Flexible)	1,000
Autoscale rules per profile	10
Autoscale profiles per VMSS	20
Fault Domains (with zones)	1-5 (configurable)
Instances that can be updated simultaneously (Rolling)	Configurable, default 20%

12. Integration and Automation

VMSS with Azure Service Bus for event-driven scaling

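# Configure autoscale from a metric of another resource (Service Bus queue)
# SERVICEBUS_NAMESPACE_ID is a placeholder for the full resource ID of the namespace
az monitor autoscale rule create \
  --resource-group "rg-producao" \
  --autoscale-name "autoscale-vmss-workers" \
  --resource "$SERVICEBUS_NAMESPACE_ID" \
  --condition "ActiveMessages > 100 avg 5m where EntityName == fila-trabalho" \
  --scale out 3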

Deploy pipeline with zero-downtime via Rolling Upgrade

# Azure DevOps Pipeline: update VMSS image with zero-downtime
steps:
  - task: AzureCLI@2
    displayName: 'Update VMSS Image'
    inputs:
      azureSubscription: 'prod-subscription'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Update image reference in model
        az vmss update \
          --resource-group rg-producao \
          --name vmss-web \
          --set virtualMachineProfile.storageProfile.imageReference.version=$(IMAGE_VERSION)
        
        # With the Rolling upgrade policy, the model change above starts the
        # rolling upgrade automatically; no explicit start command is needed.

        # Monitor progress until the upgrade completes or fails
        while true; do
          STATUS=$(az vmss rolling-upgrade get-latest \
            --resource-group rg-producao \
            --name vmss-web \
            --query "runningStatus.code" -o tsv)
          echo "Status: $STATUS"
          if [ "$STATUS" == "Completed" ]; then
            echo "Rolling upgrade completed successfully"
            break
          elif [ "$STATUS" == "Cancelled" ] || [ "$STATUS" == "Faulted" ]; then
            echo "Rolling upgrade failed: $STATUS"
            exit 1
          fi
          sleep 30
        done

13. Final Summary

Essential points:

VMSS is a group of identical VMs managed as a unit, with automatic scaling based on metrics
Uniform mode is for identical instances with automatic scaling; Flexible mode allows heterogeneous instances
Upgrade Policy defines how model changes propagate: Automatic (immediate), Rolling (in batches with health checks), Manual (you control)
Auto Scaling requires at least one scale-out AND one scale-in rule; cooldown prevents flapping
Scale-in Policy defines which instance is removed: Default (zone balance, then highest instance ID), OldestVM, NewestVM
Application Health Extension is necessary for Rolling Upgrades and Automatic Repairs to work correctly
Overprovisioning creates extra instances temporarily to ensure the target number is reached

Critical differences:

VMSS vs. individual VMs: VMSS is for elastic workloads that vary over time; individual VMs are for static or stateful workloads
Uniform vs. Flexible: Uniform = identical instances with native rolling upgrade; Flexible = heterogeneous instances, closer to a modern Availability Set
Scale-out cooldown vs. Scale-in cooldown: are configured separately; scale-in generally needs a longer cooldown to avoid flapping
Update vs. Reimage: Update applies model changes to the existing instance; Reimage recreates the instance from scratch based on the model

What needs to be remembered for AZ-104:

Command to scale manually: az vmss scale --new-capacity <N>
Command to view instances: az vmss list-instances
Command to update instances (Manual policy): az vmss update-instances --instance-ids "*"
Autoscale is a separate resource from VMSS: Microsoft.Insights/autoscaleSettings
VMSS with zones requires that managed disks are also zone-aware
Limit of 1,000 instances per VMSS with Managed Disks
Rolling Upgrade without Application Health Extension can result in updating unhealthy instances, potentially bringing down the entire tier
VMSS instances have names in the format {vmssname}_{id} where IDs are not necessarily sequential
