Theoretical Foundation: Configure Scaling for an App Service Plan

1. Initial Intuition

Imagine you have a restaurant. The restaurant has a kitchen (the App Service Plan) and several dishes on the menu (the Web Apps, APIs, Function Apps hosted on it). The kitchen defines the maximum production capacity: how many chefs are available, what equipment is available. If demand increases, you can:

Scale vertically (scale up): replace your small kitchen with a larger one, with better equipment and more chefs
Scale horizontally (scale out): open identical kitchens in parallel, each with the same equipment

In Azure App Service, the App Service Plan is exactly that kitchen: it defines the infrastructure resources (CPU, memory, number of instances) that all apps hosted on it share. Configuring scaling for an App Service Plan means defining the rules that determine when and how this kitchen grows or shrinks.

2. Context

The relationship between App Service Plan and Apps

Critical point: all apps within an App Service Plan share the same resources. Every app runs on every instance of the Plan, so if one app saturates the CPU of an instance, the other apps running on that instance slow down with it. This is why apps with different criticality levels should be in separate Plans.

Why configuring scaling is important

The App Service Plan has a fixed cost based on tier and number of instances, regardless of how many requests the apps receive. Without automatic scaling:

You provision for peak: pay the maximum cost all month long
You provision for average: during peaks, apps become slow or unavailable

With well-configured automatic scaling, the Plan grows during peaks and shrinks during low-demand hours, optimizing cost and performance simultaneously.
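
To make the trade-off concrete, the sketch below compares billed instance-hours for a Plan fixed at peak capacity against one whose instance count follows a daily curve. The demand curve, the instance counts, and the price per instance-hour are illustrative assumptions, not Azure list prices.

```python
# Illustrative only: the daily curve and the price are assumptions.
HOURS_PER_DAY = 24


def instance_hours_fixed(instances: int, days: int) -> int:
    """Instance-hours billed when the Plan runs a fixed instance count."""
    return instances * HOURS_PER_DAY * days


def instance_hours_scaled(hourly_instances: list[int], days: int) -> int:
    """Instance-hours billed when the instance count follows a daily curve."""
    if len(hourly_instances) != HOURS_PER_DAY:
        raise ValueError("expected one instance count per hour of the day")
    return sum(hourly_instances) * days


# Hypothetical day: 6 instances from 09:00 to 17:59, 2 instances otherwise.
daily_curve = [6 if 9 <= hour < 18 else 2 for hour in range(HOURS_PER_DAY)]

peak_hours = instance_hours_fixed(instances=6, days=30)
scaled_hours = instance_hours_scaled(daily_curve, days=30)

PRICE_PER_INSTANCE_HOUR = 0.50  # assumed placeholder, not a real Azure price
saved = (peak_hours - scaled_hours) * PRICE_PER_INSTANCE_HOUR

print(f"fixed at peak: {peak_hours} instance-hours")
print(f"autoscaled:    {scaled_hours} instance-hours")
print(f"saved:         {saved:.2f} (assumed currency units)")
```

Because the Plan bills per instance regardless of load, the saving comes entirely from the hours spent at the lower instance count.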

3. Building the Concepts

3.1 App Service Plan Tiers and scaling capability

The App Service Plan tier determines what is possible in terms of scaling:

Tier	Scale Up (SKUs)	Scale Out (instances)	Auto Scale	Deploy Slots
Free (F1)	No	1 (fixed)	No	0
Shared (D1)	No	1 (fixed)	No	0
Basic (B1/B2/B3)	Yes	Up to 3	Manual only	0
Standard (S1/S2/S3)	Yes	Up to 10	Yes	5
Premium v3 (P0v3 to P3v3)	Yes	Up to 30	Yes	20
Isolated v2 (I1v2 to I3v2)	Yes	Up to 100	Yes	20

Auto Scaling (automatic scaling) is only available from the Standard tier onwards. Free and Shared run on a single fixed instance and cannot scale out at all; Basic supports manual scaling only, up to 3 instances.
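
The tier table can be encoded as a small lookup that rejects scaling requests a tier cannot satisfy. This is a minimal sketch: the values are copied from the table above, while `TierCapability`, `TIERS`, and `validate_scale_request` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierCapability:
    max_instances: int
    autoscale: bool
    deployment_slots: int


# Values copied from the tier table in this section.
TIERS = {
    "Free": TierCapability(1, False, 0),
    "Shared": TierCapability(1, False, 0),
    "Basic": TierCapability(3, False, 0),
    "Standard": TierCapability(10, True, 5),
    "PremiumV3": TierCapability(30, True, 20),
    "IsolatedV2": TierCapability(100, True, 20),
}


def validate_scale_request(tier: str, instances: int, use_autoscale: bool) -> list[str]:
    """Return a list of problems; an empty list means the request fits the tier."""
    cap = TIERS[tier]
    problems = []
    if instances < 1:
        problems.append("instance count must be at least 1")
    if instances > cap.max_instances:
        problems.append(f"{tier} allows at most {cap.max_instances} instance(s)")
    if use_autoscale and not cap.autoscale:
        problems.append(f"{tier} does not support autoscale")
    return problems


print(validate_scale_request("Basic", 4, use_autoscale=True))
```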

3.2 Scale Up vs. Scale Out

Scale Up (Vertical Scaling): changes the App Service Plan SKU to one with more CPU and memory. For example, from B1 (1 vCPU, 1.75 GB) to P2v3 (4 vCPU, 16 GB). All apps in the Plan then run on the larger instances.

Scale Out (Horizontal Scaling): increases the number of App Service Plan instances. With 3 instances, apps have 3 workers processing requests in parallel. Azure automatically distributes requests across instances.

3.3 Manual Scaling vs. Auto Scaling

Manual Scaling: you define a fixed number of instances. The Plan maintains exactly that number indefinitely, regardless of load.

Auto Scaling: Azure automatically adjusts the number of instances based on rules or metrics. It consists of:

Scale-out rule (when to add instances): condition that triggers instance increase
Scale-in rule (when to remove instances): condition that triggers instance reduction
Min/Max instances: limits that auto scaling respects
Default capacity: number of instances used when metrics are unavailable and no rule can be evaluated
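
The min/max limits act as a clamp on every scale action. A minimal sketch of that behavior follows; the function name `next_instance_count` is hypothetical.

```python
def next_instance_count(current: int, change: int, minimum: int, maximum: int) -> int:
    """Apply a scale action (positive = out, negative = in) within min/max."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(maximum, current + change))


# A scale-out by 2 from 9 instances stops at the maximum of 10.
print(next_instance_count(current=9, change=2, minimum=2, maximum=10))
```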

3.4 Types of Auto Scaling in App Service

App Service has two auto scaling engines that need to be distinguished:

Classic Autoscale (Azure Monitor Autoscale):

Based on Azure Monitor metrics
Customizable rules for CPU, memory, custom metrics
Configurable cooldown
Scheduling (scale at specific times)
Available in Standard and above

Automatic Scaling (preview/GA in 2024):

New native App Service mechanism
Based on concurrent HTTP requests
Simpler, no need to configure explicit rules
Azure automatically manages scaling based on traffic
Available only in Premium v2 and Premium v3

For AZ-104, the focus is on Classic Autoscale which is more widely available and more frequently tested.

3.5 Structure of an Auto Scale rule (Classic)
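
A Classic autoscale rule pairs a metric trigger with a scale action. The sketch below models that pair in Python, loosely mirroring the `metricTrigger` and `scaleAction` properties used in the Bicep example later in this document. The class and field names are hypothetical, and the evaluation is simplified: it assumes one sample per minute and average aggregation.

```python
import operator
from dataclasses import dataclass
from statistics import mean

OPERATORS = {
    "GreaterThan": operator.gt,
    "GreaterThanOrEqual": operator.ge,
    "LessThan": operator.lt,
    "LessThanOrEqual": operator.le,
}


@dataclass(frozen=True)
class MetricTrigger:
    metric_name: str          # e.g. "CpuPercentage"
    time_window_minutes: int  # how far back samples are considered (>= 1)
    op: str                   # a key of OPERATORS
    threshold: float


@dataclass(frozen=True)
class ScaleAction:
    direction: str            # "Increase" or "Decrease"
    value: int                # number of instances to add or remove
    cooldown_minutes: int


@dataclass(frozen=True)
class ScaleRule:
    trigger: MetricTrigger
    action: ScaleAction


def is_triggered(rule: ScaleRule, samples_per_minute: list[float]) -> bool:
    """Average the most recent samples in the window and compare to the threshold."""
    window = samples_per_minute[-rule.trigger.time_window_minutes:]
    if not window:
        return False
    return OPERATORS[rule.trigger.op](mean(window), rule.trigger.threshold)


# Scale out by 2 when average CPU over the last 5 minutes exceeds 75%.
scale_out = ScaleRule(
    MetricTrigger("CpuPercentage", 5, "GreaterThan", 75),
    ScaleAction("Increase", 2, 5),
)
print(is_triggered(scale_out, [60, 70, 80, 85, 90, 95]))
```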

3.6 Auto Scale Profiles

A Profile groups rules and capacities for a specific context. There are three types:

Default Profile: always active when no other profile applies. Every autoscale configuration must have a default profile.

Fixed Date Profile: active on a specific date/period (e.g., Black Friday 11/25). Replaces the default during that period.

Recurrence Profile: active at recurring times (e.g., Monday to Friday 8am to 6pm). Replaces the default during these periods.
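
Profile selection can be sketched as a lookup in priority order. The code assumes that a fixed date profile takes precedence over a recurrence profile, which in turn takes precedence over the default; the `Profile` class and the sample profiles are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass(frozen=True)
class Profile:
    name: str
    # Fixed date profiles set fixed_start/fixed_end.
    fixed_start: Optional[datetime] = None
    fixed_end: Optional[datetime] = None
    # Recurrence profiles set weekdays (0 = Monday ... 6 = Sunday) and times.
    weekdays: tuple[int, ...] = ()
    daily_start: Optional[time] = None
    daily_end: Optional[time] = None


def _fixed_applies(p: Profile, now: datetime) -> bool:
    return (p.fixed_start is not None and p.fixed_end is not None
            and p.fixed_start <= now <= p.fixed_end)


def _recurrence_applies(p: Profile, now: datetime) -> bool:
    return (p.daily_start is not None and p.daily_end is not None
            and now.weekday() in p.weekdays
            and p.daily_start <= now.time() < p.daily_end)


def active_profile(profiles: list[Profile], default: Profile, now: datetime) -> Profile:
    """Pick a fixed date profile first, then a recurrence profile, else the default."""
    for p in profiles:
        if _fixed_applies(p, now):
            return p
    for p in profiles:
        if _recurrence_applies(p, now):
            return p
    return default


default = Profile("default")
business_hours = Profile("business-hours", weekdays=(0, 1, 2, 3, 4),
                         daily_start=time(8, 0), daily_end=time(18, 0))
black_friday = Profile("black-friday-2026",
                       fixed_start=datetime(2026, 11, 27, 0, 0),
                       fixed_end=datetime(2026, 11, 28, 23, 59))
profiles = [business_hours, black_friday]

print(active_profile(profiles, default, datetime(2026, 11, 27, 10, 0)).name)
```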

4. Structural View

Complete scaling flow in App Service

[Diagram: complete scaling flow in App Service]

Scaling hierarchy in App Service

[Diagram: scaling hierarchy in App Service]

5. How it Works in Practice

When rules are evaluated

The Auto Scale Engine evaluates rules every 30-60 seconds. For each evaluation, it checks if any scale-out or scale-in condition is true and acts according to the rules.

Behavior with multiple scale-out rules: If ANY scale-out rule is true, scaling occurs. Scale-out rules are combined with OR logic: just one needs to be true.

Behavior with multiple scale-in rules: ALL scale-in rules must be true simultaneously for scale-in to occur. Scale-in rules are combined with AND logic: all must be true.

This asymmetric behavior is intentional: Azure is conservative in scale-in to avoid removing instances precipitously.
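
The OR/AND asymmetry can be expressed in a few lines. This is a minimal sketch of the combination logic described above, not the engine's implementation; `decide` is a hypothetical name and each list holds the per-rule evaluation results.

```python
def decide(scale_out_results: list[bool], scale_in_results: list[bool]) -> str:
    """Combine per-rule results: any scale-out rule wins; scale-in needs all rules."""
    if any(scale_out_results):
        return "scale-out"
    # all([]) is True, so require at least one scale-in rule to exist.
    if scale_in_results and all(scale_in_results):
        return "scale-in"
    return "none"


# One scale-out rule fires (e.g. HttpQueueLength) while CPU is low: scale out.
print(decide([False, True], [True, True]))
```

With a CPU rule and a queue-length rule for scale-in, a drop in CPU alone is not enough: the queue must also be short before an instance is removed.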

Cooldown and Flap Avoidance

The cooldown is the period after a scaling action during which no new scaling action is taken. This prevents rapid oscillations (scale-out, scale-in, scale-out in rapid sequence).

By default:

Scale-out cooldown: 5 minutes
Scale-in cooldown: 5 minutes

For workloads with very rapid peaks, it may be necessary to reduce the scale-out cooldown.
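
The cooldown works as a gate in front of each scaling action. The sketch below assumes the 5-minute default mentioned above; `in_cooldown` is a hypothetical helper, not an Azure API.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_COOLDOWN = timedelta(minutes=5)


def in_cooldown(last_action_at: Optional[datetime], now: datetime,
                cooldown: timedelta = DEFAULT_COOLDOWN) -> bool:
    """Return True while a new scaling action should still be suppressed."""
    if last_action_at is None:
        return False  # nothing has scaled yet, so nothing to wait for
    return now - last_action_at < cooldown


last = datetime(2026, 1, 5, 9, 0)
print(in_cooldown(last, datetime(2026, 1, 5, 9, 3)))  # 3 minutes after the action
print(in_cooldown(last, datetime(2026, 1, 5, 9, 5)))  # exactly 5 minutes after
```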

Non-obvious behaviors

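Scale Up requires a brief app restart. When changing the App Service Plan SKU, workers are reallocated to different hardware. Apps experience a brief interruption during this transition. Deployment slots run on the same Plan as the production app, so they move with it and do not shield the app from this restart; schedule scale-up for a low-traffic window instead.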

Scale out does not require restart. Adding instances is transparent to apps. New instances are created in parallel and start receiving traffic from the load balancer without interrupting existing ones.

All apps in the Plan scale together during scale out. If the Plan has 3 apps and scales from 2 to 3 instances, all 3 apps get 3 instances. By default, it's not possible to have one app with 3 instances and another with 2 in the same Plan; that requires enabling per-app scaling on the Plan.

The App Service Plan charges by number of instances, not by usage. A P2v3 Plan with 3 active instances charges the same amount whether the instances are at 100% CPU or 0%. Auto scaling doesn't reduce cost during peaks, but reduces cost during valleys by scaling in to the minimum.

The default profile is the fallback, not the primary. A common confusion: the "default" profile is activated when no other profile (time-specific or date-specific) applies. It's the base state, not a special configuration.

Metrics available for scaling

Metric	Azure Name	Typical Use
CPU Percentage	`CpuPercentage`	Most common; scales on processing load
Memory Percentage	`MemoryPercentage`	Scales when apps consume too much memory
Disk Queue Length	`DiskQueueLength`	I/O intensive
Http Queue Length	`HttpQueueLength`	Requests waiting for available worker
Bytes Received	`BytesReceived`	Scales by incoming traffic volume
Bytes Sent	`BytesSent`	Scales by response volume

HttpQueueLength is often the most useful metric for web apps: it indicates there are requests waiting in the IIS/Kestrel queue because there are no available workers. When this metric grows, it's a clear sign that more instances are needed.

6. Implementation Methods

Azure Portal

When to use: initial configuration, specific adjustments, viewing scaling history

For Scale Up (change SKU):

Portal > App Service Plan > Scale up (App Service plan)
Select the new tier/SKU
Apply

For manual Scale Out:

Portal > App Service Plan > Scale out (App Service plan)
Select Manual scale
Define number of instances
Save

For Auto Scale:

Portal > App Service Plan > Scale out (App Service plan)
Select Custom autoscale
Define Min/Max/Default instances
Add scale-out rule:
- Metric source: App Service Plan
- Metric: CpuPercentage
- Operator: Greater than
- Threshold: 75
- Duration: 5 minutes
- Action: Increase count by 2
- Cool down: 5 minutes
Add scale-in rule (symmetric)
Save

Azure CLI

# Scale Up: change the App Service Plan SKU
az appservice plan update \
  --resource-group "rg-webapp" \
  --name "asp-producao" \
  --sku P2V3

# Manual Scale Out: set fixed number of instances
az appservice plan update \
  --resource-group "rg-webapp" \
  --name "asp-producao" \
  --number-of-workers 3

# View current App Service Plan configuration
az appservice plan show \
  --resource-group "rg-webapp" \
  --name "asp-producao" \
  --query "{SKU: sku.name, Workers: sku.capacity, Tier: sku.tier}" \
  --output json

# Configure Autoscale via Azure Monitor
ASP_ID=$(az appservice plan show \
  --resource-group "rg-webapp" \
  --name "asp-producao" \
  --query "id" --output tsv)

# Create autoscale configuration with default profile
# (--resource takes the full resource ID, so --resource-type is not passed)
az monitor autoscale create \
  --resource-group "rg-webapp" \
  --resource "$ASP_ID" \
  --name "autoscale-asp-producao" \
  --min-count 2 \
  --max-count 10 \
  --count 3

# Add scale-out rule (CPU > 75%)
az monitor autoscale rule create \
  --resource-group "rg-webapp" \
  --autoscale-name "autoscale-asp-producao" \
  --scale out 2 \
  --condition "CpuPercentage > 75 avg 5m"

# Add scale-in rule (CPU < 25%)
az monitor autoscale rule create \
  --resource-group "rg-webapp" \
  --autoscale-name "autoscale-asp-producao" \
  --scale in 1 \
  --condition "CpuPercentage < 25 avg 10m"

# Add rule based on HttpQueueLength
az monitor autoscale rule create \
  --resource-group "rg-webapp" \
  --autoscale-name "autoscale-asp-producao" \
  --scale out 3 \
  --condition "HttpQueueLength > 100 avg 3m"

# Create business hours profile (more instances Mon to Fri)
# For a recurring profile, --start and --end are times of day, not dates
az monitor autoscale profile create \
  --autoscale-name "autoscale-asp-producao" \
  --resource-group "rg-webapp" \
  --name "horario-comercial" \
  --min-count 4 \
  --max-count 10 \
  --count 4 \
  --start 08:00 \
  --end 18:00 \
  --recurrence week mo tu we th fr \
  --timezone "E. South America Standard Time"

# Create Black Friday profile (specific date, more capacity)
az monitor autoscale profile create \
  --autoscale-name "autoscale-asp-producao" \
  --resource-group "rg-webapp" \
  --name "black-friday-2026" \
  --min-count 8 \
  --max-count 20 \
  --count 10 \
  --start "2026-11-27 00:00" \
  --end "2026-11-28 23:59" \
  --timezone "E. South America Standard Time"

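# View scaling history (autoscale actions are recorded in the Activity Log
# under the Autoscale category)
az monitor activity-log list \
  --resource-group "rg-webapp" \
  --offset 7d \
  --query "[?category.value=='Autoscale'].{Time: eventTimestamp, Operation: operationName.localizedValue, Status: status.value}" \
  --output table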

# Check current autoscale configuration
az monitor autoscale show \
  --resource-group "rg-webapp" \
  --name "autoscale-asp-producao" \
  --output json

# Temporarily disable autoscale (keeps rules)
az monitor autoscale update \
  --resource-group "rg-webapp" \
  --name "autoscale-asp-producao" \
  --enabled false

# Re-enable autoscale
az monitor autoscale update \
  --resource-group "rg-webapp" \
  --name "autoscale-asp-producao" \
  --enabled true

Azure PowerShell

# Scale Up: change SKU
Set-AzAppServicePlan `
  -ResourceGroupName "rg-webapp" `
  -Name "asp-producao" `
  -Tier "PremiumV3" `
  -WorkerSize "Medium"  # P2v3

# Manual Scale Out
Set-AzAppServicePlan `
  -ResourceGroupName "rg-webapp" `
  -Name "asp-producao" `
  -NumberofWorkers 3

# Create autoscale rules
# Assumes a current Az.Monitor module, where the rule, profile, and setting
# cmdlets are New-AzAutoscaleScaleRuleObject, New-AzAutoscaleProfileObject,
# and New-AzAutoscaleSetting.
$aspId = (Get-AzAppServicePlan -ResourceGroupName "rg-webapp" -Name "asp-producao").Id

# Scale-out rule: average CPU > 75% over 5 minutes, add 2 instances
$scaleOutRule = New-AzAutoscaleScaleRuleObject `
  -MetricTriggerMetricName "CpuPercentage" `
  -MetricTriggerMetricResourceUri $aspId `
  -MetricTriggerTimeGrain ([TimeSpan]::FromMinutes(1)) `
  -MetricTriggerStatistic "Average" `
  -MetricTriggerTimeWindow ([TimeSpan]::FromMinutes(5)) `
  -MetricTriggerTimeAggregation "Average" `
  -MetricTriggerOperator "GreaterThan" `
  -MetricTriggerThreshold 75 `
  -ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
  -ScaleActionDirection "Increase" `
  -ScaleActionType "ChangeCount" `
  -ScaleActionValue 2

# Scale-in rule: average CPU < 25% over 10 minutes, remove 1 instance
$scaleInRule = New-AzAutoscaleScaleRuleObject `
  -MetricTriggerMetricName "CpuPercentage" `
  -MetricTriggerMetricResourceUri $aspId `
  -MetricTriggerTimeGrain ([TimeSpan]::FromMinutes(1)) `
  -MetricTriggerStatistic "Average" `
  -MetricTriggerTimeWindow ([TimeSpan]::FromMinutes(10)) `
  -MetricTriggerTimeAggregation "Average" `
  -MetricTriggerOperator "LessThan" `
  -MetricTriggerThreshold 25 `
  -ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
  -ScaleActionDirection "Decrease" `
  -ScaleActionType "ChangeCount" `
  -ScaleActionValue 1

# Default profile ($defaultProfile avoids overwriting the automatic $PROFILE variable)
$defaultProfile = New-AzAutoscaleProfileObject `
  -Name "default" `
  -CapacityDefault 3 `
  -CapacityMaximum 10 `
  -CapacityMinimum 2 `
  -Rule $scaleOutRule, $scaleInRule

# Apply autoscale configuration
New-AzAutoscaleSetting `
  -ResourceGroupName "rg-webapp" `
  -Location "brazilsouth" `
  -Name "autoscale-asp-producao" `
  -TargetResourceUri $aspId `
  -Profile $defaultProfile `
  -Enabled

Bicep

// App Service Plan
resource appServicePlan 'Microsoft.Web/serverfarms@2022-09-01' = {
  name: 'asp-producao'
  location: 'brazilsouth'
  sku: {
    name: 'P2V3'
    tier: 'PremiumV3'
    capacity: 3  // Initial number of instances
  }
  properties: {
    reserved: false  // false = Windows; true = Linux
  }
}

// Autoscale Settings
resource autoscale 'Microsoft.Insights/autoscaleSettings@2022-10-01' = {
  name: 'autoscale-asp-producao'
  location: 'brazilsouth'
  properties: {
    enabled: true
    targetResourceUri: appServicePlan.id
    profiles: [
      {
        name: 'default'
        capacity: {
          default: '3'
          minimum: '2'
          maximum: '10'
        }
        rules: [
          // Scale-out: CPU > 75% for 5 minutes
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT5M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 75
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '2'
              cooldown: 'PT5M'
            }
          }
          // Scale-out: HttpQueueLength > 100 for 3 minutes
          {
            metricTrigger: {
              metricName: 'HttpQueueLength'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT3M'
              timeAggregation: 'Average'
              operator: 'GreaterThan'
              threshold: 100
            }
            scaleAction: {
              direction: 'Increase'
              type: 'ChangeCount'
              value: '3'
              cooldown: 'PT3M'
            }
          }
          // Scale-in: CPU < 25% for 10 minutes
          {
            metricTrigger: {
              metricName: 'CpuPercentage'
              metricResourceUri: appServicePlan.id
              timeGrain: 'PT1M'
              statistic: 'Average'
              timeWindow: 'PT10M'
              timeAggregation: 'Average'
              operator: 'LessThan'
              threshold: 25
            }
            scaleAction: {
              direction: 'Decrease'
              type: 'ChangeCount'
              value: '1'
              cooldown: 'PT10M'
            }
          }
        ]
      }
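      // Profile for business hours.
      // In a template, a recurrence profile only defines a start time (08:00 here).
      // To return to the default settings at 18:00, add a second recurring profile
      // that starts at 18:00 on the same days and repeats the default capacity and rules.
      {
        name: 'business-hours'
        capacity: {
          default: '4'
          minimum: '4'
          maximum: '15'
        }
        rules: []  // No rules: the count is held at 4 or above, with no metric-driven changes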
        recurrence: {
          frequency: 'Week'
          schedule: {
            timeZone: 'E. South America Standard Time'
            days: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
            hours: [8]
            minutes: [0]
          }
        }
      }
    ]
  }
}

7. Control and Security

Monitor and audit scaling decisions

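# View scaling event history for the last 7 days
# (autoscale actions are recorded in the Activity Log under the Autoscale category)
az monitor activity-log list \
  --resource-group "rg-webapp" \
  --start-time "$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --query "[?category.value=='Autoscale'].{Time: eventTimestamp, Operation: operationName.localizedValue, Status: status.value}" \
  --output table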

# View current instance count
az appservice plan show \
  --resource-group "rg-webapp" \
  --name "asp-producao" \
  --query "sku.capacity" \
  --output tsv

# Send email and webhook notifications whenever autoscale acts
az monitor autoscale update \
  --resource-group "rg-webapp" \
  --name "autoscale-asp-producao" \
  --email-administrator true \
  --add-action webhook "https://hooks.slack.com/services/..."

RBAC for scaling control

To prevent development teams from accidentally altering production scaling:

# Create custom role that allows viewing but not altering scaling
az role definition create --role-definition '{
  "Name": "App Service Scaling Viewer",
  "Actions": [
    "Microsoft.Web/serverFarms/read",
    "Microsoft.Insights/autoscaleSettings/read"
  ],
  "NotActions": [
    "Microsoft.Web/serverFarms/write",
    "Microsoft.Insights/autoscaleSettings/write"
  ],
  "AssignableScopes": ["/subscriptions/<sub-id>"]
}'

8. Decision Making

Scale Up vs. Scale Out

Situation	Choice	Reason
Single-threaded app that doesn't parallelize	Scale Up	More CPU per instance, no benefit from multiple instances
Stateless web app with many requests	Scale Out	Load distribution, better cost-effectiveness
App with insufficient memory causing OOM	Scale Up	More RAM per instance
Predictable spike of concurrent users	Scale Out	More workers in parallel
Legacy app with concurrency issues	Scale Up	Avoid distribution that exposes state problems

Manual vs. Autoscale

Situation	Choice	Reason
Predictable and constant load	Manual	Avoids unnecessary autoscale overhead
Variable load throughout the day	Autoscale with schedule	Predictable, without complex metrics
Unpredictable and variable load	Autoscale with metrics	Responds to real conditions
Black Friday / special event	Fixed Date Profile	Pre-provision maximum capacity
Dev/test with low usage	Manual with minimum	Controlled cost without automatic scaling

Autoscale metric choice

Metric	When to use	Caution
CpuPercentage	CPU-bound apps (processing, calculations)	May not reflect I/O or memory issues
HttpQueueLength	Web APIs with many simultaneous requests	Ideal for web apps; indicates direct bottleneck
MemoryPercentage	Apps that load large datasets into memory	Scaling doesn't resolve memory leaks
Multiple metrics	Complex apps with mixed variation	Combine CPU and HttpQueueLength

9. Best Practices

Always configure a conservative scale-in rule. Aggressive scale-in may remove instances prematurely during still-high load. Use a longer time window (10-15 minutes) and lower threshold (CPU < 20-25%) for scale-in than for scale-out.

Configure autoscale notifications. Receive emails or webhooks when autoscale acts. This allows you to identify if autoscale is responding appropriately, if it's oscillating, or if it has reached maxCount and can no longer scale.

Test scaling before production. Run load tests that force scale-out and scale-in. Verify that cooldown is adequate, that rules are triggered at the right moments, and that apps don't have shared state issues between instances.

Use schedule profiles for predictable loads. If you know the peak is Monday to Friday from 9 AM to 5 PM, configure a schedule profile that increases the minimum to a reasonable number during those hours. This is more reliable than waiting for autoscale to react to the spike.

Separate apps of different criticalities into different Plans. If one app consumes all Plan resources during a spike, other apps on the same Plan are degraded. Production and dev/test apps should never share the same Plan.

Configure minimum with high availability in mind. A minimum of 1 instance means if that instance fails, the app becomes unavailable until autoscale creates a new one. For production, configure minimum >= 2 to ensure HA.

10. Common Errors

Error	Why it happens	How to avoid
Auto Scale doesn't work on Basic tier	Basic doesn't support autoscale	Use Standard or higher for autoscale
Scale-in removes instances during still-high load	Scale-in threshold too high or window too short	Use conservative threshold (< 25%) and long window (10+ min)
Autoscale oscillates endlessly (flapping)	Cooldown too short or overlapping thresholds	Increase cooldown; ensure thresholds don't overlap
Scale-out doesn't happen in time for sudden spike	Cooldown too long or evaluation window too long	Reduce evaluation window for rapid spikes
All apps degraded when one spikes	Different apps sharing the same Plan	Separate apps of different criticalities into different Plans
Scale Up causes downtime in apps	Not preparing for restart during scale up	Schedule scale up for a low-traffic window; deployment slots share the Plan and restart with it
Wrong metric source in rules	Selecting VM or subscription instead of App Service Plan	Verify that metricResourceUri points to the Plan, not the app
MaxCount too low without alert	Autoscale reaches limit without being able to scale further	Configure alert when instances == maxCount
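
One way to check whether scale-in and scale-out thresholds overlap is to project the per-instance metric after an instance is removed, assuming the total load stays constant and spreads evenly across the remaining instances. The functions below are a simplified sketch of that projection with hypothetical names; they are not the autoscale engine's actual algorithm.

```python
def projected_metric_after_scale_in(current_avg: float, instances: int, remove: int = 1) -> float:
    """Estimate the average per-instance metric once `remove` instances are gone."""
    remaining = instances - remove
    if remaining < 1:
        raise ValueError("at least one instance must remain")
    return current_avg * instances / remaining


def scale_in_would_flap(current_avg: float, instances: int,
                        scale_out_threshold: float, remove: int = 1) -> bool:
    """True if removing instances would push the metric back over the scale-out threshold."""
    return projected_metric_after_scale_in(current_avg, instances, remove) >= scale_out_threshold


# 3 instances at 24% CPU: removing one projects 36%, well under a 75% scale-out rule.
print(projected_metric_after_scale_in(24, 3))
print(scale_in_would_flap(24, 3, scale_out_threshold=75))
```

If a scale-in rule used a threshold of 60% with the same 75% scale-out rule, three instances at 60% would project to 90% after removal, so the Plan would scale out again immediately.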

The most costly error

Not configuring scale-in, resulting in an App Service Plan that grows during spikes but never shrinks. The Plan remains at maximum instance count indefinitely, even during periods of very low load. With 10 P2v3 instances running 24/7 when 3 would suffice, unnecessary costs can be thousands of reais per month.

11. Operation and Maintenance

Check scaling state and history

# View current Plan instance count
az appservice plan show \
  --resource-group "rg-webapp" \
  --name "asp-producao" \
  --query "{SKU: sku.name, Instances: sku.capacity, Tier: sku.tier}" \
  --output json

# View Plan CPU metrics for the last 4 hours
az monitor metrics list \
  --resource "/subscriptions/<sub-id>/resourceGroups/rg-webapp/providers/Microsoft.Web/serverFarms/asp-producao" \
  --metric "CpuPercentage" \
  --interval PT5M \
  --aggregation Average \
  --start-time "$(date -u -d '4 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --output table

# View HttpQueueLength for the last 4 hours
az monitor metrics list \
  --resource "/subscriptions/<sub-id>/resourceGroups/rg-webapp/providers/Microsoft.Web/serverFarms/asp-producao" \
  --metric "HttpQueueLength" \
  --interval PT1M \
  --aggregation Average \
  --start-time "$(date -u -d '4 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --output table

Important limits

Limit	Value
Maximum instances per Plan (Standard)	10
Maximum instances per Plan (Premium v3)	30
Maximum instances per Plan (Isolated v2)	100
Apps per App Service Plan	No defined limit (practical: dozens)
Rules per autoscale profile	10
Profiles per autoscale setting	20
Recommended minimum for production	2 (for HA)

12. Integration and Automation

Scaling with Application Insights metrics

For scaling based on application performance metrics (not just server metrics):

# Create autoscale rule using custom Application Insights metric
# (requires correct metric namespace; --resource points the rule at a
# metric source other than the Plan)
az monitor autoscale rule create \
  --resource-group "rg-webapp" \
  --autoscale-name "autoscale-asp-producao" \
  --condition "requests/duration > 2000 avg 5m" \
  --scale out 2 \
  --resource "/subscriptions/<sub-id>/resourceGroups/rg-webapp/providers/microsoft.insights/components/appinsights-ecommerce"

Deploy pipeline with pre-configured scaling

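# Azure DevOps: pre-scale before deploy, then hand control back to autoscale
# 'my-azure-connection' is a placeholder service connection name
steps:
  - task: AzureCLI@2
    displayName: 'Pre-scale for deploy'
    inputs:
      azureSubscription: 'my-azure-connection'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Disable autoscale so it does not undo the manual instance count
        az monitor autoscale update \
          --resource-group rg-webapp \
          --name autoscale-asp-producao \
          --enabled false
        # Increase instances before deploy to absorb traffic during warmup
        az appservice plan update \
          --resource-group rg-webapp \
          --name asp-producao \
          --number-of-workers 6

  - task: AzureWebApp@1
    displayName: 'Deploy application'
    inputs:
      azureSubscription: 'my-azure-connection'
      appName: 'ecommerce-api'
      package: '$(Build.ArtifactStagingDirectory)/**/*.zip'

  - task: AzureCLI@2
    displayName: 'Restore autoscale post-deploy'
    inputs:
      azureSubscription: 'my-azure-connection'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Re-enable autoscale (disabled for manual scaling above)
        az monitor autoscale update \
          --resource-group rg-webapp \
          --name autoscale-asp-producao \
          --enabled true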

13. Final Summary

Essential points:

The App Service Plan defines the tier (CPU/memory per instance) and number of instances; all apps on the Plan share these resources
Scale Up changes the SKU (e.g., S1 to P2v3): more CPU and memory per instance; requires brief restart
Scale Out changes the number of instances: more workers in parallel; no app restart
Auto Scale is available only from Standard tier onwards; Free, Shared and Basic support only manual scaling
Scale-out rules are combined with OR (any one triggers scale-out)
Scale-in rules are combined with AND (all must be true for scale-in)
Profiles allow different configurations by schedule (recurrence) or specific date (fixed date)

Critical differences:

Scale Up vs. Scale Out: scale up increases resources per instance (vertical); scale out increases number of instances (horizontal)
Manual vs. Autoscale: manual maintains fixed number; autoscale varies based on metrics or schedule
Default Profile vs. other profiles: default is the fallback when no schedule/date profile is active
Classic Autoscale vs. Automatic Scaling: Classic uses Azure Monitor metric-based rules (available on Standard and above); Automatic is the new native mechanism based on HTTP requests (Premium v2 and Premium v3 only)

What needs to be remembered for AZ-104:

Autoscale requires Standard tier or higher; Basic supports only maximum 3 instances with manual scaling
The autoscale setting is a separate resource (Microsoft.Insights/autoscaleSettings), not an App Service Plan property
HttpQueueLength metric indicates requests waiting for workers; it's the most direct metric for web apps
The default cooldown is 5 minutes for both scale-out and scale-in
Maximum horizontal scaling: 10 instances (Standard), 30 (Premium v3), 100 (Isolated v2)
The autoscale setting target resource must be the App Service Plan (Microsoft.Web/serverFarms), not the Web App
