Theoretical Foundation: Configure Scaling for an App Service Plan
1. Initial Intuition
Imagine you have a restaurant. The restaurant has a kitchen (the App Service Plan) and several dishes on the menu (the Web Apps, APIs, Function Apps hosted on it). The kitchen defines the maximum production capacity: how many chefs are available, what equipment is available. If demand increases, you can:
- Scale vertically (scale up): replace your small kitchen with a larger one, with better equipment and more chefs
- Scale horizontally (scale out): open identical kitchens in parallel, each with the same equipment
In Azure App Service, the App Service Plan is exactly that kitchen: it defines the infrastructure resources (CPU, memory, number of instances) that all apps hosted on it share. Configuring scaling for an App Service Plan means defining the rules that determine when and how this kitchen grows or shrinks.
2. Context
The relationship between App Service Plan and Apps
Critical point: all apps within an App Service Plan share the same resources. If the Plan has 3 instances and one Web App consumes all the CPU of an instance, the other apps on the same Plan are affected. This is why apps with different criticality levels should be in separate Plans.
Why configuring scaling is important
The App Service Plan has a fixed cost based on tier and number of instances, regardless of how many requests the apps receive. Without automatic scaling:
- You provision for peak: pay the maximum cost all month long
- You provision for average: during peaks, apps become slow or unavailable
With well-configured automatic scaling, the Plan grows during peaks and shrinks during low-demand hours, optimizing cost and performance simultaneously.
3. Building the Concepts
3.1 App Service Plan Tiers and scaling capability
The App Service Plan tier determines what is possible in terms of scaling:
| Tier | Scale Up (SKUs) | Scale Out (instances) | Auto Scale | Deploy Slots |
|---|---|---|---|---|
| Free (F1) | No | 1 (fixed) | No | 0 |
| Shared (D1) | No | 1 (fixed) | No | 0 |
| Basic (B1/B2/B3) | Yes | Up to 3 | Manual only | 0 |
| Standard (S1/S2/S3) | Yes | Up to 10 | Yes | 5 |
| Premium v3 (P0v3 to P3v3) | Yes | Up to 30 | Yes | 20 |
| Isolated v2 (I1v2 to I3v2) | Yes | Up to 100 | Yes | 20 |
Auto Scaling (automatic scaling) is only available from the Standard tier onwards. The Free, Shared and Basic tiers only support manual scaling.
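The tier table can also be read as a set of constraints to check before requesting a scale change. The following is a minimal sketch in Python, assuming the limits from the table above; `TierCapability`, `TIERS`, and `validate_scale_request` are hypothetical names used only for illustration, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierCapability:
    max_instances: int
    autoscale: bool
    deployment_slots: int


# Values copied from the tier table above (illustrative lookup, not an Azure API).
TIERS = {
    "Free": TierCapability(1, False, 0),
    "Shared": TierCapability(1, False, 0),
    "Basic": TierCapability(3, False, 0),
    "Standard": TierCapability(10, True, 5),
    "PremiumV3": TierCapability(30, True, 20),
    "IsolatedV2": TierCapability(100, True, 20),
}


def validate_scale_request(tier: str, instances: int, use_autoscale: bool) -> list:
    """Return a list of problems; an empty list means the request fits the tier."""
    cap = TIERS[tier]
    problems = []
    if instances > cap.max_instances:
        problems.append(f"{tier} allows at most {cap.max_instances} instance(s)")
    if use_autoscale and not cap.autoscale:
        problems.append(f"{tier} does not support autoscale")
    return problems


if __name__ == "__main__":
    print(validate_scale_request("Basic", 4, True))     # two problems reported
    print(validate_scale_request("Standard", 8, True))  # no problems
```

A check like this is only a planning aid; the Azure platform enforces the real limits when the change is applied.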
3.2 Scale Up vs. Scale Out
Scale Up (Vertical Scaling): changes the App Service Plan SKU to one with more CPU and memory. For example, from B1 (1 vCPU, 1.75 GB) to P1v3 (2 vCPU, 8 GB). All apps in the Plan receive more resources once the change completes.
Scale Out (Horizontal Scaling): increases the number of App Service Plan instances. With 3 instances, apps have 3 workers processing requests in parallel. Azure automatically distributes requests across instances.
3.3 Manual Scaling vs. Auto Scaling
Manual Scaling: you define a fixed number of instances. The Plan maintains exactly that number indefinitely, regardless of load.
Auto Scaling: Azure automatically adjusts the number of instances based on rules or metrics. It consists of:
- Scale-out rule (when to add instances): condition that triggers an instance increase
- Scale-in rule (when to remove instances): condition that triggers an instance reduction
- Min/Max instances: limits that auto scaling respects
- Default capacity: instance count used when autoscale cannot read the metrics it needs to evaluate the rules
3.4 Types of Auto Scaling in App Service
App Service has two auto scaling engines that need to be distinguished:
Classic Autoscale (Azure Monitor Autoscale):
- Based on Azure Monitor metrics
- Customizable rules for CPU, memory, custom metrics
- Configurable cooldown
- Scheduling (scale at specific times)
- Available in Standard and above
Automatic Scaling (preview/GA in 2024):
- New native App Service mechanism
- Based on concurrent HTTP requests
- Simpler, no need to configure explicit rules
- Azure automatically manages scaling based on traffic
- Available only in the Premium v2 and Premium v3 tiers
For AZ-104, the focus is on Classic Autoscale which is more widely available and more frequently tested.
3.5 Structure of an Auto Scale rule (Classic)
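A classic autoscale rule pairs a metric trigger (which metric, over which time window, compared against which threshold) with a scale action (direction, amount, cooldown). The sketch below models that pairing in Python, mirroring the fields used in the Bicep and PowerShell examples later in this document. The class and function names are hypothetical, and the evaluation assumes one metric sample per minute, which is a simplification of what Azure Monitor does.

```python
from dataclasses import dataclass
from operator import gt, lt
from statistics import mean

COMPARATORS = {"GreaterThan": gt, "LessThan": lt}


@dataclass(frozen=True)
class MetricTrigger:
    metric_name: str       # e.g. "CpuPercentage"
    comparison: str        # "GreaterThan" or "LessThan"
    threshold: float
    time_window_min: int   # number of 1-minute samples to average


@dataclass(frozen=True)
class ScaleAction:
    direction: str         # "Increase" or "Decrease"
    value: int             # instances to add or remove
    cooldown_min: int


@dataclass(frozen=True)
class ScaleRule:
    trigger: MetricTrigger
    action: ScaleAction


def rule_fires(rule: ScaleRule, samples: list) -> bool:
    """True when the average over the time window crosses the threshold."""
    window = samples[-rule.trigger.time_window_min:]
    return COMPARATORS[rule.trigger.comparison](mean(window), rule.trigger.threshold)


def apply_action(rule: ScaleRule, current: int, minimum: int, maximum: int) -> int:
    """Apply the scale action, clamped to the profile's min/max instances."""
    step = rule.action.value if rule.action.direction == "Increase" else -rule.action.value
    return max(minimum, min(maximum, current + step))


if __name__ == "__main__":
    scale_out = ScaleRule(
        MetricTrigger("CpuPercentage", "GreaterThan", 75, 5),
        ScaleAction("Increase", 2, 5),
    )
    cpu = [60, 70, 80, 85, 90, 95]  # last five samples average 84
    if rule_fires(scale_out, cpu):
        print(apply_action(scale_out, current=9, minimum=2, maximum=10))
```

Note how the clamp in `apply_action` reflects the min/max limits: a rule that asks for 2 more instances at 9 of 10 only adds one.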
3.6 Auto Scale Profiles
A Profile groups rules and capacities for a specific context. There are three types:
Default Profile: always active when no other profile applies. Every autoscale configuration must have a default profile.
Fixed Date Profile: active on a specific date/period (e.g., Black Friday 11/25). Replaces the default during that period.
Recurrence Profile: active at recurring times (e.g., Monday to Friday 8am to 6pm). Replaces the default during these periods.
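To make the relationship between the three profile types concrete, here is a small Python sketch of how the active profile could be chosen at a given moment. It assumes that a fixed date profile wins over a recurrence profile, and that the default applies only when neither matches; the text above only states that both replace the default, so treat that ordering as an assumption. `Profile` and `active_profile` are hypothetical names, and time zones are ignored.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass(frozen=True)
class Profile:
    name: str
    kind: str                           # "fixed", "recurrence" or "default"
    start: Optional[datetime] = None    # fixed date window
    end: Optional[datetime] = None
    days: tuple = ()                    # weekday numbers, Monday = 0
    start_time: Optional[time] = None   # recurrence window within a day
    end_time: Optional[time] = None


def active_profile(profiles: list, now: datetime) -> Profile:
    # Assumption: fixed date profiles are checked first.
    for p in profiles:
        if p.kind == "fixed" and p.start <= now <= p.end:
            return p
    # Then recurrence profiles that match the weekday and time of day.
    for p in profiles:
        if (p.kind == "recurrence" and now.weekday() in p.days
                and p.start_time <= now.time() < p.end_time):
            return p
    # Otherwise fall back to the default profile.
    return next(p for p in profiles if p.kind == "default")


PROFILES = [
    Profile("default", "default"),
    Profile("business-hours", "recurrence", days=(0, 1, 2, 3, 4),
            start_time=time(8, 0), end_time=time(18, 0)),
    Profile("black-friday-2026", "fixed",
            start=datetime(2026, 11, 27, 0, 0), end=datetime(2026, 11, 28, 23, 59)),
]

if __name__ == "__main__":
    for moment in (datetime(2026, 11, 27, 10, 0),   # inside the fixed window
                   datetime(2026, 11, 25, 10, 0),   # Wednesday, business hours
                   datetime(2026, 11, 22, 10, 0)):  # Sunday
        print(moment, "->", active_profile(PROFILES, moment).name)
```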
4. Structural View
Complete scaling flow in App Service
Scaling hierarchy in App Service
5. How it Works in Practice
When rules are evaluated
The Auto Scale Engine evaluates rules every 30-60 seconds. For each evaluation, it checks if any scale-out or scale-in condition is true and acts according to the rules.
Behavior with multiple scale-out rules: If ANY scale-out rule is true, scaling occurs. Scale-out rules are combined with OR logic: just one needs to be true.
Behavior with multiple scale-in rules: ALL scale-in rules must be true simultaneously for scale-in to occur. Scale-in rules are combined with AND logic: all must be true.
This asymmetric behavior is intentional: Azure is conservative in scale-in to avoid removing instances precipitously.
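The OR/AND asymmetry can be expressed in a few lines. This Python sketch takes the already-evaluated conditions of each rule as booleans; `decide` is a hypothetical helper, and checking scale-out before scale-in is an assumption made here for illustration.

```python
def decide(scale_out_conditions: list, scale_in_conditions: list) -> str:
    """Combine rule results: OR for scale-out, AND for scale-in."""
    # Any single scale-out rule being true is enough to scale out.
    if any(scale_out_conditions):
        return "scale-out"
    # Every scale-in rule must be true; an empty rule list never scales in.
    if scale_in_conditions and all(scale_in_conditions):
        return "scale-in"
    return "none"


if __name__ == "__main__":
    # CPU rule is true, queue rule is false: scale-out still happens.
    print(decide([True, False], [False, False]))
    # CPU is low but the queue is not: no scale-in.
    print(decide([False, False], [True, False]))
    # Both scale-in rules agree: scale-in.
    print(decide([False, False], [True, True]))
```

The guard on the empty list matters because `all([])` is `True` in Python; without it, a profile with no scale-in rules would appear to request scale-in on every evaluation.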
Cooldown and Flap Avoidance
The cooldown is the period after a scaling action during which no new scaling action is taken. This prevents rapid oscillations (scale-out, scale-in, scale-out in rapid sequence).
By default:
- Scale-out cooldown: 5 minutes
- Scale-in cooldown: 5 minutes
For workloads with very rapid peaks, it may be necessary to reduce the scale-out cooldown.
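The effect of the cooldown on reaction speed is easier to see in a simulation. The Python sketch below is a simplified model, not the Azure engine: it evaluates one CPU reading per minute (instead of a windowed average), changes one instance per action, and uses a single cooldown for both directions. `simulate` and its parameters are hypothetical.

```python
def simulate(cpu_by_minute, start=2, minimum=2, maximum=10,
             out_threshold=75, in_threshold=25, cooldown_min=5):
    """Return the instance count after each minute of the CPU trace."""
    instances = start
    last_action = None  # minute of the most recent scaling action
    history = []
    for minute, cpu in enumerate(cpu_by_minute):
        in_cooldown = last_action is not None and minute - last_action < cooldown_min
        if not in_cooldown:
            if cpu > out_threshold and instances < maximum:
                instances += 1
                last_action = minute
            elif cpu < in_threshold and instances > minimum:
                instances -= 1
                last_action = minute
        history.append(instances)
    return history


if __name__ == "__main__":
    spike = [90] * 12  # twelve minutes of sustained high CPU
    print(simulate(spike, cooldown_min=5))
    print(simulate(spike, cooldown_min=1))
```

With a 5-minute cooldown the plan gains only three instances during the 12-minute spike, while a 1-minute cooldown reaches the maximum within eight minutes. The trade-off is that short cooldowns make oscillation more likely when the load hovers near a threshold.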
Non-obvious behaviors
Scale Up requires a brief app restart. When changing the App Service Plan SKU, workers are reallocated to different hardware, and apps experience a brief interruption during this transition. Deployment slots run on the same Plan as the production app, so swapping slots does not avoid this restart; schedule scale-up for a low-traffic window instead.
Scale out does not require restart. Adding instances is transparent to apps. New instances are created in parallel and start receiving traffic from the load balancer without interrupting existing ones.
All apps in the Plan scale together during scale out. If the Plan has 3 apps and scales from 2 to 3 instances, all 3 apps get 3 instances. It's not possible to have one app with 3 instances and another with 2 in the same Plan.
The App Service Plan charges by number of instances, not by usage. A P2v3 Plan with 3 active instances charges the same amount whether the instances are at 100% CPU or 0%. Auto scaling doesn't reduce cost during peaks, but reduces cost during valleys by scaling in to the minimum.
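Because billing follows instance-hours rather than load, the saving from autoscale comes entirely from the hours spent at a lower instance count. The following Python sketch works through that arithmetic. The price of 0.50 per instance-hour is a made-up figure for illustration only, and `monthly_cost` is a hypothetical helper; use the Azure pricing page for real numbers.

```python
def monthly_cost(instances_by_hour, price_per_instance_hour, days=30):
    """Cost of a daily instance schedule repeated for `days` days."""
    if len(instances_by_hour) != 24:
        raise ValueError("expected one instance count per hour of the day")
    return sum(instances_by_hour) * price_per_instance_hour * days


if __name__ == "__main__":
    price = 0.50  # hypothetical price per instance-hour

    # Provisioned for peak: 6 instances around the clock.
    fixed_peak = [6] * 24

    # Autoscaled: 2 instances for 16 quiet hours, 6 instances for 8 busy hours.
    autoscaled = [2] * 16 + [6] * 8

    print("fixed peak :", monthly_cost(fixed_peak, price))
    print("autoscaled :", monthly_cost(autoscaled, price))
```

Under these assumed numbers the fixed plan consumes 144 instance-hours per day and the autoscaled plan 80, and both cost the same during the 8 peak hours.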
The default profile is the fallback, not the primary. A common confusion: the "default" profile is activated when no other profile (time-specific or date-specific) applies. It's the base state, not a special configuration.
Metrics available for scaling
| Metric | Azure Name | Typical Use |
|---|---|---|
| CPU Percentage | CpuPercentage | Most common; scales on processing load |
| Memory Percentage | MemoryPercentage | Scales when apps consume too much memory |
| Disk Queue Length | DiskQueueLength | I/O intensive |
| Http Queue Length | HttpQueueLength | Requests waiting for available worker |
| Bytes Received | BytesReceived | Scales by incoming traffic volume |
| Bytes Sent | BytesSent | Scales by response volume |
HttpQueueLength is often the most useful metric for web apps: it indicates there are requests waiting in the IIS/Kestrel queue because there are no available workers. When this metric grows, it's a clear sign that more instances are needed.
6. Implementation Methods
Azure Portal
When to use: initial configuration, specific adjustments, viewing scaling history
For Scale Up (change SKU):
- Portal > App Service Plan > Scale up (App Service plan)
- Select the new tier/SKU
- Apply
For manual Scale Out:
- Portal > App Service Plan > Scale out (App Service plan)
- Select Manual scale
- Define number of instances
- Save
For Auto Scale:
- Portal > App Service Plan > Scale out (App Service plan)
- Select Custom autoscale
- Define Min/Max/Default instances
- Add scale-out rule:
  - Metric source: App Service Plan
  - Metric: CpuPercentage
  - Operator: Greater than
  - Threshold: 75
  - Duration: 5 minutes
  - Action: Increase count by 2
  - Cool down: 5 minutes
- Add scale-in rule (symmetric)
- Save
Azure CLI
# Scale Up: change the App Service Plan SKU
az appservice plan update \
--resource-group "rg-webapp" \
--name "asp-producao" \
--sku P2V3
# Manual Scale Out: set fixed number of instances
az appservice plan update \
--resource-group "rg-webapp" \
--name "asp-producao" \
--number-of-workers 3
# View current App Service Plan configuration
az appservice plan show \
--resource-group "rg-webapp" \
--name "asp-producao" \
--query "{SKU: sku.name, Workers: sku.capacity, Tier: sku.tier}" \
--output json
# Configure Autoscale via Azure Monitor
ASP_ID=$(az appservice plan show \
--resource-group "rg-webapp" \
--name "asp-producao" \
--query "id" --output tsv)
# Create autoscale configuration with default profile
# (--resource receives the full resource ID, so --resource-type is not passed)
az monitor autoscale create \
--resource-group "rg-webapp" \
--resource "$ASP_ID" \
--name "autoscale-asp-producao" \
--min-count 2 \
--max-count 10 \
--count 3
# Add scale-out rule (CPU > 75%)
az monitor autoscale rule create \
--resource-group "rg-webapp" \
--autoscale-name "autoscale-asp-producao" \
--scale out 2 \
--condition "CpuPercentage > 75 avg 5m"
# Add scale-in rule (CPU < 25%)
az monitor autoscale rule create \
--resource-group "rg-webapp" \
--autoscale-name "autoscale-asp-producao" \
--scale in 1 \
--condition "CpuPercentage < 25 avg 10m"
# Add rule based on HttpQueueLength
az monitor autoscale rule create \
--resource-group "rg-webapp" \
--autoscale-name "autoscale-asp-producao" \
--scale out 3 \
--condition "HttpQueueLength > 100 avg 3m"
# Create business hours profile (more instances Mon to Fri)
az monitor autoscale profile create \
--autoscale-name "autoscale-asp-producao" \
--resource-group "rg-webapp" \
--name "horario-comercial" \
--min-count 4 \
--max-count 10 \
--count 4 \
--start "2026-01-01 08:00" \
--end "2027-12-31 18:00" \
--recurrence week mo tu we th fr \
--timezone "E. South America Standard Time"
# Create Black Friday profile (specific date, more capacity)
az monitor autoscale profile create \
--autoscale-name "autoscale-asp-producao" \
--resource-group "rg-webapp" \
--name "black-friday-2026" \
--min-count 8 \
--max-count 20 \
--count 10 \
--start "2026-11-27 00:00" \
--end "2026-11-28 23:59" \
--timezone "E. South America Standard Time"
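# View scaling history (autoscale events are recorded in the Activity Log)
az monitor activity-log list \
--resource-group "rg-webapp" \
--offset 7d \
--query "[?category.value=='Autoscale'].{Time: eventTimestamp, Operation: operationName.value, Status: status.value}" \
--output table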
# Check current autoscale configuration
az monitor autoscale show \
--resource-group "rg-webapp" \
--name "autoscale-asp-producao" \
--output json
# Temporarily disable autoscale (keeps rules)
az monitor autoscale update \
--resource-group "rg-webapp" \
--name "autoscale-asp-producao" \
--enabled false
# Re-enable autoscale
az monitor autoscale update \
--resource-group "rg-webapp" \
--name "autoscale-asp-producao" \
--enabled true
Azure PowerShell
# Scale Up: change SKU
Set-AzAppServicePlan `
-ResourceGroupName "rg-webapp" `
-Name "asp-producao" `
-Tier "PremiumV3" `
-WorkerSize "Medium" # P2v3
# Manual Scale Out
Set-AzAppServicePlan `
-ResourceGroupName "rg-webapp" `
-Name "asp-producao" `
-NumberofWorkers 3
# Create autoscale rules
$aspId = (Get-AzAppServicePlan -ResourceGroupName "rg-webapp" -Name "asp-producao").Id
# Scale-out rule
$scaleOutRule = New-AzAutoscaleRule `
-MetricName "CpuPercentage" `
-MetricResourceId $aspId `
-TimeGrain ([TimeSpan]::FromMinutes(1)) `
-Statistic Average `
-TimeWindow ([TimeSpan]::FromMinutes(5)) `
-TimeAggregationOperator Average `
-Operator GreaterThan `
-Threshold 75 `
-ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
-ScaleActionDirection Increase `
-ScaleActionValue 2 `
-ScaleActionType ChangeCount
# Scale-in rule
$scaleInRule = New-AzAutoscaleRule `
-MetricName "CpuPercentage" `
-MetricResourceId $aspId `
-TimeGrain ([TimeSpan]::FromMinutes(1)) `
-Statistic Average `
-TimeWindow ([TimeSpan]::FromMinutes(10)) `
-TimeAggregationOperator Average `
-Operator LessThan `
-Threshold 25 `
-ScaleActionCooldown ([TimeSpan]::FromMinutes(5)) `
-ScaleActionDirection Decrease `
-ScaleActionValue 1 `
-ScaleActionType ChangeCount
# Default profile
$profile = New-AzAutoscaleProfile `
-DefaultCapacity 3 `
-MaximumCapacity 10 `
-MinimumCapacity 2 `
-Rule @($scaleOutRule, $scaleInRule) `
-Name "default"
# Apply autoscale configuration
Add-AzAutoscaleSetting `
-ResourceGroupName "rg-webapp" `
-Location "brazilsouth" `
-Name "autoscale-asp-producao" `
-TargetResourceId $aspId `
-AutoscaleProfile $profile
Bicep
// App Service Plan
resource appServicePlan 'Microsoft.Web/serverfarms@2022-09-01' = {
name: 'asp-producao'
location: 'brazilsouth'
sku: {
name: 'P2V3'
tier: 'PremiumV3'
capacity: 3 // Initial number of instances
}
properties: {
reserved: false // false = Windows; true = Linux
}
}
// Autoscale Settings
resource autoscale 'Microsoft.Insights/autoscaleSettings@2022-10-01' = {
name: 'autoscale-asp-producao'
location: 'brazilsouth'
properties: {
enabled: true
targetResourceUri: appServicePlan.id
profiles: [
{
name: 'default'
capacity: {
default: '3'
minimum: '2'
maximum: '10'
}
rules: [
// Scale-out: CPU > 75% for 5 minutes
{
metricTrigger: {
metricName: 'CpuPercentage'
metricResourceUri: appServicePlan.id
timeGrain: 'PT1M'
statistic: 'Average'
timeWindow: 'PT5M'
timeAggregation: 'Average'
operator: 'GreaterThan'
threshold: 75
}
scaleAction: {
direction: 'Increase'
type: 'ChangeCount'
value: '2'
cooldown: 'PT5M'
}
}
// Scale-out: HttpQueueLength > 100 for 3 minutes
{
metricTrigger: {
metricName: 'HttpQueueLength'
metricResourceUri: appServicePlan.id
timeGrain: 'PT1M'
statistic: 'Average'
timeWindow: 'PT3M'
timeAggregation: 'Average'
operator: 'GreaterThan'
threshold: 100
}
scaleAction: {
direction: 'Increase'
type: 'ChangeCount'
value: '3'
cooldown: 'PT3M'
}
}
// Scale-in: CPU < 25% for 10 minutes
{
metricTrigger: {
metricName: 'CpuPercentage'
metricResourceUri: appServicePlan.id
timeGrain: 'PT1M'
statistic: 'Average'
timeWindow: 'PT10M'
timeAggregation: 'Average'
operator: 'LessThan'
threshold: 25
}
scaleAction: {
direction: 'Decrease'
type: 'ChangeCount'
value: '1'
cooldown: 'PT10M'
}
}
]
}
// Profile for business hours
{
name: 'business-hours'
capacity: {
default: '4'
minimum: '4'
maximum: '15'
}
rules: [] // No additional rules; fixed scale of 4 during hours
recurrence: {
frequency: 'Week'
schedule: {
timeZone: 'E. South America Standard Time'
days: ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
hours: [8]
minutes: [0]
}
}
}
]
}
}
7. Control and Security
Monitor and audit scaling decisions
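# View scaling event history for the last 7 days (read from the Activity Log)
az monitor activity-log list \
--resource-group "rg-webapp" \
--start-time "$(date -u -d '7 days ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--query "[?category.value=='Autoscale'].{Time: eventTimestamp, Operation: operationName.value, Status: status.value}" \
--output table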
# View current instance count
az appservice plan show \
--resource-group "rg-webapp" \
--name "asp-producao" \
--query "sku.capacity" \
--output tsv
# Send notifications when autoscale acts (email to subscription administrators plus a webhook)
az monitor autoscale update \
--resource-group "rg-webapp" \
--name "autoscale-asp-producao" \
--email-administrator true \
--add-action webhook "https://hooks.slack.com/services/..."
RBAC for scaling control
To prevent development teams from accidentally altering production scaling:
# Create custom role that allows viewing but not altering scaling
az role definition create --role-definition '{
"Name": "App Service Scaling Viewer",
"Actions": [
"Microsoft.Web/serverFarms/read",
"Microsoft.Insights/autoscaleSettings/read"
],
"NotActions": [
"Microsoft.Web/serverFarms/write",
"Microsoft.Insights/autoscaleSettings/write"
],
"AssignableScopes": ["/subscriptions/<sub-id>"]
}'
8. Decision Making
Scale Up vs. Scale Out
| Situation | Choice | Reason |
|---|---|---|
| Single-threaded app that doesn't parallelize | Scale Up | More CPU per instance, no benefit from multiple instances |
| Stateless web app with many requests | Scale Out | Load distribution, better cost-effectiveness |
| App with insufficient memory causing OOM | Scale Up | More RAM per instance |
| Predictable spike of concurrent users | Scale Out | More workers in parallel |
| Legacy app with concurrency issues | Scale Up | Avoid distribution that exposes state problems |
Manual vs. Autoscale
| Situation | Choice | Reason |
|---|---|---|
| Predictable and constant load | Manual | Avoids unnecessary autoscale overhead |
| Variable load throughout the day | Autoscale with schedule | Predictable, without complex metrics |
| Unpredictable and variable load | Autoscale with metrics | Responds to real conditions |
| Black Friday / special event | Fixed Date Profile | Pre-provision maximum capacity |
| Dev/test with low usage | Manual with minimum | Controlled cost without automatic scaling |
Autoscale metric choice
| Metric | When to use | Caution |
|---|---|---|
| CpuPercentage | CPU-bound apps (processing, calculations) | May not reflect I/O or memory issues |
| HttpQueueLength | Web APIs with many simultaneous requests | Ideal for web apps; indicates direct bottleneck |
| MemoryPercentage | Apps that load large datasets into memory | Scaling doesn't resolve memory leaks |
| Multiple metrics | Complex apps with mixed variation | Combine CPU and HttpQueueLength |
9. Best Practices
Always configure a conservative scale-in rule. Aggressive scale-in may remove instances prematurely during still-high load. Use a longer time window (10-15 minutes) and lower threshold (CPU < 20-25%) for scale-in than for scale-out.
Configure autoscale notifications. Receive emails or webhooks when autoscale acts. This allows you to identify if autoscale is responding appropriately, if it's oscillating, or if it has reached maxCount and can no longer scale.
Test scaling before production. Run load tests that force scale-out and scale-in. Verify that cooldown is adequate, that rules are triggered at the right moments, and that apps don't have shared state issues between instances.
Use schedule profiles for predictable loads. If you know the peak is Monday to Friday from 9 AM to 5 PM, configure a schedule profile that increases the minimum to a reasonable number during those hours. This is more reliable than waiting for autoscale to react to the spike.
Separate apps of different criticalities into different Plans. If one app consumes all Plan resources during a spike, other apps on the same Plan are degraded. Production and dev/test apps should never share the same Plan.
Configure minimum with high availability in mind. A minimum of 1 instance means if that instance fails, the app becomes unavailable until autoscale creates a new one. For production, configure minimum >= 2 to ensure HA.
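The advice on conservative scale-in can be checked with simple arithmetic: removing an instance concentrates the same load on fewer workers, so the per-instance CPU rises. The Python sketch below estimates that projected value and compares it with the scale-out threshold. It assumes load redistributes evenly across the remaining instances, which real workloads only approximate; both function names are hypothetical.

```python
def projected_cpu_after_scale_in(avg_cpu, instances, remove=1):
    """Estimate average CPU per instance after removing `remove` instances."""
    remaining = instances - remove
    if remaining < 1:
        raise ValueError("cannot scale in below one instance")
    return avg_cpu * instances / remaining


def scale_in_is_safe(avg_cpu, instances, out_threshold, remove=1):
    """True when scale-in would not immediately re-trigger scale-out."""
    return projected_cpu_after_scale_in(avg_cpu, instances, remove) < out_threshold


if __name__ == "__main__":
    # Scale-in threshold set too high (e.g. CPU < 45): 2 instances at 40% CPU
    # become 1 instance at roughly 80%, above a 75% scale-out threshold.
    print(projected_cpu_after_scale_in(40, 2), scale_in_is_safe(40, 2, 75))

    # Conservative threshold (CPU < 25): 3 instances at 24% become 2 at 36%.
    print(projected_cpu_after_scale_in(24, 3), scale_in_is_safe(24, 3, 75))
```

The first case is the flapping scenario: scale-in fires, the projected load crosses the scale-out threshold, and the plan scales out again after the cooldown.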
10. Common Errors
| Error | Why it happens | How to avoid |
|---|---|---|
| Auto Scale doesn't work on Basic tier | Basic doesn't support autoscale | Use Standard or higher for autoscale |
| Scale-in removes instances during still-high load | Scale-in threshold too high or window too short | Use conservative threshold (< 25%) and long window (10+ min) |
| Autoscale oscillates endlessly (flapping) | Cooldown too short or overlapping thresholds | Increase cooldown; ensure thresholds don't overlap |
| Scale-out doesn't happen in time for sudden spike | Cooldown too long or evaluation window too long | Reduce evaluation window for rapid spikes |
| All apps degraded when one spikes | Different apps sharing the same Plan | Separate apps of different criticalities into different Plans |
| Scale Up causes downtime in apps | Not preparing for restart during scale up | Schedule scale-up for a low-traffic window; deployment slots share the Plan and restart with it |
| Wrong metric source in rules | Selecting VM or subscription instead of App Service Plan | Verify that metricResourceUri points to the Plan, not the app |
| MaxCount too low without alert | Autoscale reaches limit without being able to scale further | Configure alert when instances == maxCount |
The most costly error
Not configuring scale-in, resulting in an App Service Plan that grows during spikes but never shrinks. The Plan remains at maximum instance count indefinitely, even during periods of very low load. With 10 P2v3 instances running 24/7 when 3 would suffice, unnecessary costs can be thousands of reais per month.
11. Operation and Maintenance
Check scaling state and history
# View current Plan instance count
az appservice plan show \
--resource-group "rg-webapp" \
--name "asp-producao" \
--query "{SKU: sku.name, Instances: sku.capacity, Tier: sku.tier}" \
--output json
# View Plan CPU metrics for the last 4 hours
az monitor metrics list \
--resource "/subscriptions/<sub-id>/resourceGroups/rg-webapp/providers/Microsoft.Web/serverFarms/asp-producao" \
--metric "CpuPercentage" \
--interval PT5M \
--aggregation Average \
--start-time "$(date -u -d '4 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--output table
# View HttpQueueLength for the last 4 hours
az monitor metrics list \
--resource "/subscriptions/<sub-id>/resourceGroups/rg-webapp/providers/Microsoft.Web/serverFarms/asp-producao" \
--metric "HttpQueueLength" \
--interval PT1M \
--aggregation Average \
--start-time "$(date -u -d '4 hours ago' +%Y-%m-%dT%H:%M:%SZ)" \
--end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--output table
Important limits
| Limit | Value |
|---|---|
| Maximum instances per Plan (Standard) | 10 |
| Maximum instances per Plan (Premium v3) | 30 |
| Maximum instances per Plan (Isolated v2) | 100 |
| Apps per App Service Plan | No defined limit (practical: dozens) |
| Rules per autoscale profile | 10 |
| Profiles per autoscale setting | 20 |
| Recommended minimum for production | 2 (for HA) |
12. Integration and Automation
Scaling with Application Insights metrics
For scaling based on application performance metrics (not just server metrics):
# Create autoscale rule using an Application Insights metric
# (--resource points the rule at the metric source instead of the Plan;
# requires the correct metric name for that resource)
az monitor autoscale rule create \
--resource-group "rg-webapp" \
--autoscale-name "autoscale-asp-producao" \
--condition "requests/duration > 2000 avg 5m" \
--scale out 2 \
--resource "/subscriptions/<sub-id>/resourceGroups/rg-webapp/providers/microsoft.insights/components/appinsights-ecommerce"
Deploy pipeline with pre-configured scaling
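# Azure DevOps: pre-scale before deploy, restore autoscale afterwards
# '<service-connection>' is a placeholder for your Azure service connection name
steps:
  - task: AzureCLI@2
    displayName: 'Pre-scale for deploy'
    inputs:
      azureSubscription: '<service-connection>'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Disable autoscale first so it does not override the manual instance count
        az monitor autoscale update \
          --resource-group rg-webapp \
          --name autoscale-asp-producao \
          --enabled false
        # Increase instances before deploy to absorb traffic during warmup
        az appservice plan update \
          --resource-group rg-webapp \
          --name asp-producao \
          --number-of-workers 6
  - task: AzureWebApp@1
    displayName: 'Deploy application'
    inputs:
      azureSubscription: '<service-connection>'
      appName: 'ecommerce-api'
      package: '$(Build.ArtifactStagingDirectory)/**/*.zip'
  - task: AzureCLI@2
    displayName: 'Restore autoscale post-deploy'
    inputs:
      azureSubscription: '<service-connection>'
      scriptType: 'bash'
      scriptLocation: 'inlineScript'
      inlineScript: |
        # Re-enable autoscale (disabled in the pre-scale step)
        az monitor autoscale update \
          --resource-group rg-webapp \
          --name autoscale-asp-producao \
          --enabled true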
13. Final Summary
Essential points:
- The App Service Plan defines the tier (CPU/memory per instance) and number of instances; all apps on the Plan share these resources
- Scale Up changes the SKU (e.g., S1 to P2v3): more CPU and memory per instance; requires a brief restart
- Scale Out changes the number of instances: more workers in parallel; no app restart
- Auto Scale is available only from Standard tier onwards; Free, Shared and Basic support only manual scaling
- Scale-out rules are combined with OR (any one triggers scale-out)
- Scale-in rules are combined with AND (all must be true for scale-in)
- Profiles allow different configurations by schedule (recurrence) or specific date (fixed date)
Critical differences:
- Scale Up vs. Scale Out: scale up increases resources per instance (vertical); scale out increases number of instances (horizontal)
- Manual vs. Autoscale: manual maintains fixed number; autoscale varies based on metrics or schedule
- Default Profile vs. other profiles: default is the fallback when no schedule/date profile is active
- Classic Autoscale vs. Automatic Scaling: Classic uses Azure Monitor metric-based rules (available from Standard upwards); Automatic is the newer native mechanism based on HTTP traffic (Premium v2 and Premium v3 only)
What needs to be remembered for AZ-104:
- Autoscale requires Standard tier or higher; Basic supports only maximum 3 instances with manual scaling
- The autoscale setting is a separate resource (Microsoft.Insights/autoscaleSettings), not an App Service Plan property
- The HttpQueueLength metric indicates requests waiting for workers; it's the most direct metric for web apps
- Default CoolDown is 5 minutes for both scale-out and scale-in
- Maximum horizontal scaling: 10 instances (Standard), 30 (Premium v3), 100 (Isolated v2)
- The autoscale setting target resource must be the App Service Plan (Microsoft.Web/serverFarms), not the Web App