
Troubleshooting Lab: Configure scaling for an App Service plan

Diagnostic Scenarios

Scenario 1: Root Cause

A production web application is hosted on a Standard S2 App Service plan in the East US region. The operations team configured autoscale two weeks ago without issues. Last Monday, after a traffic spike, the team received slowness alerts. Upon investigation, they found the following in the autoscale activity log:

[2026-03-23 14:02] Autoscale evaluation triggered
[2026-03-23 14:02] Current instances: 3
[2026-03-23 14:02] Metric: CpuPercentage = 82%
[2026-03-23 14:02] Rule matched: scale out (threshold: 75%)
[2026-03-23 14:02] Maximum instance count reached: 3
[2026-03-23 14:02] Scale out action skipped
[2026-03-23 14:07] Autoscale evaluation triggered
[2026-03-23 14:07] Current instances: 3
[2026-03-23 14:07] Metric: CpuPercentage = 88%
[2026-03-23 14:07] Rule matched: scale out (threshold: 75%)
[2026-03-23 14:07] Maximum instance count reached: 3
[2026-03-23 14:07] Scale out action skipped

The administrator also reports that the plan was migrated from S1 to S2 three days before the incident, and that no other configuration was changed in the process. The Azure subscription shows no quota alerts.

What is the root cause of the inability to scale horizontally during the spike?

A) The migration from S1 to S2 reset the autoscale rules, which reverted to default values of maximum 1 instance.

B) The autoscale profile has the maximum instance limit configured as 3, preventing new additions regardless of load.

C) The cool down period between evaluations is too short, causing consecutive actions to be suppressed before reaching the actual maximum.

D) The S2 tier has a lower platform limit than S1 in terms of simultaneous instances supported by autoscale.


Scenario 2: Action Decision

The platform team identified that an App Service plan in production has autoscale correctly configured, but instances take about 8 to 10 minutes to become fully operational after a scale out event. During this interval, the application shows increased latency and HTTP 503 errors.

The cause has been identified: the application performs heavy initialization on startup, loading large volumes of data into cache and establishing connections with multiple external services. The environment uses a Premium P2v3 plan, with a minimum of 1 instance and a maximum of 5.

It's 11 AM on a Tuesday. The system is in peak hours. The team doesn't have access to the application code at this moment and cannot change the initialization logic.

What is the correct action to take now?

A) Reduce the CPU threshold of the scale out rule from 70% to 40%, so that scaling is triggered before saturation and instances are ready in time.

B) Enable Always On in the App Service to ensure that existing instances are never shut down during peak periods.

C) Increase the minimum number of instances in the autoscale profile to a value that covers peak demand without depending on reactive scale out.

D) Configure a recurring profile for autoscale during peak hours with elevated minimum instances, applicable starting the next day.


Scenario 3: Root Cause

An administrator receives a ticket reporting that the Azure portal doesn't display the Scale out (App Service plan) option for a specific application. The administrator checks the configuration and collects the following information:

App Service plan: myapp-plan
Tier: Basic B2
Region: Brazil South
OS: Windows
Runtime: .NET 8
Instances: 1

The administrator also notes that the application has Application Insights enabled and that diagnostic logs are being sent to a Log Analytics Workspace. He checks for any resource locks in the subscription and finds none. The account used has the Contributor role on the resource group.

What is the root cause of the missing autoscale option?

A) The absence of a diagnostic setting with a Storage Account destination prevents the collection of metrics necessary for autoscale.

B) The App Service plan is on Basic tier, which supports manual horizontal scaling but doesn't support autoscale based on rules and metrics.

C) The Brazil South region has capacity restrictions that disable autoscale for Basic and Standard tier plans.

D) The Contributor role doesn't have sufficient permission to view or configure autoscale on App Service plans.


Scenario 4: Collateral Impact

An administrator identifies that the production App Service plan is constantly at the maximum instance limit (5 of 5) during business hours, never scaling down. The scale in rule is configured as follows:

Metric:     CPU Percentage
Operator: Less than
Threshold: 30%
Duration: 10 minutes
Action: Decrease count by 1
Cool down: 20 minutes

After investigation, the administrator concludes that the scale in threshold is too low for the application's usage pattern: CPU rarely drops below 40% even with low traffic, so the 30% condition is almost never met. To resolve this, he raises the scale in threshold from 30% to 50%.

The application has no local state on disk. The team approves the change.

What secondary consequence can this change cause?

A) Autoscale will start scaling in more frequently, which can cause flapping if CPU oscillates around 50% during periods of moderate load.

B) The scale out rule will stop working correctly because scale in and scale out rules internally share the same evaluation threshold.

C) The 20-minute cool down will be ignored after changing the threshold, as rule changes restart the autoscale evaluation cycle.

D) The minimum number of instances will be automatically recalculated by autoscale based on the new threshold, potentially reducing the instance floor below the configured value.


Answer Key and Explanations

Answer Key: Scenario 1

Answer: B

The log is conclusive: in both evaluations, the message Maximum instance count reached: 3 appears explicitly as the reason for blocking the action. Autoscale detected the need to scale, the rule was matched, but the profile's maximum limit prevented execution.
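The reading of the log described above can be sketched in Python. This is a minimal illustration that parses the exact log excerpt from Scenario 1; the function name `summarize` and the result keys are hypothetical, and the line format is assumed to match the excerpt rather than any official Azure export format.

```python
import re

# Log excerpt copied from Scenario 1.
LOG = """\
[2026-03-23 14:02] Autoscale evaluation triggered
[2026-03-23 14:02] Current instances: 3
[2026-03-23 14:02] Metric: CpuPercentage = 82%
[2026-03-23 14:02] Rule matched: scale out (threshold: 75%)
[2026-03-23 14:02] Maximum instance count reached: 3
[2026-03-23 14:02] Scale out action skipped
[2026-03-23 14:07] Autoscale evaluation triggered
[2026-03-23 14:07] Current instances: 3
[2026-03-23 14:07] Metric: CpuPercentage = 88%
[2026-03-23 14:07] Rule matched: scale out (threshold: 75%)
[2026-03-23 14:07] Maximum instance count reached: 3
[2026-03-23 14:07] Scale out action skipped
"""

# "[timestamp] message" is the assumed shape of every line.
LINE = re.compile(r"^\[(?P<ts>[^\]]+)\] (?P<msg>.*)$")


def summarize(log_text):
    """Group log lines by timestamp and report why each evaluation did not scale."""
    evaluations = {}
    for raw in log_text.splitlines():
        match = LINE.match(raw.strip())
        if match:
            evaluations.setdefault(match["ts"], []).append(match["msg"])

    results = []
    for ts, msgs in evaluations.items():
        # The number after the last colon is the configured maximum that was hit.
        max_reached = next(
            (int(m.rsplit(":", 1)[1]) for m in msgs
             if m.startswith("Maximum instance count reached")),
            None,
        )
        results.append({
            "timestamp": ts,
            "rule_matched": any(m.startswith("Rule matched: scale out") for m in msgs),
            "skipped": "Scale out action skipped" in msgs,
            "max_reached": max_reached,
        })
    return results


if __name__ == "__main__":
    for entry in summarize(LOG):
        print(entry)
```

Every evaluation where `rule_matched` and `skipped` are both true and `max_reached` is set points at the profile maximum, not at cool down or quota.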

The decisive clue is in the log, not in the contextual information. The migration from S1 to S2 and the absence of quota alerts are irrelevant details included on purpose. Tier migration doesn't reset autoscale configurations, and nothing in the logs indicates subscription quota as the limiting factor.

Distractor C represents the most dangerous diagnostic error: confusing cool down behavior with maximum limit blocking. Cool down controls the interval between actions, but the logs show evaluations at 5-minute intervals without any action being executed, which is consistent with limit blocking, not active cool down.

The consequence of acting based on distractor C would be adjusting the cool down without solving the real problem, wasting time while the application remains degraded.


Answer Key: Scenario 2

Answer: C

The cause is known and cannot be fixed at the source (the code). The problem is that reactive scale out will always arrive late because instances take 8 to 10 minutes to be ready. The only solution available within the constraints is to ensure sufficient instances are already running before the spike, by raising the minimum instances in the profile.
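The sizing reasoning above can be made concrete with a small calculation. This is a rough sketch only: the function names, the request rates, and the 5-minute detection window are hypothetical values chosen for illustration, not figures from the scenario (only the 8 to 10 minute warm-up and the maximum of 5 instances come from it).

```python
import math


def required_min_instances(peak_load, per_instance_capacity, max_instances):
    """Smallest pre-warmed instance count that covers peak load, capped at the profile maximum."""
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    needed = math.ceil(peak_load / per_instance_capacity)
    return min(max(needed, 1), max_instances)


def reactive_exposure_minutes(detection_minutes, warmup_minutes):
    """Time the app runs under-provisioned when it relies on reactive scale out."""
    return detection_minutes + warmup_minutes


if __name__ == "__main__":
    # Hypothetical: 900 req/s at peak, 250 req/s per instance, profile maximum of 5.
    print(required_min_instances(900, 250, 5))
    # Hypothetical 5-minute metric window plus the worst-case 10-minute warm-up.
    print(reactive_exposure_minutes(5, 10))
```

With a raised minimum, the exposure window disappears for any load up to the pre-warmed capacity, because no new instance has to initialize during the spike.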

Alternative A is a valid action in general, but doesn't solve the initialization delay problem: even if scale out is triggered earlier, each new instance still takes 8 to 10 minutes to be ready.

Alternative D would be the correct solution under normal conditions for predictable spikes, but the time constraint is critical: it's 11 AM on an active peak day. A recurring profile applicable "starting the next day" doesn't solve the immediate problem.

Alternative B confuses concepts: Always On keeps the application warm on existing instances, but doesn't affect the initialization time of new instances created by scale out.


Answer Key: Scenario 3

Answer: B

The Basic tier supports horizontal scaling, but only manually, without support for autoscale based on rules and metrics. The option to configure autoscale rules is only available from the Standard tier onwards.
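The tier rule above can be expressed as a simple lookup. This is an illustrative sketch, not an Azure API: the table covers only the tiers mentioned in this lab, and the names `ScaleSupport`, `TIER_SUPPORT`, and `can_autoscale` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleSupport:
    manual_scale_out: bool
    rule_based_autoscale: bool


# Only the tiers discussed in this lab; other tiers are intentionally omitted.
TIER_SUPPORT = {
    "Basic": ScaleSupport(manual_scale_out=True, rule_based_autoscale=False),
    "Standard": ScaleSupport(manual_scale_out=True, rule_based_autoscale=True),
    "Premium": ScaleSupport(manual_scale_out=True, rule_based_autoscale=True),
}


def can_autoscale(sku):
    """Return True if the tier family of a SKU such as 'Basic B2' supports autoscale rules."""
    family = sku.split()[0]  # "Basic B2" -> "Basic"
    if family not in TIER_SUPPORT:
        raise ValueError(f"tier not covered by this sketch: {family}")
    return TIER_SUPPORT[family].rule_based_autoscale


if __name__ == "__main__":
    for sku in ("Basic B2", "Standard S2", "Premium P2v3"):
        print(sku, can_autoscale(sku))
```

Checking the tier first, before RBAC, locks, or diagnostics, is the shortest path to the cause in this scenario.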

The information about Application Insights, Log Analytics, and resource locks is irrelevant and was included to lead the reader down incorrect diagnostic paths. Metrics collection by Application Insights has no relation to the availability of autoscale functionality in the portal.

Distractor A represents a classic false correlation error: associating the absence of a specific diagnostic destination with the unavailability of a feature that is actually restricted by tier.

Distractor D is technically false: the Contributor role has full permission to configure autoscale. The absence of the option in the portal is determined by the plan's tier, not by RBAC.

Acting based on distractor A would lead the administrator to configure unnecessary Diagnostic Settings without solving the real problem, and possibly open an unnecessary support ticket.


Answer Key: Scenario 4

Answer: A

Increasing the scale in threshold from 30% to 50% causes the instance reduction condition to be activated much more frequently, since the application's CPU typically stays between 40% and 60% under moderate load. This can create a flapping cycle: autoscale reduces instances when CPU drops below 50%, load redistributes and CPU rises again, triggering scale out, and the cycle repeats.
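The flapping cycle described above can be reproduced with a toy model. This is a deliberately simplified sketch, not how Azure Autoscale actually evaluates rules: it assumes a constant total CPU demand spread evenly across instances, ignores duration windows and cool down, and uses a hypothetical scale out threshold of 70% (Scenario 4 does not state one).

```python
def simulate(total_cpu_demand, instances, scale_in_below, scale_out_above,
             min_instances=1, max_instances=5, steps=6):
    """Return the instance count after each evaluation for a constant total demand."""
    history = [instances]
    for _ in range(steps):
        # Average CPU % per instance if the load is spread evenly.
        cpu = total_cpu_demand / instances
        if cpu > scale_out_above and instances < max_instances:
            instances += 1
        elif cpu < scale_in_below and instances > min_instances:
            instances -= 1
        history.append(instances)
    return history


if __name__ == "__main__":
    # Demand equal to 3 instances at 48% CPU each (144 "CPU points" in total).
    print("scale in below 50%:", simulate(144, 3, scale_in_below=50, scale_out_above=70))
    print("scale in below 30%:", simulate(144, 3, scale_in_below=30, scale_out_above=70))
```

In this model, three instances at 48% satisfy the 50% scale in condition, but the same load on two instances lands at 72% and triggers scale out again, so the count oscillates. With the original 30% threshold the count stays put, which is the trade-off the change introduces.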

Alternatives B, C, and D describe behaviors that don't exist in Azure Autoscale:

  • Scale in and scale out rules are independent and have separate thresholds that don't influence each other.
  • Changing a rule's threshold doesn't restart the evaluation cycle nor affect ongoing cool down.
  • The minimum number of instances is an explicit profile configuration and is never automatically recalculated by the autoscale mechanism.

The real impact of flapping goes beyond cost: instances being created and destroyed in rapid cycles can cause instability in long-duration connections and degrade user experience, even if the application has no local state.


Troubleshooting Tree: Configure scaling for an App Service plan

Color Legend:

Color        Node type
Dark blue    Initial symptom (entry point)
Medium blue  Diagnostic question
Red          Identified cause
Green        Recommended action or resolution
Orange       Intermediate verification or validation

To use this tree when facing a real problem, start with the root node identifying the observed symptom. At each question node, answer based on what is verifiable directly in the Azure portal or activity logs. Follow the path corresponding to the answer until reaching a cause or action node. At orange validation nodes, collect the indicated evidence before proceeding to the next branch. Never skip intermediate verification steps, as identical symptoms can have distinct causes depending on the path taken.