Technical Lab: Configure scaling for an App Service plan

Questions

Question 1 — Multiple Choice

A development team hosts an application on Azure App Service with a Standard S2 plan. The application traffic is predictable: high demand from Monday to Friday between 8 AM and 6 PM, and minimal demand during other periods. The team wants to reduce costs without compromising availability.

Which scaling approach is most suitable for this scenario?

A) Use manual scale out, adjusting the number of instances by hand every morning and evening.

B) Configure autoscale with schedule-based rules, increasing instances during peak hours and reducing them outside those hours.

C) Migrate to a Premium P1v3 plan and configure autoscale rules based on CPU metrics.

D) Use a script to scale up to an S3 plan during peak hours and return to S2 outside those hours.

Question 2 — Technical Scenario

An administrator configured the following autoscale rule in an App Service plan:

Metric:     CPU Percentage
Operator:   Greater than
Threshold:  70%
Duration:   5 minutes
Action:     Increase count by 1
Cool down:  5 minutes

During a load spike, logs show that CPU remained above 70% for 20 minutes, but the number of instances increased only once, from 2 to 3, and didn't scale further. The plan is configured with a maximum of 3 instances.

What is the cause of the observed behavior?

A) The cool down period of 5 minutes blocked all new evaluations after the first scale out.

B) The rule increases the count by only one instance per event, and the maximum instance limit defined in the autoscale profile has already been reached.

C) Autoscale only executes one action per 5-minute evaluation cycle, regardless of the maximum limit.

D) The 5-minute metric duration is insufficient to detect prolonged spikes, causing suppression of subsequent actions.

Question 3 — True or False

An App Service plan on the Free (F1) tier supports configuration of autoscale rules based on metrics, as long as the maximum number of instances is set to 1.

Question 4 — Technical Scenario

A production web application slows down during seasonal high-traffic events. The team configures autoscale, but after deployment they observe that new instances are added too late: the application has already degraded before the new instances are ready.

The architect suspects the problem is in the scaling trigger configuration. See the current profile:

Metric:     CPU Percentage
Threshold:  85%
Duration:   10 minutes
Action:     Increase count by 1

Which adjustment resolves the problem most directly?

A) Reduce the CPU threshold to a lower value, like 60%, so the trigger is activated before saturation.

B) Increase the minimum number of instances in the autoscale profile to reduce dependence on reactive scaling.

C) Replace the CPU metric with Http Queue Length to detect overload before CPU saturation.

D) Reduce the duration from 10 to 1 minute so the trigger reacts faster to short-duration spikes.

Question 5 — Multiple Choice

When configuring autoscale in Azure App Service, an administrator defines both a recurring profile for weekends and a default profile with metric-based rules.

How does Azure determine which profile to apply at a given moment?

A) The profile with the highest maximum instance count configured always takes precedence over others.

B) The default profile is always applied as a base; recurring and fixed-date profiles override the default when their conditions are met.

C) Both profiles are evaluated simultaneously, and Azure applies the average of instances defined in each one.

D) The most recently created profile takes precedence over previous ones, regardless of type.

Answer Key and Explanations

Answer Key — Question 1

Answer: B

Autoscale with schedule-based rules is the ideal solution when the load pattern is known and predictable. It allows defining minimum and maximum instances for specific time windows, ensuring capacity before peaks and cost reduction outside of them, without manual intervention.

The main error in the distractors lies in confusing types of scaling:

Alternative A describes a manual process, which is operationally burdensome and prone to human error.
Alternative C combines a tier change (scale up) with metric-based autoscale, which would be a reactive and unnecessarily expensive response to a predictable pattern.
Alternative D attempts to use scale up as a substitute for scale out, ignoring that tier changes are not autoscale operations and may restart the apps in the plan.
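
The schedule in the scenario can be sketched as a simple lookup. This is only an illustration of the idea behind a schedule-based profile, not Azure's API: the function name `target_instances` and the instance counts (4 at peak, 1 off-peak) are assumptions, since the scenario does not state them.

```python
from datetime import datetime

# Hypothetical instance counts; the scenario does not specify them.
PEAK_INSTANCES = 4
OFF_PEAK_INSTANCES = 1


def target_instances(now: datetime) -> int:
    """Return the instance count a schedule-based profile would set for `now`."""
    is_weekday = now.weekday() < 5        # Monday=0 ... Friday=4
    in_peak_hours = 8 <= now.hour < 18    # 8 AM inclusive to 6 PM exclusive
    if is_weekday and in_peak_hours:
        return PEAK_INSTANCES
    return OFF_PEAK_INSTANCES


print(target_instances(datetime(2024, 1, 8, 9, 0)))    # a Monday morning
print(target_instances(datetime(2024, 1, 13, 9, 0)))   # a Saturday morning
```

In a real autoscale setting, the same intent is usually expressed as a recurring profile for the weekday window plus a default profile for everything else.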

Answer Key — Question 2

Answer: B

The behavior is exactly as expected: autoscale increased from 2 to 3 instances and stopped because the maximum limit defined in the profile is 3. The autoscale mechanism never exceeds the limits configured in the profile, regardless of load.

Distractor A represents a common misconception: the cool down controls the minimum interval between consecutive actions, but doesn't prevent evaluations. After the 5-minute cool down, autoscale would evaluate again and attempt to scale, but the maximum limit had already been reached.

The important distinction is that cool down and the maximum instance limit are independent controls: cool down only spaces out consecutive actions, while the maximum is a hard cap that no scaling rule can exceed.
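
The interaction between cool down and the maximum limit can be reproduced with a small simulation. This is a deliberately simplified model, assuming one evaluation per minute and a plain trailing average; the real autoscale engine uses its own time grains and aggregation, and the function `simulate` is purely illustrative.

```python
def simulate(cpu_by_minute, start=2, maximum=3,
             threshold=70, duration=5, cooldown=5):
    """Return a list of (minute, new_count) for each scale-out in a toy model."""
    count = start
    last_scale = None
    events = []
    for minute in range(duration, len(cpu_by_minute) + 1):
        window = cpu_by_minute[minute - duration:minute]
        breached = sum(window) / duration > threshold
        # Cool down only delays the next action; evaluation keeps running.
        cooled = last_scale is None or minute - last_scale >= cooldown
        # The maximum is checked independently and always wins.
        if breached and cooled and count < maximum:
            count += 1
            last_scale = minute
            events.append((minute, count))
    return events


spike = [90] * 20                      # CPU above 70% for 20 minutes
print(simulate(spike))                 # maximum of 3, as in the scenario
print(simulate(spike, maximum=5))      # hypothetical higher cap
```

With the scenario's maximum of 3, the model scales out once and then stops even though the rule keeps being breached. With the hypothetical cap of 5, it scales again after each cool down, which shows that the cool down was never the blocking factor.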

Answer Key — Question 3

Answer: False

Neither the Free (F1) tier nor the Shared (D1) tier supports autoscale; both run on shared infrastructure and cannot scale out at all. Manual scale out becomes available with the Basic tier (B1 and higher), and autoscale requires the Standard tier or higher.

The statement presents a condition that seems reasonable (limiting to 1 instance), but the restriction is not about the number of instances: it's about the plan tier itself. Understanding this limit avoids configuration attempts that the portal simply won't allow.
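
As a quick reference, the tier restriction can be written as a lookup table. This is a sketch under the assumption that tiers fall into three groups (no scale out, manual scale out only, autoscale); the names `SCALE_OUT_SUPPORT` and `supports_autoscale` are illustrative and not part of any Azure SDK.

```python
# Scale-out capability per tier family (assumed grouping, for illustration).
SCALE_OUT_SUPPORT = {
    "Free": "none",
    "Shared": "none",
    "Basic": "manual",
    "Standard": "autoscale",
    "Premium": "autoscale",
}


def supports_autoscale(tier: str) -> bool:
    """Return True if the tier family allows autoscale rules."""
    return SCALE_OUT_SUPPORT[tier] == "autoscale"


for tier in SCALE_OUT_SUPPORT:
    print(tier, supports_autoscale(tier))
```

The lookup depends only on the tier, never on the requested instance count, which is exactly why the statement in Question 3 is false.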

Answer Key — Question 4

Answer: A

Reducing the CPU threshold to a value such as 60% makes the trigger fire earlier, before the application is saturated. This leaves time for new instances to be provisioned and warmed up before users notice any degradation.

Alternative B is also valid as a complementary practice (increasing minimum instances reduces dependence on reactive scaling), but doesn't solve the trigger timing problem, which is the central point of the scenario.

Alternative D is technically dangerous: a 1-minute duration makes autoscale hypersensitive to transient spikes, causing flapping (scaling out and back in repeatedly in rapid cycles), which increases costs and can destabilize the application.

Alternative C could be useful in specific scenarios, but Http Queue Length is a secondary metric and not necessarily the best indicator for the described problem.
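
The effect of the threshold on trigger timing can be estimated with a rough calculation. The load shape below (CPU rising 2 points per minute from 40% until it saturates) is an invented example, and `first_trigger_minute` is a simplified stand-in for the real evaluation logic, which aggregates metrics differently.

```python
def first_trigger_minute(cpu_by_minute, threshold, duration):
    """Return the first minute whose trailing average exceeds the threshold."""
    for minute in range(duration, len(cpu_by_minute) + 1):
        window = cpu_by_minute[minute - duration:minute]
        if sum(window) / duration > threshold:
            return minute
    return None  # the rule never fires for this load


# Hypothetical load: 40% CPU rising 2 points per minute, capped at 100%.
ramp = [min(100, 40 + 2 * m) for m in range(60)]

print(first_trigger_minute(ramp, threshold=85, duration=10))  # 29
print(first_trigger_minute(ramp, threshold=60, duration=10))  # 16
print(first_trigger_minute(ramp, threshold=85, duration=1))   # 24
```

On this particular ramp, the original profile fires at minute 29, just before CPU saturates. Lowering the threshold to 60% fires at minute 16, leaving more lead time for provisioning than shortening the duration to 1 minute, which fires at minute 24. Other load shapes would give different numbers.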

Answer Key — Question 5

Answer: B

Azure Autoscale applies profiles in a defined precedence hierarchy: fixed-date profiles have maximum priority, followed by recurring profiles, and lastly the default profile, which acts as a fallback when no other profile is active.

The default profile never competes with others: it is simply replaced when a more specific profile is in effect. This means the metric rules defined in the default profile will not be evaluated during a weekend covered by a recurring profile, unless the recurring profile also contains its own rules.

Distractors A, C, and D describe behaviors that do not exist in Azure. Profiles are never averaged together, and precedence does not depend on creation date or on the configured maximum instance count.
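
The precedence order can be sketched as a priority lookup. This is a minimal model of the selection logic described above, not Azure's implementation: the `Profile` class, the `kind` labels, and the example profiles (including the fixed-date "launch-day" profile) are hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Callable, List

# Lower number wins: fixed-date, then recurring, then default.
PRIORITY = {"fixed": 0, "recurring": 1, "default": 2}


@dataclass
class Profile:
    name: str
    kind: str                               # "fixed", "recurring", or "default"
    is_active: Callable[[datetime], bool]   # does the profile cover `now`?


def select_profile(profiles: List[Profile], now: datetime) -> Profile:
    """Pick the single profile in effect at `now`; the default is the fallback."""
    active = [p for p in profiles if p.is_active(now)]
    return min(active, key=lambda p: PRIORITY[p.kind])


profiles = [
    Profile("default", "default", lambda now: True),
    Profile("weekend", "recurring", lambda now: now.weekday() >= 5),
    Profile("launch-day", "fixed", lambda now: now.date() == date(2024, 1, 20)),
]

print(select_profile(profiles, datetime(2024, 1, 8, 12, 0)).name)   # Monday
print(select_profile(profiles, datetime(2024, 1, 13, 12, 0)).name)  # Saturday
print(select_profile(profiles, datetime(2024, 1, 20, 12, 0)).name)  # fixed date
```

Only the selected profile's rules and limits apply at that moment, so the default profile's metric rules are inactive whenever a recurring or fixed-date profile is chosen.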
