
Theoretical Foundation: Manage Sizing and Scaling for Containers, Including Azure Container Instances and Azure Container Apps


1. Initial Intuition​

Imagine you have a restaurant. Each dish you serve is prepared in an isolated and fully equipped kitchen. When a customer arrives, you open a kitchen, prepare the dish, and close the kitchen. When 100 customers arrive at the same time, you open 100 kitchens in parallel.

Containers work like this: they are isolated and complete environments that package an application with everything it needs to run. What changes between Azure services is who manages these kitchens and at what level of control you operate.

Sizing is deciding the size of each kitchen: how much CPU and memory each container needs to function properly.

Scaling is deciding how many kitchens to operate simultaneously: one when there's low demand, one hundred when there's a peak.

In Azure, two services stand out for this purpose:

  • Azure Container Instances (ACI): you request a container and Azure delivers it in seconds. Full control over sizing, no automatic scaling. Ideal for one-time tasks.
  • Azure Container Apps (ACA): a managed platform that handles automatic scaling based on metrics, events, or simply traffic demand. Ideal for long-running applications.

2. Context​

Where these services fit in Azure's container ecosystem​


The choice between these services primarily depends on three factors: desired level of control, ability to scale automatically, and acceptable operational complexity.

Why sizing and scaling matter​

Inadequate sizing wastes money (over-provisioned) or causes failures (under-provisioned). A container with 0.5 vCPU trying to process intensive requests will be slow or fail. A container with 4 vCPUs for a simple reading task generates unnecessary cost.

Inadequate scaling means running out of capacity at peak (users receive errors) or paying for idle capacity in valleys (unnecessary cost). The goal of scaling is to maintain the balance between performance and cost over time.


3. Building the Concepts​

3.1 Azure Container Instances (ACI)​

ACI is Azure's simplest container service. You specify a container image and the necessary resources, and Azure runs the container directly, without needing to manage VMs, clusters, or orchestrators.

Container Groups​

The fundamental unit of ACI is the Container Group: a set of containers that are scheduled on the same host, sharing the same lifecycle, local network, and storage.


Sizing in ACI is defined per container within the group. The Container Group allocates the sum of all resources from the containers it contains.
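As a minimal sketch of that summation, the snippet below adds up per-container requests for a two-container group. The `ContainerRequest` type, the `group_allocation` helper, and the sample values are illustrative assumptions, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass
class ContainerRequest:
    """Per-container resource request (hypothetical helper type)."""
    name: str
    cpu: float
    memory_gb: float


def group_allocation(containers):
    """Return (total vCPU, total memory in GB) allocated for the whole group."""
    total_cpu = sum(c.cpu for c in containers)
    total_memory = sum(c.memory_gb for c in containers)
    return total_cpu, total_memory


# Illustrative group: a web container plus a logging sidecar.
group = [
    ContainerRequest("app-web", cpu=1.0, memory_gb=2.0),
    ContainerRequest("sidecar-logger", cpu=0.5, memory_gb=0.5),
]
print(group_allocation(group))
```

The group is what gets scheduled and billed, so a small sidecar still adds to the total.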

Sizing in ACI: limits and configurations​

Resources are configured individually per container:

| Resource | Minimum | Maximum (Linux) | Maximum (Windows) |
|---|---|---|---|
| vCPU | 0.1 | 4 | 4 |
| Memory (GB) | 0.1 | 16 | 14 |
| GPU | Optional | 4 K80 / 2 V100 | Not supported |

Sizing rules in ACI:

  • CPU must be a number with up to one decimal place (0.1, 0.5, 1.0, 2.5, 4.0)
  • Memory must be a multiple of 0.1 GB
  • The combination of CPU and memory must follow the maximum ratio of 1:4 (1 vCPU for up to 4 GB of memory)
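The rules above can be checked before a deployment is attempted. The following is a rough sketch that encodes only the limits stated in this section (Linux maximums); `check_aci_sizing` is a hypothetical helper, and the real service may enforce additional regional limits.

```python
from decimal import Decimal

# Limits copied from the table and rules above (Linux container groups).
MIN_CPU, MAX_CPU = Decimal("0.1"), Decimal("4")
MIN_MEM_GB, MAX_MEM_GB = Decimal("0.1"), Decimal("16")
MAX_GB_PER_VCPU = Decimal("4")
STEP = Decimal("0.1")


def check_aci_sizing(cpu, memory_gb):
    """Return a list of rule violations; an empty list means the request passes."""
    # Decimal avoids binary floating-point surprises when testing multiples of 0.1.
    cpu_d, mem_d = Decimal(str(cpu)), Decimal(str(memory_gb))
    problems = []
    if not MIN_CPU <= cpu_d <= MAX_CPU:
        problems.append("cpu out of range")
    if cpu_d % STEP != 0:
        problems.append("cpu has more than one decimal place")
    if not MIN_MEM_GB <= mem_d <= MAX_MEM_GB:
        problems.append("memory out of range")
    if mem_d % STEP != 0:
        problems.append("memory is not a multiple of 0.1 GB")
    if mem_d > cpu_d * MAX_GB_PER_VCPU:
        problems.append("memory exceeds 4 GB per vCPU")
    return problems


print(check_aci_sizing(2, 4))
print(check_aci_sizing(0.5, 4))
```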

Restart policy types in ACI​

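ACI has three restart policies. The policy determines whether a container group keeps running after its process exits, and therefore how long its allocated CPU and memory are billed: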

| Policy | Behavior | Use case |
|---|---|---|
| Always | Automatically restarts if the container stops | Long-running services |
| Never | Runs once and doesn't restart | One-time jobs, initialization scripts |
| OnFailure | Restarts only if it terminates with an error | Batch processing with retry |

3.2 Azure Container Apps (ACA)​

ACA is a serverless platform for containers that abstracts the underlying infrastructure (based on Kubernetes and KEDA underneath) and offers sophisticated automatic scaling.

ACA Hierarchy​


Container Apps Environment is the logical boundary that groups multiple Container Apps. Resources like the VNet, Log Analytics Workspace, and Dapr configuration are shared at the Environment level.

Container App is the deployment unit: an application composed of one or more revisions (immutable versions of the deployment).

Replica is a running instance of a Container App. Scaling increases or decreases the number of replicas.

Sizing in ACA: CPU and Memory​

In ACA, sizing is defined per Container App and follows predefined combinations:

| vCPU | Available Memory |
|---|---|
| 0.25 | 0.5 Gi |
| 0.5 | 1.0 Gi |
| 0.75 | 1.5 Gi |
| 1.0 | 2.0 Gi |
| 1.25 | 2.5 Gi |
| 1.5 | 3.0 Gi |
| 1.75 | 3.5 Gi |
| 2.0 | 4.0 Gi |

The ratio is always 1 vCPU to 2 Gi of memory. You cannot combine freely like in ACI.
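Because the ratio is fixed, the memory value follows directly from the vCPU value. The sketch below derives it for the combinations listed in the table; `aca_memory_for_cpu` is a hypothetical helper that assumes the 0.25 vCPU step and the 2.0 vCPU ceiling shown above.

```python
def aca_memory_for_cpu(vcpu):
    """Return the memory string paired with a vCPU value from the table above."""
    quarters = vcpu * 4
    # Only multiples of 0.25 between 0.25 and 2.0 appear in the table.
    if quarters != int(quarters) or not 0.25 <= vcpu <= 2.0:
        raise ValueError(f"unsupported vCPU value: {vcpu}")
    return f"{vcpu * 2:.1f}Gi"


print(aca_memory_for_cpu(0.5))
print(aca_memory_for_cpu(1.75))
```

The returned string uses the same `Gi` suffix that the CLI and Bicep examples in this document pass as the memory value.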

Scaling in ACA: the four trigger types​

ACA scales on four kinds of signal (HTTP traffic, CPU utilization, memory utilization, and external events), all configurable without changing application code:


HTTP Scaling: scales based on the number of concurrent HTTP requests per replica. It's the simplest and most used trigger for APIs and web apps. When concurrentRequests exceeds the threshold per replica, ACA adds more replicas.

CPU/Memory Scaling: scales based on resource utilization. Useful for processing-intensive workloads where the load manifests as CPU or memory usage.

Event-Driven Scaling (KEDA): scales based on external metrics. For example, when a Service Bus queue has more than 100 pending messages, ACA scales out to drain it faster; when the queue is empty, it scales in to zero replicas.
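The HTTP and event-driven rules share the same basic arithmetic: divide the observed load by the per-replica target, round up, and clamp to the configured replica range. The function below is a simplified sketch of that idea with illustrative names; the actual scaler also applies polling intervals and step limits that are not modeled here.

```python
import math


def desired_replicas(load, target_per_replica, min_replicas, max_replicas):
    """Approximate the replica count for a given load and per-replica target."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(load / target_per_replica)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))


# 120 concurrent requests with a target of 50 per replica.
print(desired_replicas(120, 50, min_replicas=1, max_replicas=10))
# An empty queue with minReplicas 0 allows scale-to-zero.
print(desired_replicas(0, 10, min_replicas=0, max_replicas=20))
```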

Scale-to-Zero​

ACA supports scale-to-zero: when there's no demand, the number of replicas can be reduced to zero, completely eliminating compute costs. The first request after zero replicas has a small cold start latency.

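Exception: apps that scale only on CPU or memory rules cannot scale to zero, because with no replicas running there is no utilization to measure. HTTP and event-driven rules can scale to zero, including behind external ingress; the first request then waits for a replica to start. To shield end users from that cold start, set minReplicas to 1 or higher.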


4. Structural View​

Architectural comparison: ACI vs. ACA​


ACA scaling decision flow​


5. Practical Operation​

Container Instance lifecycle (ACI)​


Revision lifecycle in ACA​

ACA uses the concept of revisions: each deployment generates a new immutable revision. Scaling applies to a specific revision.


Non-obvious behaviors​

In ACI, resizing resources requires recreating the container. It's not possible to change CPU or memory of a running ACI container. You need to stop, delete, and recreate with the new values. Plan adequate sizing before production deployment.

In ACA, scale-in waits for a cooldown period. To avoid flapping (repeated scale-out and scale-in within short periods), ACA delays scale-in: by default, the cooldown is 300 seconds. After a load drop, extra replicas can therefore remain for up to 5 minutes before being removed.
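The effect of the stabilization period can be sketched as a small decision function. This is an illustration under simplifying assumptions, not the platform's actual algorithm: the names are hypothetical, and the cooldown is passed in as `cooldown_s` rather than hard-coded.

```python
def replicas_after_cooldown_check(current, target, seconds_since_last_trigger, cooldown_s):
    """Return the replica count to run, holding back scale-in during the cooldown."""
    if target >= current:
        # Scale-out (or no change) is not delayed in this sketch.
        return target
    if seconds_since_last_trigger < cooldown_s:
        # Load dropped recently: keep the extra replicas for now.
        return current
    return target
```

Calling it with a `seconds_since_last_trigger` shorter than `cooldown_s` returns the current replica count unchanged, which is why extra replicas linger after a traffic spike ends.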

ACA charges for vCPU and memory consumed, not for declared replicas. If you declare maxReplicas: 10 but there are only 2 active replicas, you pay for 2 replicas. The cost is proportional to actual usage, not the configured maximum.
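A back-of-the-envelope version of this billing model is sketched below. The rates are placeholder parameters, not real Azure prices, and the function ignores free grants and idle-rate pricing; note that maxReplicas does not appear in the formula at all.

```python
def estimate_compute_cost(active_replicas, vcpu, memory_gi, seconds,
                          vcpu_rate_per_s, gi_rate_per_s):
    """Estimate compute cost from replicas that actually ran.

    vcpu_rate_per_s and gi_rate_per_s are hypothetical per-second prices
    supplied by the caller.
    """
    per_replica_second = vcpu * vcpu_rate_per_s + memory_gi * gi_rate_per_s
    return active_replicas * per_replica_second * seconds


# Two active replicas of 0.5 vCPU / 1.0 Gi for 10 seconds at made-up rates.
print(estimate_compute_cost(2, 0.5, 1.0, 10, vcpu_rate_per_s=2.0, gi_rate_per_s=1.0))
```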

ACI Container Groups in a VNet don't have a public IP. When you deploy a Container Group into a VNet (with subnet delegation), it receives only a private IP. External access must go through a Load Balancer or Application Gateway in the same VNet.

Scale-to-zero in ACA has a cold start, but it can be reduced. Typical cold start is 2 to 10 seconds depending on image size and application initialization time. To minimize it, use small images (alpine-based) and keep application startup work short.


6. Implementation Methods​

Azure Portal​

ACI through portal:

  1. Portal > Container Instances > + Create
  2. Select subscription, RG, name, region, OS (Linux/Windows)
  3. Select image (Docker Hub, ACR, or other registry)
  4. Configure: CPU cores and Memory (GB)
  5. Configure networking: public IP, DNS label, ports
  6. Configure restart policy and environment variables
  7. Review + Create

ACA through portal:

  1. Portal > Container Apps > + Create
  2. Create or select Container Apps Environment
  3. Configure Container App: name, image, CPU and memory
  4. Configure Ingress (HTTP/HTTPS, external/internal)
  5. Configure Scale: minReplicas, maxReplicas
  6. Add scaling rules (HTTP, CPU, event)
  7. Review + Create

Azure CLI​

# ACI: Create Container Instance with specific sizing
az container create \
--resource-group "rg-containers" \
--name "aci-processador" \
--image "myregistry.azurecr.io/processador:v1.0" \
--cpu 2 \
--memory 4 \
--restart-policy OnFailure \
--environment-variables BATCH_SIZE=100 \
--registry-login-server "myregistry.azurecr.io" \
--registry-username "<username>" \
--registry-password "<password>" \
--location "brazilsouth"

# ACI: Create Container Group with multiple containers (via YAML)
cat > container-group.yaml << EOF
apiVersion: 2021-10-01
name: container-group-prod
type: Microsoft.ContainerInstance/containerGroups
location: brazilsouth
properties:
  containers:
  - name: app-web
    properties:
      image: myregistry.azurecr.io/webapp:v1.0
      resources:
        requests:
          cpu: 1.0
          memoryInGb: 2.0
      ports:
      - port: 8080
  - name: sidecar-logger
    properties:
      image: myregistry.azurecr.io/logger:v1.0
      resources:
        requests:
          cpu: 0.5
          memoryInGb: 0.5
  osType: Linux
  restartPolicy: Always
  ipAddress:
    type: Public
    ports:
    - protocol: tcp
      port: 8080
EOF

az container create \
--resource-group "rg-containers" \
--file container-group.yaml

# ACI: View current state and resource usage
az container show \
--resource-group "rg-containers" \
--name "aci-processador" \
--query "{State: instanceView.state, CPU: containers[0].resources.requests.cpu, Memory: containers[0].resources.requests.memoryInGb}" \
--output json

# ACI: View container logs
az container logs \
--resource-group "rg-containers" \
--name "aci-processador"

# ACI: Delete (necessary for resize)
az container delete \
--resource-group "rg-containers" \
--name "aci-processador" \
--yes

# ACA: Create Environment
az containerapp env create \
--name "env-producao" \
--resource-group "rg-containers" \
--location "brazilsouth" \
--logs-workspace-id "<log-analytics-workspace-id>" \
--logs-workspace-key "<workspace-key>"

# ACA: Create Container App with HTTP scaling
az containerapp create \
--name "api-gateway" \
--resource-group "rg-containers" \
--environment "env-producao" \
--image "myregistry.azurecr.io/api-gateway:v1.0" \
--cpu 0.5 \
--memory 1.0Gi \
--min-replicas 1 \
--max-replicas 10 \
--ingress external \
--target-port 8080 \
--registry-server "myregistry.azurecr.io" \
--registry-username "<username>" \
--registry-password "<password>"

# ACA: Add HTTP scaling rule
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--scale-rule-name "http-scaling" \
--scale-rule-type "http" \
--scale-rule-http-concurrency 50

# ACA: Create Container App with scale-to-zero (queue processor)
az containerapp create \
--name "queue-processor" \
--resource-group "rg-containers" \
--environment "env-producao" \
--image "myregistry.azurecr.io/queue-processor:v1.0" \
--cpu 1.0 \
--memory 2.0Gi \
--min-replicas 0 \
--max-replicas 20

# ACA: Add Azure Service Bus Queue scaling rule
az containerapp update \
--name "queue-processor" \
--resource-group "rg-containers" \
--scale-rule-name "servicebus-scaling" \
--scale-rule-type "azure-servicebus" \
--scale-rule-metadata "queueName=fila-trabalho" "namespace=mynamespace" "messageCount=10" \
--scale-rule-auth "connection=servicebus-connection-secret"

# ACA: Update sizing (CPU and memory)
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--cpu 1.0 \
--memory 2.0Gi

# ACA: Update min/max replicas
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--min-replicas 2 \
--max-replicas 15

# ACA: View current number of replicas
az containerapp replica list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table

# ACA: View revisions and which is active
az containerapp revision list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table

Bicep​

// ACI: Container Instance with sizing
resource containerInstance 'Microsoft.ContainerInstance/containerGroups@2021-10-01' = {
name: 'aci-processador'
location: 'brazilsouth'
properties: {
containers: [
{
name: 'processador'
properties: {
image: 'myregistry.azurecr.io/processador:v1.0'
resources: {
requests: {
cpu: 2
memoryInGB: 4
}
}
environmentVariables: [
{ name: 'BATCH_SIZE', value: '100' }
]
}
}
]
osType: 'Linux'
restartPolicy: 'OnFailure'
ipAddress: {
type: 'Public'
ports: []
}
}
}

// ACA Environment
resource environment 'Microsoft.App/managedEnvironments@2023-05-01' = {
name: 'env-producao'
location: 'brazilsouth'
properties: {
appLogsConfiguration: {
destination: 'log-analytics'
logAnalyticsConfiguration: {
customerId: logAnalyticsWorkspaceId
sharedKey: logAnalyticsKey
}
}
}
}

// ACA: Container App with HTTP and CPU scaling
resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
name: 'api-gateway'
location: 'brazilsouth'
properties: {
managedEnvironmentId: environment.id
configuration: {
ingress: {
external: true
targetPort: 8080
transport: 'http'
}
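// registryPassword is assumed to be a @secure() parameter declared elsewhere, like registryUsername
secrets: [
{
name: 'registry-password'
value: registryPassword
}
]
registries: [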
{
server: 'myregistry.azurecr.io'
username: registryUsername
passwordSecretRef: 'registry-password'
}
]
}
template: {
containers: [
{
name: 'api-gateway'
image: 'myregistry.azurecr.io/api-gateway:v1.0'
resources: {
cpu: json('0.5')
memory: '1.0Gi'
}
env: [
{ name: 'ENVIRONMENT', value: 'production' }
]
}
]
scale: {
minReplicas: 1
maxReplicas: 10
rules: [
{
name: 'http-scaling'
http: {
metadata: {
concurrentRequests: '50'
}
}
}
{
name: 'cpu-scaling'
custom: {
type: 'cpu'
metadata: {
type: 'Utilization'
value: '70'
}
}
}
]
}
}
}
}

7. Control and Security​

Resource isolation between Container Apps​

In ACA, containers within the same Environment share the internal network and can communicate via service name. To isolate Container Apps from one another:

  • Use separate Environments for complete network isolation
  • Use VNet integration so the Environment is injected into an existing VNet
  • Configure internal Ingress for Container Apps that shouldn't be accessible externally

Managed Identity for image pull​

# Assign Managed Identity to Container App
az containerapp identity assign \
--name "api-gateway" \
--resource-group "rg-containers" \
--system-assigned

# Grant pull permission on ACR
PRINCIPAL_ID=$(az containerapp show \
--name "api-gateway" \
--resource-group "rg-containers" \
--query "identity.principalId" -o tsv)

az role assignment create \
--assignee "$PRINCIPAL_ID" \
--role "AcrPull" \
--scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ContainerRegistry/registries/myregistry"

# Update Container App to use Managed Identity for registry
az containerapp registry set \
--name "api-gateway" \
--resource-group "rg-containers" \
--server "myregistry.azurecr.io" \
--identity system

Secrets in ACA​

Credentials and sensitive configurations should be managed as secrets, not plain text environment variables:

# Create secret in Container App
az containerapp secret set \
--name "api-gateway" \
--resource-group "rg-containers" \
--secrets "db-password=<database-password>"

# Reference secret in environment variable
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--set-env-vars "DB_PASSWORD=secretref:db-password"

8. Decision Making​

ACI vs. ACA​

| Situation | Choice | Reason |
|---|---|---|
| Processing job that runs 10 minutes per day | ACI with restart policy Never | Serverless per execution, no management overhead |
| API that needs to scale from 0 to 100 instances with traffic | ACA with HTTP scaling | Managed automatic scaling, scale-to-zero possible |
| Queue processing that grows with messages | ACA with KEDA event scaling | Scales exactly proportionally to backlog |
| CI/CD task that consumes image and runs tests | ACI | Simple, fast for one-time tasks |
| Production web application with availability SLA | ACA with minReplicas >= 1 | Ensures there's always at least one active replica |
| Workload that can never have cold start | ACA with minReplicas >= 1 | Scale-to-zero introduces latency on first request |
| Container with GPU for machine learning | ACI | ACA doesn't support GPU; ACI does |
| Multiple containers with communication on same machine | ACI Container Group | Shared localhost network, sidecar pattern |

Sizing: how to define CPU and memory​

| Workload type | Recommended CPU | Recommended Memory |
|---|---|---|
| Simple REST API, I/O bound | 0.25-0.5 vCPU | 0.5-1.0 Gi |
| API with light processing | 0.5-1.0 vCPU | 1.0-2.0 Gi |
| Data processing / compute-intensive | 1.0-2.0 vCPU | 2.0-4.0 Gi |
| Machine Learning inference | 2.0-4.0 vCPU | 4.0-8.0 Gi |
| Batch data transformation job | 1.0-4.0 vCPU | 2.0-8.0 Gi |

Scaling configuration in ACA​

| Requirement | Configuration |
|---|---|
| Can never go offline | minReplicas: 1 or more |
| Can have acceptable cold start | minReplicas: 0 (scale-to-zero) |
| Variable but predictable traffic | HTTP scaling with adjusted concurrentRequests |
| Asynchronous queue processing | KEDA event scaling with messageCount threshold |
| Guaranteed baseline + additional scaling | minReplicas: 2, maxReplicas: 20 |

9. Best Practices​

Define resource limits based on real load testing, not estimates. Proper sizing requires profiling: run your application under representative load and measure CPU and memory usage. Sizing by intuition usually results in over-provisioning (extra cost) or under-provisioning (failures).

In ACA, start with minReplicas: 1 for production. Scale-to-zero saves money but adds cold start latency. For public APIs, the first user's experience after an inactive period will be degraded. Evaluate the trade-off between cost and user experience.

Use ACA revisions for safe deployments. ACA supports traffic splitting between revisions (e.g., 90% on current revision, 10% on new). This enables blue/green deployments or canary releases without additional infrastructure.

# Configure traffic splitting between revisions
az containerapp ingress traffic set \
--name "api-gateway" \
--resource-group "rg-containers" \
--revision-weight "api-gateway--revision1=90" "api-gateway--revision2=10"
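When traffic weights are generated by a script, it helps to validate them before calling the CLI. The helper below is a hypothetical sketch that assumes weights are whole-number percentages that should total 100, and produces the `revision=weight` strings passed to `--revision-weight` above.

```python
def revision_weight_args(weights):
    """Build 'revision=weight' arguments from a {revision_name: percent} mapping."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must not be negative")
    if sum(weights.values()) != 100:
        raise ValueError("weights should add up to 100")
    return [f"{revision}={weight}" for revision, weight in weights.items()]


args = revision_weight_args({
    "api-gateway--revision1": 90,
    "api-gateway--revision2": 10,
})
print(args)
```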

Configure conservative memory limits to detect memory leaks. If a container consistently uses 90%+ of configured memory, either sizing needs to be increased or there's a memory leak that needs investigation. Monitor memory usage over time.
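One simple way to act on this is to flag a container only when memory stays above the threshold for several consecutive samples, so that brief spikes are ignored. The sketch below assumes memory samples have already been collected (for example from metrics); the function name, the default window of 5 samples, and the 0.9 threshold are illustrative choices.

```python
def sustained_high_memory(samples_gi, limit_gi, threshold=0.9, window=5):
    """Return True if the last `window` samples are all at or above the threshold."""
    if len(samples_gi) < window:
        return False  # not enough data to call it sustained
    recent = samples_gi[-window:]
    return all(sample / limit_gi >= threshold for sample in recent)


# Memory climbing toward a 1.0 Gi limit and staying there.
print(sustained_high_memory([0.5, 0.92, 0.93, 0.95, 0.96, 0.97], limit_gi=1.0))
```

A `True` result does not distinguish a leak from genuine under-sizing; it only marks the container for investigation.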

For ACI in production, use container groups with VNet integration. ACI with a direct public IP is suitable for testing. In production, deploy into a VNet for network access control and use Private DNS.

In ACA, define maxReplicas based on backend capacity. If your Container App depends on a database with a maximum of 100 connections, keep maxReplicas at or below that limit divided by the connections each replica opens. 50 replicas with 3 connections each = 150 connections, which overloads the database; with 3 connections per replica, the ceiling is 33 replicas.
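The arithmetic behind that limit can be written down directly. In this sketch, `max_replicas_for_backend` is a hypothetical helper and `reserved` is an assumed allowance for connections used by other clients such as admin tools or migrations.

```python
def max_replicas_for_backend(max_connections, connections_per_replica, reserved=0):
    """Largest replica count whose combined connections fit within the backend limit."""
    if connections_per_replica <= 0:
        raise ValueError("connections_per_replica must be positive")
    usable = max_connections - reserved
    return max(0, usable // connections_per_replica)


# Database with 100 connections, 3 connections per replica.
print(max_replicas_for_backend(100, 3))
# Same database with 10 connections held back for other clients.
print(max_replicas_for_backend(100, 3, reserved=10))
```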


10. Common Errors​

| Error | Why it happens | How to avoid |
|---|---|---|
| ACI with insufficient sizing failing | Estimating CPU/memory without testing | Test with load before prod; monitor metrics in first week |
| ACA scaling to maxReplicas and saturating database | Not considering downstream limits | Define maxReplicas based on dependency limits |
| ACI container not starting due to OOMKilled | Insufficient configured memory | Monitor logs; increase memory if OOMKilled is frequent |
| Scale-to-zero causing user timeouts | Unexpected cold start in prod | Use minReplicas: 1 for latency-sensitive public endpoints |
| ACI being used where ACA would be better | Lack of ACA knowledge | ACI for jobs; ACA for long-running services with scaling |
| Too aggressive scaling rule causing flapping | Too low threshold for scale-in | Increase scale-in threshold or use longer stabilization period |
| High CPU but scaling not happening | CPU scaling threshold set above the utilization the app actually reaches | Review the rule; scale-out triggers only when utilization exceeds the threshold |
| Unexpectedly high ACA cost | Too high maxReplicas reached during peak | Monitor replica count; adjust maxReplicas based on load testing |

11. Operations and Maintenance​

Monitor ACI​

# View state and events of a Container Instance
az container show \
--resource-group "rg-containers" \
--name "aci-processador" \
--query "{State: instanceView.state, Events: instanceView.events}" \
--output json

# Real-time log streaming
az container attach \
--resource-group "rg-containers" \
--name "aci-processador"

# Execute command inside container (debug)
az container exec \
--resource-group "rg-containers" \
--name "aci-processador" \
--exec-command "/bin/bash"

Monitor ACA​

# View currently active replicas
az containerapp replica list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table

# View revision history
az containerapp revision list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table

# View Container App logs (via Log Analytics)
az monitor log-analytics query \
--workspace "<workspace-id>" \
--analytics-query "
ContainerAppConsoleLogs_CL
| where ContainerAppName_s == 'api-gateway'
| where TimeGenerated > ago(1h)
| project TimeGenerated, Log_s
| order by TimeGenerated desc
| limit 50"

# View CPU and memory metrics
az monitor metrics list \
--resource "/subscriptions/<sub-id>/resourceGroups/rg-containers/providers/Microsoft.App/containerApps/api-gateway" \
--metric "CpuPercentage" \
--interval PT5M \
--output table

Important limits​

| Service | Limit | Value |
|---|---|---|
| ACI | vCPU per container | 4 |
| ACI | Memory per container (Linux) | 16 GB |
| ACI | Containers per Container Group | 60 |
| ACI | Container Groups per region | 100 (default, increasable) |
| ACA | vCPU per container | 2.0 |
| ACA | Memory per container | 4.0 Gi |
| ACA | maxReplicas per Container App | 300 |
| ACA | Container Apps per Environment | 20 (default) |
| ACA | Revisions per Container App | 100 |

12. Integration and Automation​

ACI as job executor in pipelines​


ACA with KEDA for event-driven processing​


Terraform for ACA management at scale​

resource "azurerm_container_app_environment" "prod" {
name = "env-production"
location = "brazilsouth"
resource_group_name = azurerm_resource_group.main.name
log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id
}

resource "azurerm_container_app" "api" {
name = "api-gateway"
container_app_environment_id = azurerm_container_app_environment.prod.id
resource_group_name = azurerm_resource_group.main.name
revision_mode = "Single"

template {
container {
name = "api-gateway"
image = "myregistry.azurecr.io/api-gateway:v1.0"
cpu = 0.5
memory = "1Gi"
}

min_replicas = 1
max_replicas = 10

http_scale_rule {
name = "http-scaling"
concurrent_requests = 50
}
}

ingress {
external_enabled = true
target_port = 8080
traffic_weight {
percentage = 100
latest_revision = true
}
}
}

13. Final Summary​

Essential points:

  • Azure Container Instances (ACI): simple service to run containers on-demand; sizing defined at creation (CPU and memory per container); no automatic scaling; ideal for jobs, one-time tasks, and workloads without orchestration
  • Azure Container Apps (ACA): managed platform with automatic scaling via HTTP, CPU, memory, or external events (KEDA); supports scale-to-zero; ideal for long-running services, APIs, and event-driven processing
  • ACI sizing is free-form within limits (0.1 to 4 vCPU, 0.1 to 16 GB); ACA sizing follows predefined proportions (1 vCPU = 2 Gi memory)
  • Scale-to-zero in ACA eliminates cost when there's no demand but introduces cold start latency; for endpoints with a high-availability SLA, avoid it by setting minReplicas >= 1
  • Resizing ACI requires deleting and recreating the container; resizing ACA is done via update that automatically generates a new revision

Critical differences:

  • ACI vs. ACA: ACI is for one-time executions without management; ACA is for continuous services with automatic scaling
  • Container Group (ACI) vs. Container App (ACA): Container Group is a set of containers on the same host; Container App is a deployment unit that can have N identical replicas
  • HTTP scaling vs. KEDA Event: HTTP scaling responds to concurrent requests; KEDA event scaling responds to external system metrics like queues and topics

What needs to be remembered for AZ-104:

  • ACI charges per vCPU and memory per second of execution
  • ACA charges per vCPU and memory per active replica (scale-to-zero = zero cost)
  • In ACI, maximum ratio is 1 vCPU : 4 GB memory
  • In ACA, ratio is always 1 vCPU : 2 Gi memory (fixed)
  • maxReplicas limit in ACA is 300 per Container App
  • ACI supports GPU; ACA doesn't support GPU
  • ACI restart policies: Always, Never, OnFailure
  • ACA scaling is managed internally by KEDA (Kubernetes Event-Driven Autoscaling)