Theoretical Foundation: Manage Sizing and Scaling for Containers, Including Azure Container Instances and Azure Container Apps
1. Initial Intuition
Imagine you have a restaurant. Each dish you serve is prepared in an isolated and fully equipped kitchen. When a customer arrives, you open a kitchen, prepare the dish, and close the kitchen. When 100 customers arrive at the same time, you open 100 kitchens in parallel.
Containers work like this: they are isolated and complete environments that package an application with everything it needs to run. What changes between Azure services is who manages these kitchens and at what level of control you operate.
Sizing is deciding the size of each kitchen: how much CPU and memory each container needs to function properly.
Scaling is deciding how many kitchens to operate simultaneously: one when there's low demand, one hundred when there's a peak.
In Azure, two services stand out for this purpose:
- Azure Container Instances (ACI): you request a container and Azure delivers it in seconds. Full control over sizing, no automatic scaling. Ideal for one-time tasks.
- Azure Container Apps (ACA): a managed platform that handles automatic scaling based on metrics, events, or simply traffic demand. Ideal for long-running applications.
2. Context
Where these services fit in Azure's container ecosystem
The choice between these services primarily depends on three factors: desired level of control, ability to scale automatically, and acceptable operational complexity.
Why sizing and scaling matter
Inadequate sizing wastes money (over-provisioned) or causes failures (under-provisioned). A container with 0.5 vCPU trying to process intensive requests will be slow or fail. A container with 4 vCPUs for a simple reading task generates unnecessary cost.
Inadequate scaling means running out of capacity at peak (users receive errors) or paying for idle capacity in valleys (unnecessary cost). The goal of scaling is to maintain the balance between performance and cost over time.
3. Building the Concepts
3.1 Azure Container Instances (ACI)
ACI is Azure's simplest container service. You specify a container image and the necessary resources, and Azure runs the container directly, without needing to manage VMs, clusters, or orchestrators.
Container Groups
The fundamental unit of ACI is the Container Group: a set of containers that are scheduled on the same host, sharing the same lifecycle, local network, and storage.
Sizing in ACI is defined per container within the group. The Container Group allocates the sum of all resources from the containers it contains.
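As a rough illustration of that sum, the sketch below totals the requests of a two-container group, using the same values as the multi-container YAML example in the Azure CLI section (1.0 vCPU / 2.0 GB plus 0.5 vCPU / 0.5 GB). The `ContainerRequest` type and `group_allocation` function are hypothetical helpers written for this example, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerRequest:
    name: str
    cpu: float        # vCPU requested by this container
    memory_gb: float  # memory requested by this container, in GB


def group_allocation(containers):
    """Return (total_cpu, total_memory_gb) for one container group."""
    # Round to one decimal to avoid float noise such as 0.30000000000000004.
    total_cpu = round(sum(c.cpu for c in containers), 1)
    total_memory = round(sum(c.memory_gb for c in containers), 1)
    return total_cpu, total_memory


group = [
    ContainerRequest("app-web", 1.0, 2.0),
    ContainerRequest("sidecar-logger", 0.5, 0.5),
]
print(group_allocation(group))  # (1.5, 2.5)
```

The group as a whole, not each container on its own, is what has to fit within the service limits.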
Sizing in ACI: limits and configurations
Resources are configured individually per container:
| Resource | Minimum | Maximum (Linux) | Maximum (Windows) |
|---|---|---|---|
| vCPU | 0.1 | 4 | 4 |
| Memory (GB) | 0.1 | 16 | 14 |
| GPU | Optional | 4 K80 / 2 V100 | Not supported |
Sizing rules in ACI:
- CPU must be a number with up to one decimal place (0.1, 0.5, 1.0, 2.5, 4.0)
- Memory must be a multiple of 0.1 GB
- The combination of CPU and memory must follow the maximum ratio of 1:4 (1 vCPU for up to 4 GB of memory)
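The rules above can be checked before a deployment is attempted. The following is a minimal sketch that encodes only the limits and rules stated in this section (they are taken from the table and list above, not queried from Azure); `validate_aci_sizing` is a hypothetical helper name.

```python
def _is_tenth(value):
    """True when value has at most one decimal place (multiple of 0.1)."""
    return abs(value * 10 - round(value * 10)) < 1e-9


def validate_aci_sizing(cpu, memory_gb, os_type="Linux"):
    """Return a list of rule violations; an empty list means the request passes."""
    # Maximums as listed in the table above (assumed, may vary by region).
    max_memory = {"Linux": 16.0, "Windows": 14.0}[os_type]
    problems = []
    if not 0.1 <= cpu <= 4.0:
        problems.append("cpu must be between 0.1 and 4 vCPU")
    if not 0.1 <= memory_gb <= max_memory:
        problems.append(f"memory must be between 0.1 and {max_memory:g} GB on {os_type}")
    if not _is_tenth(cpu):
        problems.append("cpu must have at most one decimal place")
    if not _is_tenth(memory_gb):
        problems.append("memory must be a multiple of 0.1 GB")
    if memory_gb > cpu * 4 + 1e-9:
        problems.append("memory exceeds 4 GB per vCPU")
    return problems


print(validate_aci_sizing(2, 4))    # []
print(validate_aci_sizing(0.5, 4))  # ['memory exceeds 4 GB per vCPU']
```

Returning a list instead of raising on the first problem lets a pipeline report every violation in one pass.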
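Restart policy types in ACI
ACI has three restart policies. They do not change the sizing itself, but they determine how long the container group keeps running and therefore how long its CPU and memory are billed: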
| Policy | Behavior | Use case |
|---|---|---|
| Always | Automatically restarts if the container stops | Long-running services |
| Never | Runs once and doesn't restart | One-time jobs, initialization scripts |
| OnFailure | Restarts only if it terminates with an error | Batch processing with retry |
3.2 Azure Container Apps (ACA)
ACA is a serverless platform for containers that abstracts the underlying infrastructure (based on Kubernetes and KEDA underneath) and offers sophisticated automatic scaling.
ACA Hierarchy
Container Apps Environment is the logical container that groups multiple Container Apps. Resources like VNet, Log Analytics Workspace, and Dapr configurations are shared at the Environment level.
Container App is the deployment unit: an application composed of one or more revisions (immutable versions of the deployment).
Replica is a running instance of a Container App. Scaling increases or decreases the number of replicas.
Sizing in ACA: CPU and Memory
In ACA, sizing is defined per Container App and follows predefined combinations:
| vCPU | Available Memory |
|---|---|
| 0.25 | 0.5 Gi |
| 0.5 | 1.0 Gi |
| 0.75 | 1.5 Gi |
| 1.0 | 2.0 Gi |
| 1.25 | 2.5 Gi |
| 1.5 | 3.0 Gi |
| 1.75 | 3.5 Gi |
| 2.0 | 4.0 Gi |
The ratio is always 1 vCPU to 2 Gi of memory. You cannot combine freely like in ACI.
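Because only the listed pairs are valid, choosing a size amounts to picking the smallest pair that covers both requirements. The sketch below hard-codes the table above; `ACA_COMBINATIONS` and `smallest_aca_size` are hypothetical names used for illustration.

```python
# (vCPU, memory in Gi) pairs copied from the table above, smallest first.
ACA_COMBINATIONS = [
    (0.25, 0.5), (0.5, 1.0), (0.75, 1.5), (1.0, 2.0),
    (1.25, 2.5), (1.5, 3.0), (1.75, 3.5), (2.0, 4.0),
]


def smallest_aca_size(cpu_needed, memory_gi_needed):
    """Return the smallest (vCPU, Gi) pair covering both needs, or None."""
    for cpu, memory_gi in ACA_COMBINATIONS:
        if cpu >= cpu_needed and memory_gi >= memory_gi_needed:
            return cpu, memory_gi
    return None  # the workload does not fit any listed combination


print(smallest_aca_size(0.3, 1.2))  # (0.75, 1.5)
print(smallest_aca_size(1.0, 5.0))  # None
```

Note that a memory-heavy workload forces a larger vCPU allocation as well, since the two values move together.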
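Scaling in ACA: the four trigger types
ACA offers four scaling triggers (HTTP, CPU, memory, and external events), all configurable without code. CPU and memory are described together because they work the same way: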
HTTP Scaling: scales based on the number of concurrent HTTP requests per replica. It's the simplest and most used trigger for APIs and web apps. When concurrentRequests exceeds the threshold per replica, ACA adds more replicas.
CPU/Memory Scaling: scales based on resource utilization. Useful for processing-intensive workloads where the load manifests as CPU or memory usage.
Event-Driven Scaling (KEDA): scales based on external metrics. For example, when a Service Bus queue has more than 100 pending messages, scale-out to process faster. When the queue is empty, scale-in to zero replicas.
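The replica count these triggers aim for can be approximated with one formula: divide the observed metric by the per-replica target, round up, and clamp the result to the configured bounds. The sketch below is a simplified model under that assumption, not ACA's or KEDA's actual algorithm (it ignores polling intervals and scale-step limits); `desired_replicas` and its parameter names are hypothetical.

```python
import math


def desired_replicas(metric_value, target_per_replica, min_replicas, max_replicas):
    """Approximate replica count for a metric and a per-replica target."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(metric_value / target_per_replica)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# HTTP rule: 420 concurrent requests, threshold of 50 per replica.
print(desired_replicas(420, 50, min_replicas=1, max_replicas=10))   # 9
# Queue rule: empty queue, 10 messages per replica, scale-to-zero allowed.
print(desired_replicas(0, 10, min_replicas=0, max_replicas=20))     # 0
# Queue rule: large backlog is capped at maxReplicas.
print(desired_replicas(1000, 10, min_replicas=0, max_replicas=20))  # 20
```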
Scale-to-Zero
ACA supports scale-to-zero: when there's no demand, the number of replicas can be reduced to zero, completely eliminating compute costs. The first request after zero replicas has a small cold start latency.
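Exception: CPU and memory scaling rules cannot take an app to zero, because with no running replica there is no utilization to measure. An app with external HTTP ingress can still scale to zero; the first request after an idle period then pays the cold start, which is why latency-sensitive public endpoints usually keep minReplicas at 1 or more.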
4. Structural View
Architectural comparison: ACI vs. ACA
ACA scaling decision flow
5. Practical Operation
Container Instance lifecycle (ACI)
Revision lifecycle in ACA
ACA uses the concept of revisions: each deployment generates a new immutable revision. Scaling applies to a specific revision.
Non-obvious behaviors
In ACI, resizing resources requires recreating the container. It's not possible to change CPU or memory of a running ACI container. You need to stop, delete, and recreate with the new values. Plan adequate sizing before production deployment.
In ACA, scaling has a stabilization period (cooldown). To avoid flapping (repeated scale-out and scale-in in short periods), ACA waits before scaling in: by default the cooldown is 300 seconds. After a load drop, extra replicas therefore remain for up to 5 minutes before being removed.
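The effect of a cooldown can be seen in a small simulation. This is a simplified model written for illustration, assuming scale-out is applied immediately and scale-in only after the desired count has stayed below the current count for the whole cooldown; the real platform logic is more involved. `replicas_after_cooldown` is a hypothetical function, and the cooldown is a parameter so the sketch does not depend on any platform default.

```python
def replicas_after_cooldown(samples, cooldown_seconds):
    """Apply a scale-in cooldown to a series of desired replica counts.

    samples: list of (timestamp_seconds, desired_replicas) in time order.
    Returns a list of (timestamp_seconds, actual_replicas).
    """
    actual = []
    current = None       # replicas currently running
    below_since = None   # when desired first dropped below current
    for ts, desired in samples:
        if current is None or desired >= current:
            # Scale-out (or no change) takes effect immediately.
            current = desired
            below_since = None
        else:
            if below_since is None:
                below_since = ts
            if ts - below_since >= cooldown_seconds:
                # Load stayed low for the whole cooldown: scale in.
                current = desired
                below_since = None
        actual.append((ts, current))
    return actual


samples = [(0, 5), (60, 2), (180, 2), (360, 2)]
# Illustrative cooldown value; pass whatever your environment uses.
print(replicas_after_cooldown(samples, cooldown_seconds=300))
# [(0, 5), (60, 5), (180, 5), (360, 2)]
```

The extra replicas between t=60 and t=360 are billed even though the load no longer needs them, which is the cost of avoiding flapping.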
ACA charges for vCPU and memory consumed, not for declared replicas.
If you declare maxReplicas: 10 but there are only 2 active replicas, you pay for 2 replicas. The cost is proportional to actual usage, not the configured maximum.
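A back-of-the-envelope calculation makes the difference concrete. The rates in this sketch are placeholders chosen for the example, not Azure prices, and `aca_compute_cost` is a hypothetical helper; it assumes cost is simply active replica-seconds multiplied by the per-second charge for each replica's vCPU and memory.

```python
def aca_compute_cost(replica_seconds, vcpu, memory_gi, vcpu_rate, gib_rate):
    """Cost of active replicas: seconds run x per-second charge per replica."""
    per_replica_second = vcpu * vcpu_rate + memory_gi * gib_rate
    return replica_seconds * per_replica_second


# Placeholder rates per vCPU-second and per GiB-second (NOT real prices).
VCPU_RATE = 0.00002
GIB_RATE = 0.000003

one_hour = 3600
# 2 active replicas for one hour, each 0.5 vCPU / 1.0 Gi.
actual = aca_compute_cost(2 * one_hour, 0.5, 1.0, VCPU_RATE, GIB_RATE)
# What 10 replicas (the configured maximum) would cost if all were active.
at_maximum = aca_compute_cost(10 * one_hour, 0.5, 1.0, VCPU_RATE, GIB_RATE)

print(f"actual: {actual:.4f}  at maxReplicas: {at_maximum:.4f}")
```

Under this model the bill follows the 2 active replicas; the configured maximum of 10 only sets the ceiling.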
ACI Container Groups in VNet don't have public IP available. When you deploy a Container Group in a VNet (with subnet delegation), it loses the public IP. External access must be done via Load Balancer or Application Gateway in the same VNet.
Scale-to-zero in ACA has cold start, but it can be reduced. Typical cold start is 2 to 10 seconds depending on image size and application initialization time. To minimize it, use small images (for example, alpine-based) and keep application startup work short.
6. Implementation Methods
Azure Portal
ACI through portal:
- Portal > Container Instances > + Create
- Select subscription, RG, name, region, OS (Linux/Windows)
- Select image (Docker Hub, ACR, or other registry)
- Configure: CPU cores and Memory (GB)
- Configure networking: public IP, DNS label, ports
- Configure restart policy and environment variables
- Review + Create
ACA through portal:
- Portal > Container Apps > + Create
- Create or select Container Apps Environment
- Configure Container App: name, image, CPU and memory
- Configure Ingress (HTTP/HTTPS, external/internal)
- Configure Scale: minReplicas, maxReplicas
- Add scaling rules (HTTP, CPU, event)
- Review + Create
Azure CLI
# ACI: Create Container Instance with specific sizing
az container create \
--resource-group "rg-containers" \
--name "aci-processador" \
--image "myregistry.azurecr.io/processador:v1.0" \
--cpu 2 \
--memory 4 \
--restart-policy OnFailure \
--environment-variables BATCH_SIZE=100 \
--registry-login-server "myregistry.azurecr.io" \
--registry-username "<username>" \
--registry-password "<password>" \
--location "brazilsouth"
# ACI: Create Container Group with multiple containers (via YAML)
cat > container-group.yaml << EOF
apiVersion: 2021-10-01
name: container-group-prod
type: Microsoft.ContainerInstance/containerGroups
location: brazilsouth
properties:
  containers:
  - name: app-web
    properties:
      image: myregistry.azurecr.io/webapp:v1.0
      resources:
        requests:
          cpu: 1.0
          memoryInGb: 2.0
      ports:
      - port: 8080
  - name: sidecar-logger
    properties:
      image: myregistry.azurecr.io/logger:v1.0
      resources:
        requests:
          cpu: 0.5
          memoryInGb: 0.5
  osType: Linux
  restartPolicy: Always
  ipAddress:
    type: Public
    ports:
    - protocol: tcp
      port: 8080
EOF
az container create \
--resource-group "rg-containers" \
--file container-group.yaml
# ACI: View current state and resource usage
az container show \
--resource-group "rg-containers" \
--name "aci-processador" \
--query "{State: instanceView.state, CPU: containers[0].resources.requests.cpu, Memory: containers[0].resources.requests.memoryInGb}" \
--output json
# ACI: View container logs
az container logs \
--resource-group "rg-containers" \
--name "aci-processador"
# ACI: Delete (necessary for resize)
az container delete \
--resource-group "rg-containers" \
--name "aci-processador" \
--yes
# ACA: Create Environment
az containerapp env create \
--name "env-producao" \
--resource-group "rg-containers" \
--location "brazilsouth" \
--logs-workspace-id "<log-analytics-workspace-id>" \
--logs-workspace-key "<workspace-key>"
# ACA: Create Container App with HTTP scaling
az containerapp create \
--name "api-gateway" \
--resource-group "rg-containers" \
--environment "env-producao" \
--image "myregistry.azurecr.io/api-gateway:v1.0" \
--cpu 0.5 \
--memory 1.0Gi \
--min-replicas 1 \
--max-replicas 10 \
--ingress external \
--target-port 8080 \
--registry-server "myregistry.azurecr.io" \
--registry-username "<username>" \
--registry-password "<password>"
# ACA: Add HTTP scaling rule
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--scale-rule-name "http-scaling" \
--scale-rule-type "http" \
--scale-rule-http-concurrency 50
# ACA: Create Container App with scale-to-zero (queue processor)
az containerapp create \
--name "queue-processor" \
--resource-group "rg-containers" \
--environment "env-producao" \
--image "myregistry.azurecr.io/queue-processor:v1.0" \
--cpu 1.0 \
--memory 2.0Gi \
--min-replicas 0 \
--max-replicas 20
# ACA: Add Azure Service Bus Queue scaling rule
az containerapp update \
--name "queue-processor" \
--resource-group "rg-containers" \
--scale-rule-name "servicebus-scaling" \
--scale-rule-type "azure-servicebus" \
--scale-rule-metadata "queueName=fila-trabalho" "namespace=mynamespace" "messageCount=10" \
--scale-rule-auth "connection=servicebus-connection-secret"
# ACA: Update sizing (CPU and memory)
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--cpu 1.0 \
--memory 2.0Gi
# ACA: Update min/max replicas
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--min-replicas 2 \
--max-replicas 15
# ACA: View current number of replicas
az containerapp replica list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table
# ACA: View revisions and which is active
az containerapp revision list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table
Bicep
// Parameters referenced below; values are supplied at deployment time
param logAnalyticsWorkspaceId string
@secure()
param logAnalyticsKey string
param registryUsername string
@secure()
param registryPassword string
// ACI: Container Instance with sizing
resource containerInstance 'Microsoft.ContainerInstance/containerGroups@2021-10-01' = {
  name: 'aci-processador'
  location: 'brazilsouth'
  properties: {
    containers: [
      {
        name: 'processador'
        properties: {
          image: 'myregistry.azurecr.io/processador:v1.0'
          resources: {
            requests: {
              cpu: 2
              memoryInGB: 4
            }
          }
          environmentVariables: [
            { name: 'BATCH_SIZE', value: '100' }
          ]
        }
      }
    ]
    osType: 'Linux'
    restartPolicy: 'OnFailure'
    ipAddress: {
      type: 'Public'
      ports: []
    }
  }
}
// ACA Environment
resource environment 'Microsoft.App/managedEnvironments@2023-05-01' = {
  name: 'env-producao'
  location: 'brazilsouth'
  properties: {
    appLogsConfiguration: {
      destination: 'log-analytics'
      logAnalyticsConfiguration: {
        customerId: logAnalyticsWorkspaceId
        sharedKey: logAnalyticsKey
      }
    }
  }
}
// ACA: Container App with HTTP and CPU scaling
resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: 'api-gateway'
  location: 'brazilsouth'
  properties: {
    managedEnvironmentId: environment.id
    configuration: {
      ingress: {
        external: true
        targetPort: 8080
        transport: 'http'
      }
      // Secret that passwordSecretRef below points to
      secrets: [
        {
          name: 'registry-password'
          value: registryPassword
        }
      ]
      registries: [
        {
          server: 'myregistry.azurecr.io'
          username: registryUsername
          passwordSecretRef: 'registry-password'
        }
      ]
    }
    template: {
      containers: [
        {
          name: 'api-gateway'
          image: 'myregistry.azurecr.io/api-gateway:v1.0'
          resources: {
            // json() is used because Bicep has no decimal literal
            cpu: json('0.5')
            memory: '1.0Gi'
          }
          env: [
            { name: 'ENVIRONMENT', value: 'production' }
          ]
        }
      ]
      scale: {
        minReplicas: 1
        maxReplicas: 10
        rules: [
          {
            name: 'http-scaling'
            http: {
              metadata: {
                concurrentRequests: '50'
              }
            }
          }
          {
            name: 'cpu-scaling'
            custom: {
              type: 'cpu'
              metadata: {
                type: 'Utilization'
                value: '70'
              }
            }
          }
        ]
      }
    }
  }
}
7. Control and Security
Resource isolation between Container Apps
In ACA, Container Apps within the same Environment share the internal network and can communicate via service name. To isolate Container Apps from one another:
- Use separate Environments for complete network isolation
- Use VNet integration so the Environment is injected into an existing VNet
- Configure internal Ingress for Container Apps that shouldn't be accessible externally
Managed Identity for image pull
# Assign Managed Identity to Container App
az containerapp identity assign \
--name "api-gateway" \
--resource-group "rg-containers" \
--system-assigned
# Grant pull permission on ACR
PRINCIPAL_ID=$(az containerapp show \
--name "api-gateway" \
--resource-group "rg-containers" \
--query "identity.principalId" -o tsv)
az role assignment create \
--assignee "$PRINCIPAL_ID" \
--role "AcrPull" \
--scope "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.ContainerRegistry/registries/myregistry"
# Update Container App to use Managed Identity for registry
az containerapp registry set \
--name "api-gateway" \
--resource-group "rg-containers" \
--server "myregistry.azurecr.io" \
--identity system
Secrets in ACA
Credentials and sensitive configurations should be managed as secrets, not plain text environment variables:
# Create secret in Container App
az containerapp secret set \
--name "api-gateway" \
--resource-group "rg-containers" \
--secrets "db-password=<database-password>"
# Reference secret in environment variable
az containerapp update \
--name "api-gateway" \
--resource-group "rg-containers" \
--set-env-vars "DB_PASSWORD=secretref:db-password"
8. Decision Making
ACI vs. ACA
| Situation | Choice | Reason |
|---|---|---|
| Processing job that runs 10 minutes per day | ACI with restart policy Never | Serverless per execution, no management overhead |
| API that needs to scale from 0 to 100 instances with traffic | ACA with HTTP scaling | Managed automatic scaling, scale-to-zero possible |
| Queue processing that grows with messages | ACA with KEDA event scaling | Scales in proportion to the backlog, within minReplicas and maxReplicas |
| CI/CD task that consumes image and runs tests | ACI | Simple, fast for one-time tasks |
| Production web application with availability SLA | ACA with minReplicas >= 1 | Ensures there's always at least one active replica |
| Workload that can never have cold start | ACA with minReplicas >= 1 | Scale-to-zero introduces latency on first request |
| Container with GPU for machine learning | ACI | ACA doesn't support GPU; ACI does |
| Multiple containers with communication on same machine | ACI Container Group | Shared localhost network, sidecar pattern |
Sizing: how to define CPU and memory
| Workload type | Recommended CPU | Recommended Memory |
|---|---|---|
| Simple REST API, I/O bound | 0.25-0.5 vCPU | 0.5-1.0 Gi |
| API with light processing | 0.5-1.0 vCPU | 1.0-2.0 Gi |
| Data processing / compute-intensive | 1.0-2.0 vCPU | 2.0-4.0 Gi |
| Machine Learning inference | 2.0-4.0 vCPU | 4.0-8.0 Gi |
| Batch data transformation job | 1.0-4.0 vCPU | 2.0-8.0 Gi |
Scaling configuration in ACA
| Requirement | Configuration |
|---|---|
| Can never go offline | minReplicas: 1 or more |
| Can have acceptable cold start | minReplicas: 0 (scale-to-zero) |
| Variable but predictable traffic | HTTP scaling with adjusted concurrentRequests |
| Asynchronous queue processing | KEDA event scaling with messageCount threshold |
| Guaranteed baseline + additional scaling | minReplicas: 2, maxReplicas: 20 |
9. Best Practices
Define resource limits based on real load testing, not estimates. Proper sizing requires profiling: run your application with representative load and measure CPU and memory usage. A container sized by intuition usually results in over-provisioning (extra cost) or under-provisioning (failures).
In ACA, start with minReplicas: 1 for production. Scale-to-zero saves money but adds cold start latency. For public APIs, the first user's experience after an inactive period will be degraded. Evaluate the trade-off between cost and user experience.
Use ACA revisions for safe deployments. ACA supports traffic splitting between revisions (e.g., 90% on current revision, 10% on new). This enables blue/green deployments or canary releases without additional infrastructure.
# Configure traffic splitting between revisions
az containerapp ingress traffic set \
--name "api-gateway" \
--resource-group "rg-containers" \
--revision-weight "api-gateway--revision1=90" "api-gateway--revision2=10"
Configure conservative memory limits to detect memory leaks. If a container consistently uses 90%+ of configured memory, either sizing needs to be increased or there's a memory leak that needs investigation. Monitor memory usage over time.
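One way to turn "consistently uses 90%+" into a check is to look at the share of recent samples above that mark. The sketch below assumes memory samples are already collected (for example, exported from your monitoring tool); `memory_pressure` and the `min_share` cutoff of 80% of samples are hypothetical choices for illustration, not values from Azure.

```python
def memory_pressure(samples_gi, limit_gi, threshold=0.9, min_share=0.8):
    """True when at least min_share of samples reach threshold x limit."""
    if not samples_gi:
        return False  # no data, no verdict
    high = sum(1 for used in samples_gi if used >= threshold * limit_gi)
    return high / len(samples_gi) >= min_share


# Nine of ten samples at 0.95 Gi against a 1.0 Gi limit: sustained pressure.
print(memory_pressure([0.95] * 9 + [0.5], limit_gi=1.0))  # True
# A single spike among low readings does not count as sustained.
print(memory_pressure([0.5, 0.6, 0.95], limit_gi=1.0))    # False
```

A positive result only says the container runs close to its limit; telling a leak from a legitimately larger working set still requires looking at the trend over time.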
For ACI in production, use container groups with VNet integration. ACI with direct public IP is suitable for testing. In production, integrate to VNet for network access control and use Private DNS.
In ACA, define maxReplicas based on backend capacity. If your Container App depends on a database with 100 maximum connections, keep maxReplicas at or below that limit divided by the connections each replica opens. With 3 connections per replica, 33 replicas use 99 connections, while 50 replicas would open 150 and overload the database.
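The arithmetic for the database example (100 connections, 3 per replica) can be written as a small helper. `max_replicas_for_backend` is a hypothetical function, and the `reserved` parameter (connections kept aside for other clients such as admin tools) is an assumption added for illustration.

```python
def max_replicas_for_backend(connection_limit, connections_per_replica, reserved=0):
    """Largest replica count that stays within the backend connection limit."""
    if connections_per_replica <= 0:
        raise ValueError("connections_per_replica must be positive")
    usable = connection_limit - reserved
    # Integer division rounds down, so the limit is never exceeded.
    return max(0, usable // connections_per_replica)


print(max_replicas_for_backend(100, 3))               # 33
print(max_replicas_for_backend(100, 3, reserved=10))  # 30
```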
10. Common Errors
| Error | Why it happens | How to avoid |
|---|---|---|
| ACI with insufficient sizing failing | Estimating CPU/memory without testing | Test with load before prod; monitor metrics in first week |
| ACA scaling to maxReplicas and saturating database | Not considering downstream limits | Define maxReplicas based on dependency limits |
| ACI container not starting due to OOMKilled | Insufficient configured memory | Monitor logs; increase memory if frequent OOMKilled |
| Scale-to-zero causing user timeouts | Unexpected cold start in prod | Use minReplicas: 1 for latency-sensitive public endpoints |
| ACI being used where ACA would be better | Lack of ACA knowledge | ACI for jobs; ACA for long-running services with scaling |
| Too aggressive scaling rule causing flapping | Too low threshold for scale-in | Increase scale-in threshold or use longer stabilization period |
| 100% CPU but scaling not happening | No CPU scaling rule defined, or maxReplicas already reached | Review configuration; confirm a CPU rule exists with a threshold below the observed usage, and raise maxReplicas if it is the limit |
| Unexpectedly high ACA cost | Too high maxReplicas reached during peak | Monitor replica count; adjust maxReplicas based on load testing |
11. Operations and Maintenance
Monitor ACI
# View state and events of a Container Instance
az container show \
--resource-group "rg-containers" \
--name "aci-processador" \
--query "{State: instanceView.state, Events: instanceView.events}" \
--output json
# Real-time log streaming
az container attach \
--resource-group "rg-containers" \
--name "aci-processador"
# Execute command inside container (debug)
az container exec \
--resource-group "rg-containers" \
--name "aci-processador" \
--exec-command "/bin/bash"
Monitor ACA
# View currently active replicas
az containerapp replica list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table
# View revision history
az containerapp revision list \
--name "api-gateway" \
--resource-group "rg-containers" \
--output table
# View Container App logs (via Log Analytics)
az monitor log-analytics query \
--workspace "<workspace-id>" \
--analytics-query "
ContainerAppConsoleLogs_CL
| where ContainerAppName_s == 'api-gateway'
| where TimeGenerated > ago(1h)
| project TimeGenerated, Log_s
| order by TimeGenerated desc
| limit 50"
# View CPU and memory metrics
az monitor metrics list \
--resource "/subscriptions/<sub-id>/resourceGroups/rg-containers/providers/Microsoft.App/containerApps/api-gateway" \
--metric "CpuPercentage" \
--interval PT5M \
--output table
Important limits
| Service | Limit | Value |
|---|---|---|
| ACI | vCPU per container | 4 |
| ACI | Memory per container (Linux) | 16 GB |
| ACI | Containers per Container Group | 60 |
| ACI | Container Groups per region | 100 (default, increasable) |
| ACA | vCPU per container | 2.0 |
| ACA | Memory per container | 4.0 Gi |
| ACA | maxReplicas per Container App | 300 |
| ACA | Container Apps per Environment | 20 (default) |
| ACA | Revisions per Container App | 100 |
12. Integration and Automation
ACI as job executor in pipelines
ACA with KEDA for event-driven processing
Terraform for ACA management at scale
resource "azurerm_container_app_environment" "prod" {
  name                       = "env-production"
  location                   = "brazilsouth"
  resource_group_name        = azurerm_resource_group.main.name
  log_analytics_workspace_id = azurerm_log_analytics_workspace.main.id
}
resource "azurerm_container_app" "api" {
  name                         = "api-gateway"
  container_app_environment_id = azurerm_container_app_environment.prod.id
  resource_group_name          = azurerm_resource_group.main.name
  revision_mode                = "Single"
  template {
    container {
      name   = "api-gateway"
      image  = "myregistry.azurecr.io/api-gateway:v1.0"
      cpu    = 0.5
      memory = "1Gi"
    }
    min_replicas = 1
    max_replicas = 10
    http_scale_rule {
      name                = "http-scaling"
      concurrent_requests = 50
    }
  }
  ingress {
    external_enabled = true
    target_port      = 8080
    traffic_weight {
      percentage      = 100
      latest_revision = true
    }
  }
}
13. Final Summary
Essential points:
- Azure Container Instances (ACI): simple service to run containers on-demand; sizing defined at creation (CPU and memory per container); no automatic scaling; ideal for jobs, one-time tasks, and workloads without orchestration
- Azure Container Apps (ACA): managed platform with automatic scaling via HTTP, CPU, memory, or external events (KEDA); supports scale-to-zero; ideal for long-running services, APIs, and event-driven processing
- ACI sizing is free-form within limits (0.1 to 4 vCPU, 0.1 to 16 GB); ACA sizing follows predefined proportions (1 vCPU = 2 Gi memory)
- Scale-to-zero in ACA eliminates cost when there's no demand but introduces cold start latency; avoid it for endpoints with a high-availability SLA by using minReplicas >= 1
- Resizing ACI requires deleting and recreating the container; resizing ACA is done via update that automatically generates a new revision
Critical differences:
- ACI vs. ACA: ACI is for one-time executions without management; ACA is for continuous services with automatic scaling
- Container Group (ACI) vs. Container App (ACA): Container Group is a set of containers on the same host; Container App is a deployment unit that can have N identical replicas
- HTTP scaling vs. KEDA Event: HTTP scaling responds to concurrent requests; KEDA event scaling responds to external system metrics like queues and topics
What needs to be remembered for AZ-104:
- ACI charges per vCPU and memory per second of execution
- ACA charges per vCPU and memory per active replica (scale-to-zero = zero cost)
- In ACI, maximum ratio is 1 vCPU : 4 GB memory
- In ACA, ratio is always 1 vCPU : 2 Gi memory (fixed)
- maxReplicas limit in ACA is 300 per Container App
- ACI supports GPU; ACA doesn't support GPU
- ACI restart policies: Always, Never, OnFailure
- ACA scaling is managed internally by KEDA (Kubernetes Event-Driven Autoscaling)