Theoretical Foundation: Provision a container by using Azure Container Instances
1. Initial Intuition
Imagine you need to execute a specific task: process a file, send a report, or run a maintenance script. For this, you would normally need to provision a VM, configure the operating system, install dependencies, execute the task, and then shut down the VM. This entire process takes time and involves infrastructure management that has no relation to the task itself.
Azure Container Instances (ACI) eliminates this overhead. You tell Azure: "run this container with this image, with 1 CPU and 1.5 GB of RAM, please". Azure takes care of all the underlying infrastructure. In seconds, your container is running. When it finishes, you only pay for the time the container was active.
It's like renting a hotel room for one night, instead of buying an apartment. No property management, no long-term contracts, pay only for the time you used.
2. Context
ACI occupies a specific position in the Azure compute ecosystem:
ACI is not a replacement for AKS in complex production applications, nor for App Service in long-running web applications. It fills the niche of simple executions, short-duration tasks, development/test environments, and batch processes.
ACI also has a special relationship with AKS through Virtual Kubelet: AKS can use ACI as a "virtual node" to scale load bursts without needing to provision additional physical nodes.
3. Building the Concepts
3.1 Container Group: The Fundamental Unit
In ACI, the deployment unit is the Container Group, not the individual container. A container group is a set of containers that share the same host, the same network, and the same volumes. It's the equivalent of a Pod in Kubernetes.
For simple cases (a single container), you work with a container group of a single container, but the concept exists to support sidecar patterns where an auxiliary container complements the primary one.
3.2 OS Types: Linux and Windows
ACI supports both Linux and Windows containers. Most container workloads use Linux. Windows containers are supported but with some limitations:
| Aspect | Linux | Windows |
|---|---|---|
| Available sizes | More options | Smaller subset |
| Cost | Lower | Higher |
| Base images | Alpine, Ubuntu, Debian | Windows Server Core, Nano Server |
| Use case | Most modern applications | Legacy .NET Framework applications |
3.3 Network Types in ACI
ACI has three network configuration options:
| Type | Description | Access |
|---|---|---|
| Public | Public IP assigned to container group | Accessible from internet |
| Private (VNet) | Container injected into a VNet subnet | Only within VNet and connected networks |
| None | No IP address assigned to the container group | No inbound access (isolated processing) |
The VNet (Private) mode is implemented through subnet delegation: the subnet must have the Microsoft.ContainerInstance/containerGroups delegation configured to receive ACI containers.
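Before deploying into a subnet, it can help to confirm that the delegation is actually present. The following is a minimal sketch, assuming the subnet's delegated service names arrive on standard input, one per line. The helper name `has_aci_delegation` is hypothetical, and the commented Azure CLI call reuses example resource names from this document.

```shell
# Hypothetical helper: succeeds if stdin lists the ACI delegation service name.
has_aci_delegation() {
  grep -Fqx 'Microsoft.ContainerInstance/containerGroups'
}

# Example use with the Azure CLI (requires a logged-in session):
# az network vnet subnet show \
#   --resource-group rg-networking --vnet-name vnet-production --name subnet-aci \
#   --query "delegations[].serviceName" -o tsv | has_aci_delegation \
#   && echo "subnet is delegated to ACI"
```

Keeping the check as a filter over text means it can be exercised without touching Azure at all.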
3.4 Restart Policy
Defines behavior when the container's main process terminates:
| Policy | Behavior | When to use |
|---|---|---|
| Always | Restarts whenever container stops | Long-running services |
| Never | Does not restart (terminates after completion) | One-time batch tasks |
| OnFailure | Restarts only on failure (exit code != 0) | Tasks that may fail and need retry |
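The table can be read as a small decision function. The sketch below only models the rule described above; it is not how ACI is implemented, and the function name `should_restart` is made up for illustration.

```shell
# Illustrative model of the restart policy table.
# Arguments: policy name, exit code of the container's main process.
# Prints "yes" if the container would be restarted, "no" otherwise.
should_restart() {
  policy=$1
  exit_code=$2
  case "$policy" in
    Always) echo yes ;;
    Never) echo no ;;
    OnFailure)
      if [ "$exit_code" -ne 0 ]; then echo yes; else echo no; fi
      ;;
    *)
      echo "unknown policy: $policy" >&2
      return 2
      ;;
  esac
}

should_restart OnFailure 1   # prints "yes"
should_restart OnFailure 0   # prints "no"
```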
3.5 Resources: CPU and Memory
ACI charges per vCPU and GB of memory per second of execution. The limits:
- CPU: from 0.1 to 4 vCPU per container (maximum 4 vCPU per container group)
- Memory: from 0.1 to 16 GB per container group (proportional to CPU)
- GPU: available in some sizes (K80, P100, V100) for ML workloads
The CPU/memory relationship has restrictions: it's not possible to allocate extreme proportions (lots of CPU with little memory or vice versa).
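Because billing is per allocated vCPU and GB of memory per second, the cost of a run can be estimated with simple arithmetic. The sketch below assumes the caller supplies the per-second prices; the rates used in the example are placeholders, not real Azure prices, and the function name `aci_cost` is hypothetical.

```shell
# Estimate the cost of one run from allocated resources and duration.
# Arguments: vCPU, memory (GB), duration (s), price per vCPU-second, price per GB-second.
# The prices are caller-supplied placeholders, not actual Azure rates.
aci_cost() {
  LC_ALL=C awk -v cpu="$1" -v mem="$2" -v secs="$3" -v cpu_rate="$4" -v mem_rate="$5" \
    'BEGIN { printf "%.4f\n", (cpu * cpu_rate + mem * mem_rate) * secs }'
}

# One hour at 1 vCPU and 1.5 GB, with made-up rates
aci_cost 1 1.5 3600 0.00001 0.000001   # prints 0.0414
```

The estimate depends only on what is allocated and for how long, which is why oversized allocations cost more even when the container sits idle.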
3.6 Environment Variables and Secrets
Containers receive configuration via environment variables. For sensitive information (passwords, API keys), ACI supports secure environment variables: the value is not shown in the container group's properties in the portal or CLI output, and can be read only from inside the container.
# Common variable (visible in the container group's properties)
--environment-variables ENVIRONMENT=production
# Secure variable (value hidden in the container group's properties)
--secure-environment-variables DB_PASSWORD=mypassword
3.7 Volumes and Persistence
By default, an ACI container's storage is ephemeral: data written inside the container is lost when it stops. For persistence, ACI supports:
| Volume type | Protocol | When to use |
|---|---|---|
| Azure Files | SMB | Sharing between containers and persistence |
| gitRepo (deprecated) | - | Git repository clone |
| emptyDir | - | Temporary sharing between containers in the same group |
| Secret | - | Mount secrets as files |
The Azure Files volume is mounted as an SMB share inside the container, allowing data to persist beyond the container's lifecycle.
4. Structural View
5. Practical Operation
ACI Container Lifecycle
Important and Non-Obvious Behaviors
Image pull happens on every creation: ACI doesn't maintain an image cache between executions. Every time a container group is created, the image is downloaded again from the registry. For large images (hundreds of MB), this adds startup time. To optimize, use smaller images (Alpine) and well-organized layers.
Stopped containers still incur storage cost: when a container group is in Stopped state, there's no charge for CPU and memory, but the container group still exists as a resource. To completely eliminate cost, delete the container group.
Managed Identity works in ACI: it's possible to assign a Managed Identity to a container group. This allows code inside the container to obtain Entra ID tokens to access other Azure services (Key Vault, Storage, etc.) without hardcoded credentials.
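From inside the container, a token request typically goes to the instance metadata endpoint that Azure documents for managed identities. The sketch below only builds the request URL; the helper name `imds_token_url` is hypothetical, the API version shown is an assumption taken from commonly published examples, and a user-assigned identity may also need a `client_id` parameter.

```shell
# Hypothetical helper: build the managed identity token URL for a target resource.
# Argument: the resource (audience) URI, for example https://vault.azure.net
imds_token_url() {
  printf 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=%s\n' "$1"
}

# Inside a container group that has an identity assigned:
# curl -s -H Metadata:true "$(imds_token_url https://vault.azure.net)"
```

The response is JSON containing an access token that the application can present to the target service.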
The FQDN depends on the DNS label you choose: when you create a container group with a public IP and a DNS label, Azure generates an FQDN in the format <dns-name-label>.<region>.azurecontainer.io. The label must be unique within the region. If you don't specify a DNS label, the container is reachable only by IP address.
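Given the format above, the expected FQDN can be derived ahead of deployment, for example to pre-populate configuration. This is a small sketch that assumes the plain `<dns-name-label>.<region>.azurecontainer.io` form; the function name `aci_fqdn` is made up for illustration.

```shell
# Hypothetical helper: derive the public FQDN from a DNS label and a region.
aci_fqdn() {
  printf '%s.%s.azurecontainer.io\n' "$1" "$2"
}

aci_fqdn my-unique-nginx brazilsouth   # prints my-unique-nginx.brazilsouth.azurecontainer.io
```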
6. Implementation Methods
6.1 Azure Portal
When to use: exploratory creation, quick tests, status verification.
Path: Create a resource > Containers > Container Instances
The wizard has tabs:
- Basics: image, name, OS type (Linux/Windows), region
- Networking: network type, ports, DNS label
- Advanced: environment variables, restart policy, custom commands, volumes
- Tags
6.2 Azure CLI
Create simple container with public IP:
az container create \
--name my-container \
--resource-group rg-containers \
--image nginx:alpine \
--cpu 0.5 \
--memory 0.5 \
--ip-address Public \
--ports 80 \
--dns-name-label my-unique-nginx \
--restart-policy Always \
--location brazilsouth
Create container using ACR image with Managed Identity:
# Create managed identity
az identity create \
--name id-aci-production \
--resource-group rg-containers
# Assign AcrPull to ACR for the identity
ACR_ID=$(az acr show --name myacr --resource-group rg-containers --query id -o tsv)
IDENTITY_ID=$(az identity show --name id-aci-production --resource-group rg-containers --query principalId -o tsv)
az role assignment create --assignee $IDENTITY_ID --role AcrPull --scope $ACR_ID
# Create container using the identity for the image pull
IDENTITY_RESOURCE_ID=$(az identity show --name id-aci-production --resource-group rg-containers --query id -o tsv)
az container create \
--name api-container \
--resource-group rg-containers \
--image myacr.azurecr.io/api-gateway:v1.2.3 \
--cpu 1 \
--memory 1.5 \
--ip-address Private \
--vnet vnet-production \
--subnet subnet-containers \
--assign-identity "$IDENTITY_RESOURCE_ID" \
--acr-identity "$IDENTITY_RESOURCE_ID" \
--restart-policy OnFailure
Create container with environment variables and Azure Files volume:
# Create Azure Files share
STORAGE_KEY=$(az storage account keys list \
--account-name myaccount \
--resource-group rg-containers \
--query "[0].value" -o tsv)
az storage share create \
--name container-data \
--account-name myaccount \
--account-key $STORAGE_KEY
# Create container with mounted volume (no --ip-address flag, so no IP address is assigned)
az container create \
--name processor \
--resource-group rg-containers \
--image myacr.azurecr.io/processor:latest \
--cpu 2 \
--memory 4 \
--restart-policy Never \
--environment-variables ENVIRONMENT=production BUCKET=results \
--secure-environment-variables DB_PASSWORD=mypassword \
--azure-file-volume-account-name myaccount \
--azure-file-volume-account-key $STORAGE_KEY \
--azure-file-volume-share-name container-data \
--azure-file-volume-mount-path /mnt/data
Check status and logs:
# Container status
az container show \
--name my-container \
--resource-group rg-containers \
--query "{Status:instanceView.state, IP:ipAddress.ip, FQDN:ipAddress.fqdn}" \
--output table
# Container logs (stdout/stderr)
az container logs \
--name my-container \
--resource-group rg-containers \
--follow # real-time streaming
# Access interactive shell in container (if still running)
az container exec \
--name my-container \
--resource-group rg-containers \
--exec-command "/bin/sh"
# Stop and start
az container stop --name my-container --resource-group rg-containers
az container start --name my-container --resource-group rg-containers
# Delete
az container delete --name my-container --resource-group rg-containers --yes
Deploy via YAML file (complex container groups):
# container-group.yaml
apiVersion: 2021-10-01
location: brazilsouth
name: my-container-group
properties:
  containers:
  - name: main-app
    properties:
      image: myacr.azurecr.io/api:v1.0
      resources:
        requests:
          cpu: 1
          memoryInGb: 1.5
      ports:
      - port: 80
        protocol: TCP
      environmentVariables:
      - name: ENVIRONMENT
        value: production
      - name: DB_PASSWORD
        secureValue: mypassword
      volumeMounts:
      - name: data
        mountPath: /mnt/data
  - name: log-sidecar
    properties:
      image: fluent/fluent-bit:latest
      resources:
        requests:
          cpu: 0.5
          memoryInGb: 0.5
      volumeMounts:
      - name: data
        mountPath: /mnt/logs
  osType: Linux
  ipAddress:
    type: Public
    ports:
    - port: 80
      protocol: TCP
    dnsNameLabel: my-app-prod
  restartPolicy: Always
  volumes:
  - name: data
    azureFile:
      shareName: container-data
      storageAccountName: myaccount
      storageAccountKey: <key>
type: Microsoft.ContainerInstance/containerGroups
# Deploy via YAML file
az container create \
--resource-group rg-containers \
--file container-group.yaml
6.3 PowerShell
# Create simple container
$container = New-AzContainerInstanceObject `
-Name "app-container" `
-Image "nginx:alpine" `
-RequestCpu 0.5 `
-RequestMemoryInGb 0.5 `
-Port @(New-AzContainerInstancePortObject -Port 80 -Protocol TCP)
New-AzContainerGroup `
-ResourceGroupName "rg-containers" `
-Name "my-container-group" `
-Location "brazilsouth" `
-Container @($container) `
-OsType Linux `
-IpAddressType Public `
-IPAddressDnsNameLabel "my-nginx-ps" `
-RestartPolicy Always
6.4 Bicep
// Parameters referenced by the resource below
param location string = resourceGroup().location

@secure()
param dbPassword string

resource containerGroup 'Microsoft.ContainerInstance/containerGroups@2023-05-01' = {
  name: 'my-container-group'
  location: location
  properties: {
    containers: [
      {
        name: 'app-container'
        properties: {
          image: 'myacr.azurecr.io/api:v1.0'
          resources: {
            requests: {
              cpu: 1
              memoryInGB: json('1.5') // Bicep has no decimal literals
            }
          }
          ports: [
            {
              port: 80
              protocol: 'TCP'
            }
          ]
          environmentVariables: [
            {
              name: 'ENVIRONMENT'
              value: 'production'
            }
            {
              name: 'DB_PASSWORD'
              secureValue: dbPassword // secure parameter
            }
          ]
        }
      }
    ]
    osType: 'Linux'
    ipAddress: {
      type: 'Public'
      ports: [
        {
          port: 80
          protocol: 'TCP'
        }
      ]
      dnsNameLabel: 'my-app-prod-${uniqueString(resourceGroup().id)}'
    }
    restartPolicy: 'Always'
  }
}
7. Control and Security
Authentication in Private Registries
To use ACR images, ACI can authenticate in three ways:
- Managed Identity (recommended): the container group has an identity with AcrPull role on ACR
- Service Principal: username = service principal ID, password = client secret
- Admin credentials: username = registry name, password = admin password (not recommended)
# Using Managed Identity (--acr-identity names the identity used for the image pull)
az container create \
--assign-identity <identity-resource-id> \
--acr-identity <identity-resource-id> \
...
Containers in VNet: Network Isolation
For workloads that need to access internal VNet resources (databases, private services), ACI in VNet mode is essential:
# Create subnet with delegation for ACI
az network vnet subnet create \
--name subnet-aci \
--vnet-name vnet-production \
--resource-group rg-networking \
--address-prefix 10.0.5.0/24 \
--delegations Microsoft.ContainerInstance/containerGroups
# Container in VNet
az container create \
--ip-address Private \
--vnet vnet-production \
--subnet subnet-aci \
...
Important restriction: a container group deployed in a VNet cannot also have a public IP. The choice is one or the other.
Security Initiatives
- Never use common environment variables for passwords; always use --secure-environment-variables
- For more complex secrets, use Managed Identity + Key Vault to fetch secrets at runtime
- Production containers should be in VNet, not with direct public IP
- Images should come from private registries (ACR), not directly from Docker Hub in production
8. Decision Making
ACI vs. other compute services
| Situation | Best choice | Reason |
|---|---|---|
| Short batch task (< 1 hour) | ACI | Pay per second, no infrastructure management |
| Long-running web application | App Service | Better for persistent workloads, built-in scaling |
| Complex microservices | AKS | Service mesh, advanced orchestration |
| Event-driven processing | Azure Functions | Better event integration, consumption billing |
| Development/testing | ACI | Quick creation/deletion, cost-effective |
| Burst scaling from AKS | ACI + Virtual Kubelet | Elastic scaling without additional nodes |
| Windows legacy applications | VM or ACI (Windows) | Depends on complexity and dependencies |
| GPU workloads | ACI (with GPU) or VM | ACI for simple ML inference, VM for training |
Key decision factors:
- Duration: < 1 hour → ACI, persistent → App Service/AKS
- Complexity: Single container → ACI, orchestration → AKS
- Network: Internet exposure → Public ACI, internal → VNet ACI
- State: Stateless → ACI, stateful → consider other options with persistent storage
ACI shines in scenarios where you need "compute on demand" without the overhead of managing virtual machines or complex orchestration platforms.
| Scenario | Recommended choice | Reason |
|---|---|---|
| Short-duration batch task | ACI with restartPolicy: Never | Pay only for execution time |
| Simple long-duration REST API | App Service Containers | More web features (SSL, custom domain, deployment slots) |
| Production microservices, multiple instances | AKS | Orchestration, auto scaling, service mesh |
| Container dev/test environment | ACI | Fast, no cluster overhead |
| Event processing (Function-like) | Azure Container Apps | Native scaling to zero, KEDA |
| Container that needs VNet resource access | ACI in VNet mode | Native VNet integration |
| Load burst in AKS | ACI via Virtual Kubelet | Fast scale out without new nodes |
When to use each restart policy?
| Scenario | Restart Policy | Reason |
|---|---|---|
| Web server, API | Always | Restarts if crashes or fails |
| Data processing job | Never | Runs once and terminates |
| Script that may have transient failure | OnFailure | Retry on failure, doesn't restart after success |
9. Best Practices
Use minimal images (Alpine, Distroless): smaller images start faster (less data to download), cost less in ACR storage, and have smaller security attack surface.
Prefer Managed Identity to explicit credentials: instead of passing passwords and API keys as environment variables, configure a Managed Identity on the container group and use it to access Key Vault, Storage, and other Azure services.
For batch tasks, use restartPolicy: Never and check exit code: a script that terminates with exit code 0 indicates success. Exit code other than zero indicates failure. Use this convention so ACI knows if the task was successful.
Define conservative CPU and memory limits: ACI charges for allocated resources, not usage. If you allocate 4 vCPU but use 0.5, you pay for 4. Size based on real application benchmarks.
Name DNS labels predictably and uniquely: the DNS label is part of the public FQDN. Use a pattern like <app>-<environment>-<unique-suffix> to avoid conflicts and facilitate identification.
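One way to apply the `<app>-<environment>-<unique-suffix>` pattern is to derive the suffix deterministically from something stable, such as the resource group name. This is a sketch under that assumption; the function name `dns_label` is hypothetical, and it does not check the length or character rules that Azure enforces on DNS labels.

```shell
# Hypothetical helper: build "<app>-<environment>-<suffix>" in lowercase.
# Arguments: app name, environment, seed string used to derive the suffix.
# The suffix is the POSIX cksum (CRC) of the seed, so the same seed always
# yields the same label.
dns_label() {
  app=$1
  env=$2
  seed=$3
  suffix=$(printf '%s' "$seed" | cksum | cut -d' ' -f1)
  printf '%s-%s-%s\n' "$app" "$env" "$suffix" | tr '[:upper:]' '[:lower:]'
}

dns_label api prod rg-containers
```

A checksum keeps the label reproducible across deployments, which is the same idea the Bicep example expresses with `uniqueString(resourceGroup().id)`.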
10. Common Errors
Using restart policy "Always" for batch tasks
A processing script terminates with exit code 0 (success). With Always, ACI immediately restarts. The script processes the same data over and over again, in a loop, consuming resources and potentially processing duplicate data. For batch tasks, use Never.
Not specifying DNS label and then not getting the FQDN
The container is created with public IP but without --dns-name-label. The IP changes every time the container group is recreated. The infra team needs to manually update all systems pointing to the IP. Always define a predictable DNS label for containers with public IP.
Image pull fails silently due to lack of ACR permission
The container group is created but stays in Pending state indefinitely. The "image pull failed" error is in logs but not obvious in the portal. The cause is that the container doesn't have credentials to access the private ACR. Configure the Managed Identity with AcrPull before creating the container.
Container in VNet cannot access internet for external APIs
A container in the VNet needs to call an external API. The outbound traffic from the subnet has no route to the internet (no NAT Gateway, no UDR to internet). The container gets stuck waiting for response. Containers in VNet that need internet access require explicit outbound configuration (NAT Gateway or Azure Firewall with UDR).
Storing data in the container filesystem without mounting a volume
A processing job writes results to /tmp/resultados. The job finishes. The container group is deleted. The results are lost. For persistence, always mount an Azure Files volume to directories where important data is written.
11. Operation and Maintenance
Monitoring
ACI sends metrics to Azure Monitor:
| Metric | What it measures |
|---|---|
| CpuUsage | vCPU used by container group |
| MemoryUsage | Memory in bytes in use |
| NetworkBytesReceivedPerSecond | Inbound traffic |
| NetworkBytesTransmittedPerSecond | Outbound traffic |
To access logs from a running container or after termination:
# Logs from main container
az container logs --name meu-container --resource-group rg-containers
# Logs from specific container in a group with multiple containers
az container logs \
--name meu-container-group \
--resource-group rg-containers \
--container-name sidecar-container
Logs are retained while the container group exists. After deleting the container group, logs are lost unless they were sent to a log service (Log Analytics, Azure Monitor).
Configure Log Analytics
WORKSPACE_ID=$(az monitor log-analytics workspace show \
--workspace-name law-monitoring \
--resource-group rg-monitoring \
--query customerId -o tsv)
WORKSPACE_KEY=$(az monitor log-analytics workspace get-shared-keys \
--workspace-name law-monitoring \
--resource-group rg-monitoring \
--query primarySharedKey -o tsv)
az container create \
--name meu-container \
--resource-group rg-containers \
--log-analytics-workspace $WORKSPACE_ID \
--log-analytics-workspace-key $WORKSPACE_KEY \
...
With Log Analytics configured, container logs are automatically sent to the workspace and can be queried with KQL even after the container is deleted.
Important Limits
| Item | Limit |
|---|---|
| CPUs per container group | 4 vCPU |
| Memory per container group | 16 GB |
| Containers per container group | 60 |
| Container groups per subscription per region | 100 (default, adjustable) |
| Maximum duration of a container group | No fixed limit (charges accrue for as long as the group runs) |
| Ports per container group | Multiple, but no overlap between containers |
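A planned container group can be checked against these limits before deployment. The sketch below hard-codes the figures from the table (4 vCPU, 16 GB, 60 containers) as assumptions; actual quotas may differ by region and subscription. The function name `check_group_limits` and the `cpu:memory` argument format are made up for illustration.

```shell
# Hypothetical pre-flight check against the limits listed in the table.
# Each argument describes one container as "cpu:memoryGB", e.g. 1:1.5
check_group_limits() {
  printf '%s\n' "$@" | LC_ALL=C awk -F: '
    { cpu += $1; mem += $2; n++ }
    END {
      if (n > 60)   { print "too many containers: " n; exit 1 }
      if (cpu > 4)  { print "cpu over limit: " cpu; exit 1 }
      if (mem > 16) { print "memory over limit: " mem; exit 1 }
      print "ok"
    }'
}

check_group_limits 1:1.5 0.5:0.5   # prints "ok"
```

The function returns a non-zero status when a limit is exceeded, so it can gate a deployment step in a script.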
12. Integration and Automation
ACI as Backend for Azure Logic Apps and Functions
ACI can be created and destroyed programmatically via REST API, making it ideal for job orchestration.
Virtual Kubelet: ACI as AKS Node
AKS can use ACI as a "virtual node" to absorb demand bursts without scaling actual cluster nodes:
# Enable Virtual Nodes addon in AKS
az aks enable-addons \
--resource-group rg-aks \
--name meu-cluster \
--addons virtual-node \
--subnet-name subnet-aci
With this, Pods with toleration virtual-kubelet.io/provider=azure are scheduled on ACI instead of physical nodes, allowing instant scale without node provisioning time.
Automation via ARM/Bicep in Pipelines
For data processing pipelines where each run is a container:
#!/bin/bash
# Pipeline: create container, wait for completion, collect result, delete
CONTAINER_NAME="job-$(date +%Y%m%d%H%M%S)"
RG="rg-containers"
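# Create container (no --ip-address flag: the job needs no inbound IP)
az container create \
--name "$CONTAINER_NAME" \
--resource-group "$RG" \
--image minhacr.azurecr.io/processador:latest \
--cpu 2 --memory 4 \
--restart-policy Never \
--environment-variables INPUT_FILE="arquivo-$(date +%Y%m%d).csv"
# Wait for completion: poll until the container reports the Terminated state
while true; do
STATE=$(az container show \
--name "$CONTAINER_NAME" \
--resource-group "$RG" \
--query "containers[0].instanceView.currentState.state" -o tsv)
[ "$STATE" = "Terminated" ] && break
sleep 10
done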
# Check exit code
EXIT_CODE=$(az container show \
--name $CONTAINER_NAME \
--resource-group $RG \
--query "containers[0].instanceView.currentState.exitCode" -o tsv)
if [ "$EXIT_CODE" == "0" ]; then
echo "Job completed successfully"
else
echo "Job failed with exit code $EXIT_CODE"
az container logs --name $CONTAINER_NAME --resource-group $RG
fi
# Clean up container
az container delete --name $CONTAINER_NAME --resource-group $RG --yes
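A pipeline that waits for a container to finish usually wants an upper bound on the wait, so a stuck job does not block the run forever. The following is a minimal sketch of a generic polling helper; the names `wait_until` and `is_terminated` are hypothetical, and the commented usage assumes the `CONTAINER_NAME` and `RG` variables from the script above.

```shell
# Hypothetical helper: re-run a command until it succeeds or a timeout elapses.
# Arguments: timeout in seconds, polling interval in seconds, then the command.
# Returns 0 as soon as the command succeeds, 1 if the timeout is reached first.
wait_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  while ! "$@"; do
    [ "$elapsed" -ge "$timeout" ] && return 1
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 0
}

# Example use (requires a logged-in Azure CLI session):
# is_terminated() {
#   [ "$(az container show --name "$CONTAINER_NAME" --resource-group "$RG" \
#        --query "containers[0].instanceView.currentState.state" -o tsv)" = "Terminated" ]
# }
# wait_until 1800 10 is_terminated || echo "job did not finish within 30 minutes"
```

Separating the polling loop from the condition keeps the timeout logic testable without any Azure resources.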
13. Final Summary
Essential points:
- ACI runs containers without VM or cluster management. It's ideal for short-duration tasks, batch jobs, and development.
- The deployment unit is the Container Group, which can contain multiple containers sharing network and volumes.
- Three restart policies: Always (services), Never (batch tasks), OnFailure (retry on failure).
- Three network modes: Public (public IP), Private/VNet (network isolation), None (no IP address).
Critical differences:
- ACI vs. AKS: ACI is for simple containers, single executions, or development. AKS is for production microservices, multiple replicas, and complex orchestration.
- Restart policy Never vs. Always: Never is for jobs that run once; Always is for continuous services. Using Always on a batch job creates an infinite loop.
- Ephemeral storage vs. Azure Files Volume: data in container filesystem is lost when container stops. Data in Azure Files volume persists.
- Public IP vs. VNet mode: containers in VNet cannot have public IP; choose one or the other.
What needs to be remembered:
- ACI charges for allocated vCPU and GB of memory, per second of execution.
- Container groups in the Stopped state don't incur CPU/memory charges, but the container group still exists (may incur associated storage costs).
- For private ACR images, use Managed Identity with the AcrPull role.
- The subnet for ACI in VNet needs the delegation Microsoft.ContainerInstance/containerGroups.
- In automated pipelines, wait until the container reaches the Terminated state (for example, by polling az container show) before reading its exit code.
- Logs are lost when the container group is deleted. Configure Log Analytics for retention.