Theoretical Foundation: Provision a container by using Azure Container Instances


1. Initial Intuition​

Imagine you need to execute a specific task: process a file, send a report, or run a maintenance script. For this, you would normally need to provision a VM, configure the operating system, install dependencies, execute the task, and then shut down the VM. This entire process takes time and involves infrastructure management that has no relation to the task itself.

Azure Container Instances (ACI) eliminates this overhead. You tell Azure: "run this container with this image, with 1 CPU and 1.5 GB of RAM, please". Azure takes care of all the underlying infrastructure. In seconds, your container is running. When it finishes, you only pay for the time the container was active.

It's like renting a hotel room for one night instead of buying an apartment: no property management, no long-term contract, and you pay only for the time you use.


2. Context​

ACI occupies a specific position in the Azure compute ecosystem:


ACI is not a replacement for AKS in complex production applications, nor for App Service in long-running web applications. It fills the niche of simple executions, short-duration tasks, development/test environments, and batch processes.

ACI also has a special relationship with AKS through Virtual Kubelet: AKS can use ACI as a "virtual node" to scale load bursts without needing to provision additional physical nodes.


3. Building the Concepts​

3.1 Container Group: The Fundamental Unit​

In ACI, the deployment unit is the Container Group, not the individual container. A container group is a set of containers that share the same host, the same network, and the same volumes. It's the equivalent of a Pod in Kubernetes.


For simple cases (a single container), you work with a container group of a single container, but the concept exists to support sidecar patterns where an auxiliary container complements the primary one.

3.2 OS Types: Linux and Windows​

ACI supports both Linux and Windows containers. Most container workloads use Linux. Windows containers are supported but with some limitations:

| Aspect | Linux | Windows |
| --- | --- | --- |
| Available sizes | More options | Smaller subset |
| Cost | Lower | Higher |
| Base images | Alpine, Ubuntu, Debian | Windows Server Core, Nano Server |
| Use case | Most modern applications | Legacy .NET Framework applications |

3.3 Network Types in ACI​

ACI has three network configuration options:

| Type | Description | Access |
| --- | --- | --- |
| Public | Public IP assigned to the container group | Accessible from the internet |
| Private (VNet) | Container group injected into a VNet subnet | Only within the VNet and connected networks |
| None | No IP address assigned to the container group | Not reachable over the network (isolated processing) |

The VNet (Private) mode is implemented through subnet delegation: the subnet must have the Microsoft.ContainerInstance/containerGroups delegation configured to receive ACI containers.
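
The delegation requirement can be checked before deploying. The sketch below is only an illustration: it assumes the subnet is available as a dictionary whose `delegations` list carries a `serviceName` field, which is how the JSON output of `az network vnet subnet show` is assumed to look here. The function name `is_delegated_to_aci` is hypothetical.

```python
import json

# Delegation name taken from the text above.
ACI_DELEGATION = "Microsoft.ContainerInstance/containerGroups"


def is_delegated_to_aci(subnet: dict) -> bool:
    """Return True if any delegation on the subnet names the ACI service.

    Assumption: `subnet` has an optional "delegations" list whose items
    expose a "serviceName" key.
    """
    delegations = subnet.get("delegations") or []
    return any(d.get("serviceName") == ACI_DELEGATION for d in delegations)


if __name__ == "__main__":
    # Hand-written sample standing in for real CLI output.
    sample = json.loads(
        '{"name": "subnet-aci", "delegations": '
        '[{"name": "aci", "serviceName": '
        '"Microsoft.ContainerInstance/containerGroups"}]}'
    )
    print(is_delegated_to_aci(sample))
```

Verify the field names against real CLI output before relying on a check like this in a pipeline.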

3.4 Restart Policy​

Defines behavior when the container's main process terminates:

| Policy | Behavior | When to use |
| --- | --- | --- |
| Always | Restarts whenever the container stops | Long-running services |
| Never | Does not restart (terminates after completion) | One-time batch tasks |
| OnFailure | Restarts only on failure (exit code != 0) | Tasks that may fail and need a retry |
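
The three policies can be read as a small decision function. The sketch below only restates the table in code; it is not how ACI is implemented, and the name `should_restart` is hypothetical.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Return True if a container with this policy is restarted after exiting."""
    if policy == "Always":
        return True
    if policy == "Never":
        return False
    if policy == "OnFailure":
        # Only a non-zero exit code counts as a failure.
        return exit_code != 0
    raise ValueError(f"unknown restart policy: {policy!r}")


if __name__ == "__main__":
    for policy in ("Always", "Never", "OnFailure"):
        for code in (0, 1):
            print(f"{policy:<10} exit={code} restart={should_restart(policy, code)}")
```

Note that with Always, even a successful exit (code 0) leads to a restart, which is why that policy suits services rather than batch jobs.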

3.5 Resources: CPU and Memory​

ACI charges per vCPU and GB of memory per second of execution. The limits:

  • CPU: from 0.1 to 4 vCPU per container (maximum 4 vCPU per container group)
  • Memory: from 0.1 to 16 GB per container group (proportional to CPU)
  • GPU: available in some sizes (K80, P100, V100) for ML workloads

The CPU/memory relationship has restrictions: it's not possible to allocate extreme proportions (lots of CPU with little memory or vice versa).
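
Because billing is per allocated vCPU and GB per second, the cost of a run can be estimated with simple arithmetic. The sketch below assumes a linear model; the rates used in the demo are placeholders, not real Azure prices, and the function name `estimate_cost` is hypothetical.

```python
def estimate_cost(
    vcpu: float,
    memory_gb: float,
    seconds: float,
    vcpu_rate_per_s: float,
    gb_rate_per_s: float,
) -> float:
    """Estimate the cost of one run from allocated resources and duration."""
    if vcpu <= 0 or memory_gb <= 0 or seconds < 0:
        raise ValueError("vcpu and memory_gb must be positive, seconds non-negative")
    per_second = vcpu * vcpu_rate_per_s + memory_gb * gb_rate_per_s
    return seconds * per_second


if __name__ == "__main__":
    # Placeholder rates for illustration only; look up current pricing.
    VCPU_RATE = 0.00001
    GB_RATE = 0.000001
    # 1 vCPU and 1.5 GB running for 10 minutes.
    print(estimate_cost(1, 1.5, 600, VCPU_RATE, GB_RATE))
```

The inputs are the allocated values, not measured usage, so over-allocating CPU raises the estimate even if the container sits idle.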

3.6 Environment Variables and Secrets​

Containers receive configuration via environment variables. For sensitive information (passwords, API keys), ACI supports secure environment variables: the value is sent securely and doesn't appear in logs or CLI/portal output.

# Common variable (visible in logs)
--environment-variables ENVIRONMENT=production

# Secure variable (value doesn't appear in logs)
--secure-environment-variables DB_PASSWORD=mypassword
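
Inside the container, both kinds of variable are read the same way, as ordinary environment variables; the difference lies in what Azure displays. The sketch below shows one way application code might load them and avoid echoing the secret itself. The names `load_config`, `redact`, and `SECRET_KEYS` are hypothetical.

```python
import os
from typing import Mapping

# Assumption: variable names the application treats as sensitive.
SECRET_KEYS = {"DB_PASSWORD"}
REQUIRED_KEYS = ("ENVIRONMENT", "DB_PASSWORD")


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Read required settings from the environment, failing fast if any is missing."""
    missing = [key for key in REQUIRED_KEYS if key not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}


def redact(config: Mapping[str, str]) -> dict:
    """Return a copy that is safe to log, with secret values masked."""
    return {k: ("***" if k in SECRET_KEYS else v) for k, v in config.items()}


if __name__ == "__main__":
    demo_env = {"ENVIRONMENT": "production", "DB_PASSWORD": "mypassword"}
    print(redact(load_config(demo_env)))
```

Secure variables keep the value out of the container group's displayed properties, but the application can still leak it by printing it, which is why the sketch masks it before logging.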

3.7 Volumes and Persistence​

By default, an ACI container's storage is ephemeral: data written inside the container is lost when it stops. For persistence, ACI supports:

| Volume type | Protocol | When to use |
| --- | --- | --- |
| Azure Files | SMB | Sharing between containers and persistence |
| gitRepo (deprecated) | - | Git repository clone |
| emptyDir | - | Temporary sharing between containers in the same group |
| Secret | - | Mount secrets as files |

The Azure Files volume is mounted as an SMB share inside the container, allowing data to persist beyond the container's lifecycle.
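
From the application's point of view, a mounted share is just a directory. The sketch below shows a job writing its result under the mount path and failing loudly if the mount is missing, so results are never written silently to ephemeral storage. The function name `write_result` is hypothetical, and the demo uses a temporary directory as a stand-in for a path such as /mnt/data.

```python
import json
import tempfile
from pathlib import Path


def write_result(mount_dir: Path, name: str, payload: dict) -> Path:
    """Write payload as JSON under mount_dir and return the final path."""
    if not mount_dir.is_dir():
        raise FileNotFoundError(f"volume mount not found: {mount_dir}")
    target = mount_dir / name
    # Write to a temporary name first, then rename, so readers are less
    # likely to pick up a half-written file.
    tmp = target.with_suffix(target.suffix + ".tmp")
    tmp.write_text(json.dumps(payload))
    tmp.replace(target)
    return target


if __name__ == "__main__":
    # Stand-in for the mounted share.
    with tempfile.TemporaryDirectory() as mount:
        path = write_result(Path(mount), "result.json", {"rows": 3})
        print(path.read_text())
```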


4. Structural View​


5. Practical Operation​

ACI Container Lifecycle​


Important and Non-Obvious Behaviors​

Image pull happens on every creation: ACI doesn't maintain an image cache between executions. Every time a container group is created, the image is downloaded again from the registry. For large images (hundreds of MB), this adds startup time. To optimize, use smaller images (Alpine) and well-organized layers.

Stopped container groups still exist as resources: when a container group is in the Stopped state, there's no charge for CPU and memory, but the group remains in the subscription and any associated storage (such as an Azure Files share) is still billed. To completely eliminate cost, delete the container group.

Managed Identity works in ACI: it's possible to assign a Managed Identity to a container group. This allows code inside the container to obtain Entra ID tokens to access other Azure services (Key Vault, Storage, etc.) without hardcoded credentials.

An FQDN exists only if you set a DNS label: when you create a container group with a public IP and a DNS name label, it becomes reachable at an FQDN in the format <dns-name-label>.<region>.azurecontainer.io. The label must be unique in the region. If you don't specify a DNS label, the container group is only accessible by IP.
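
The FQDN format can be assembled and sanity-checked in a few lines. The sketch below is an illustration only: the validation applies the generic DNS label rule (lowercase letters, digits, and hyphens, up to 63 characters, no leading or trailing hyphen), which is an assumption and may not match Azure's exact rules. The function name `build_fqdn` is hypothetical.

```python
import re

# Generic DNS label pattern; Azure's own validation may be stricter.
_LABEL_RE = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$")


def build_fqdn(dns_label: str, region: str) -> str:
    """Build <dns-name-label>.<region>.azurecontainer.io for a container group."""
    if not _LABEL_RE.match(dns_label):
        raise ValueError(f"invalid DNS label: {dns_label!r}")
    return f"{dns_label}.{region}.azurecontainer.io"


if __name__ == "__main__":
    print(build_fqdn("my-unique-nginx", "brazilsouth"))
```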


6. Implementation Methods​

6.1 Azure Portal​

When to use: exploratory creation, quick tests, status verification.

Path: Create a resource > Containers > Container Instances

The wizard has tabs:

  • Basics: image, name, OS type (Linux/Windows), region
  • Networking: network type, ports, DNS label
  • Advanced: environment variables, restart policy, custom commands, volumes
  • Tags

6.2 Azure CLI​

Create simple container with public IP:

az container create \
  --name my-container \
  --resource-group rg-containers \
  --image nginx:alpine \
  --os-type Linux \
  --cpu 0.5 \
  --memory 0.5 \
  --ip-address Public \
  --ports 80 \
  --dns-name-label my-unique-nginx \
  --restart-policy Always \
  --location brazilsouth

Create container using ACR image with Managed Identity:

# Create a user-assigned managed identity
az identity create \
  --name id-aci-production \
  --resource-group rg-containers

# Look up the registry ID, the identity's principal ID, and the identity's resource ID
ACR_ID=$(az acr show --name myacr --resource-group rg-containers --query id -o tsv)
PRINCIPAL_ID=$(az identity show --name id-aci-production --resource-group rg-containers --query principalId -o tsv)
IDENTITY_ID=$(az identity show --name id-aci-production --resource-group rg-containers --query id -o tsv)

# Grant the identity the AcrPull role on the registry
az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role AcrPull \
  --scope "$ACR_ID"

# Create the container group; --acr-identity selects the identity used for the image pull
az container create \
  --name api-container \
  --resource-group rg-containers \
  --image myacr.azurecr.io/api-gateway:v1.2.3 \
  --os-type Linux \
  --cpu 1 \
  --memory 1.5 \
  --ip-address Private \
  --vnet vnet-production \
  --subnet subnet-containers \
  --assign-identity "$IDENTITY_ID" \
  --acr-identity "$IDENTITY_ID" \
  --registry-login-server myacr.azurecr.io \
  --restart-policy OnFailure

Create container with environment variables and Azure Files volume:

# Get the storage account key
STORAGE_KEY=$(az storage account keys list \
  --account-name myaccount \
  --resource-group rg-containers \
  --query "[0].value" -o tsv)

# Create Azure Files share
az storage share create \
  --name container-data \
  --account-name myaccount \
  --account-key "$STORAGE_KEY"

# Create container with mounted volume
# (--ip-address accepts only Public or Private; omit it to create the group without an IP address)
az container create \
  --name processor \
  --resource-group rg-containers \
  --image myacr.azurecr.io/processor:latest \
  --os-type Linux \
  --cpu 2 \
  --memory 4 \
  --restart-policy Never \
  --environment-variables ENVIRONMENT=production BUCKET=results \
  --secure-environment-variables DB_PASSWORD=mypassword \
  --azure-file-volume-account-name myaccount \
  --azure-file-volume-account-key "$STORAGE_KEY" \
  --azure-file-volume-share-name container-data \
  --azure-file-volume-mount-path /mnt/data

Check status and logs:

# Container status
az container show \
  --name my-container \
  --resource-group rg-containers \
  --query "{Status:instanceView.state, IP:ipAddress.ip, FQDN:ipAddress.fqdn}" \
  --output table

# Container logs (stdout/stderr)
az container logs \
  --name my-container \
  --resource-group rg-containers \
  --follow # real-time streaming

# Access interactive shell in container (if still running)
az container exec \
  --name my-container \
  --resource-group rg-containers \
  --exec-command "/bin/sh"

# Stop and start
az container stop --name my-container --resource-group rg-containers
az container start --name my-container --resource-group rg-containers

# Delete
az container delete --name my-container --resource-group rg-containers --yes

Deploy via YAML file (complex container groups):

# container-group.yaml
apiVersion: 2021-10-01
location: brazilsouth
name: my-container-group
properties:
  containers:
  - name: main-app
    properties:
      image: myacr.azurecr.io/api:v1.0
      resources:
        requests:
          cpu: 1
          memoryInGB: 1.5
      ports:
      - port: 80
        protocol: TCP
      environmentVariables:
      - name: ENVIRONMENT
        value: production
      - name: DB_PASSWORD
        secureValue: mypassword
      volumeMounts:
      - name: data
        mountPath: /mnt/data
  - name: log-sidecar
    properties:
      image: fluent/fluent-bit:latest
      resources:
        requests:
          cpu: 0.5
          memoryInGB: 0.5
      volumeMounts:
      - name: data
        mountPath: /mnt/logs
  osType: Linux
  ipAddress:
    type: Public
    ports:
    - port: 80
      protocol: TCP
    dnsNameLabel: my-app-prod
  restartPolicy: Always
  volumes:
  - name: data
    azureFile:
      shareName: container-data
      storageAccountName: myaccount
      storageAccountKey: <key>
type: Microsoft.ContainerInstance/containerGroups

# Deploy via YAML file
az container create \
  --resource-group rg-containers \
  --file container-group.yaml

6.3 PowerShell​

# Create simple container (Az.ContainerInstance module)
$container = New-AzContainerInstanceObject `
  -Name "app-container" `
  -Image "nginx:alpine" `
  -RequestCpu 0.5 `
  -RequestMemoryInGb 0.5 `
  -Port @(New-AzContainerInstancePortObject -Port 80 -Protocol TCP)

New-AzContainerGroup `
  -ResourceGroupName "rg-containers" `
  -Name "my-container-group" `
  -Location "brazilsouth" `
  -Container @($container) `
  -OsType Linux `
  -IpAddressType Public `
  -IPAddressDnsNameLabel "my-nginx-ps" `
  -RestartPolicy Always

6.4 Bicep​

// Parameters referenced by the resource below
param location string = resourceGroup().location

@secure()
param dbPassword string

resource containerGroup 'Microsoft.ContainerInstance/containerGroups@2023-05-01' = {
  name: 'my-container-group'
  location: location
  properties: {
    containers: [
      {
        name: 'app-container'
        properties: {
          image: 'myacr.azurecr.io/api:v1.0'
          resources: {
            requests: {
              cpu: 1
              memoryInGB: json('1.5') // Bicep has no decimal literal; json() yields the number
            }
          }
          ports: [
            {
              port: 80
              protocol: 'TCP'
            }
          ]
          environmentVariables: [
            {
              name: 'ENVIRONMENT'
              value: 'production'
            }
            {
              name: 'DB_PASSWORD'
              secureValue: dbPassword // secure parameter
            }
          ]
        }
      }
    ]
    osType: 'Linux'
    ipAddress: {
      type: 'Public'
      ports: [
        {
          port: 80
          protocol: 'TCP'
        }
      ]
      dnsNameLabel: 'my-app-prod-${uniqueString(resourceGroup().id)}'
    }
    restartPolicy: 'Always'
  }
}

7. Control and Security​

Authentication in Private Registries​

To use ACR images, ACI can authenticate in three ways:

  1. Managed Identity (recommended): the container group has an identity with AcrPull role on ACR
  2. Service Principal: username = service principal ID, password = client secret
  3. Admin credentials: username = registry name, password = admin password (not recommended)

# Using Managed Identity (must specify the registry login server and the identity used for the pull)
az container create \
  --assign-identity <identity-resource-id> \
  --acr-identity <identity-resource-id> \
  --registry-login-server myacr.azurecr.io \
  ...

Containers in VNet: Network Isolation​

For workloads that need to access internal VNet resources (databases, private services), ACI in VNet mode is essential:

# Create subnet with delegation for ACI
az network vnet subnet create \
  --name subnet-aci \
  --vnet-name vnet-production \
  --resource-group rg-networking \
  --address-prefix 10.0.5.0/24 \
  --delegations Microsoft.ContainerInstance/containerGroups

# Container in VNet
az container create \
  --ip-address Private \
  --vnet vnet-production \
  --subnet subnet-aci \
  ...

Important restriction: a container group deployed into a VNet cannot also have a public IP address. The choice is one or the other.

Security Initiatives​

  • Never use common environment variables for passwords; always use --secure-environment-variables
  • For more complex secrets, use Managed Identity + Key Vault to fetch secrets at runtime
  • Production containers should be in VNet, not with direct public IP
  • Images should come from private registries (ACR), not directly from Docker Hub in production

8. Decision Making​

ACI vs. other compute services​

| Situation | Best choice | Reason |
| --- | --- | --- |
| Short batch task (< 1 hour) | ACI | Pay per second, no infrastructure management |
| Long-running web application | App Service | Better for persistent workloads, built-in scaling |
| Complex microservices | AKS | Service mesh, advanced orchestration |
| Event-driven processing | Azure Functions | Better event integration, consumption billing |
| Development/testing | ACI | Quick creation/deletion, cost-effective |
| Burst scaling from AKS | ACI + Virtual Kubelet | Elastic scaling without additional nodes |
| Windows legacy applications | VM or ACI (Windows) | Depends on complexity and dependencies |
| GPU workloads | ACI (with GPU) or VM | ACI for simple ML inference, VM for training |

Key decision factors:​

  • Duration: < 1 hour → ACI, persistent → App Service/AKS
  • Complexity: Single container → ACI, orchestration → AKS
  • Network: Internet exposure → Public ACI, internal → VNet ACI
  • State: Stateless → ACI, stateful → consider other options with persistent storage

ACI shines in scenarios where you need "compute on demand" without the overhead of managing virtual machines or complex orchestration platforms.

| Scenario | Recommended service | Reason |
| --- | --- | --- |
| Short-duration batch task | ACI with restartPolicy: Never | Pay only for execution time |
| Simple long-duration REST API | App Service Containers | More web features (SSL, custom domain, deployment slots) |
| Production microservices, multiple instances | AKS | Orchestration, auto scaling, service mesh |
| Container dev/test environment | ACI | Fast, no cluster overhead |
| Event processing (Function-like) | Azure Container Apps | Native scaling to zero, KEDA |
| Container that needs VNet resource access | ACI in VNet mode | Native VNet integration |
| Load burst in AKS | ACI via Virtual Kubelet | Fast scale out without new nodes |

When to use different restart policy?​

| Scenario | Restart Policy | Reason |
| --- | --- | --- |
| Web server, API | Always | Restarts if it crashes or fails |
| Data processing job | Never | Runs once and terminates |
| Script that may have a transient failure | OnFailure | Retries on failure, doesn't restart after success |

9. Best Practices​

Use minimal images (Alpine, Distroless): smaller images start faster (less data to download), cost less in ACR storage, and have smaller security attack surface.

Prefer Managed Identity to explicit credentials: instead of passing passwords and API keys as environment variables, configure a Managed Identity on the container group and use it to access Key Vault, Storage, and other Azure services.

For batch tasks, use restartPolicy: Never and check exit code: a script that terminates with exit code 0 indicates success. Exit code other than zero indicates failure. Use this convention so ACI knows if the task was successful.
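
The exit code convention is something the job's own entrypoint has to honor. The sketch below shows one way a Python entrypoint might map success and failure to exit codes; `run_job` and `process` are hypothetical names, and `process` stands in for the real task.

```python
import sys
from typing import Callable


def run_job(task: Callable[[], None]) -> int:
    """Run the task and translate its outcome into a process exit code."""
    try:
        task()
    except Exception as exc:
        # Report the failure on stderr so it shows up in the container logs.
        print(f"job failed: {exc}", file=sys.stderr)
        return 1
    return 0


def process() -> None:
    """Placeholder for the real batch work."""
    print("processing input")


if __name__ == "__main__":
    # The interpreter exits with 0 on success and 1 on failure.
    sys.exit(run_job(process))
```

Catching the exception at the top level keeps the mapping explicit: any unhandled error becomes a non-zero exit code that a caller can inspect afterwards.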

Define conservative CPU and memory limits: ACI charges for allocated resources, not usage. If you allocate 4 vCPU but use 0.5, you pay for 4. Size based on real application benchmarks.

Name DNS labels predictably and uniquely: the DNS label is part of the public FQDN. Use a pattern like <app>-<environment>-<unique-suffix> to avoid conflicts and facilitate identification.
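
A pattern like this can be generated deterministically, so the same inputs always give the same label. The sketch below derives the suffix from a hash of a seed string (for example, a subscription or resource group identifier); the function name `make_dns_label`, the choice of SHA-256, and the six-character suffix are all assumptions made for illustration.

```python
import hashlib


def make_dns_label(app: str, environment: str, seed: str, suffix_len: int = 6) -> str:
    """Build <app>-<environment>-<unique-suffix> with a stable, hash-based suffix."""
    suffix = hashlib.sha256(seed.encode("utf-8")).hexdigest()[:suffix_len]
    label = f"{app}-{environment}-{suffix}".lower()
    # A single DNS label cannot exceed 63 characters.
    if len(label) > 63:
        raise ValueError(f"DNS label too long ({len(label)} > 63): {label}")
    return label


if __name__ == "__main__":
    print(make_dns_label("api", "prod", "rg-containers"))
```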


10. Common Errors​

Using restart policy "Always" for batch tasks

A processing script terminates with exit code 0 (success). With Always, ACI immediately restarts. The script processes the same data over and over again, in a loop, consuming resources and potentially processing duplicate data. For batch tasks, use Never.

Not specifying DNS label and then not getting the FQDN

The container is created with public IP but without --dns-name-label. The IP changes every time the container group is recreated. The infra team needs to manually update all systems pointing to the IP. Always define a predictable DNS label for containers with public IP.

Image pull fails silently due to lack of ACR permission

The container group is created but stays in Pending state indefinitely. The "image pull failed" error is in logs but not obvious in the portal. The cause is that the container doesn't have credentials to access the private ACR. Configure the Managed Identity with AcrPull before creating the container.

Container in VNet cannot access internet for external APIs

A container in the VNet needs to call an external API. The outbound traffic from the subnet has no route to the internet (no NAT Gateway, no UDR to internet). The container gets stuck waiting for response. Containers in VNet that need internet access require explicit outbound configuration (NAT Gateway or Azure Firewall with UDR).

Storing data in the container filesystem without mounting a volume

A processing job writes results to /tmp/results. The job finishes. The container group is deleted. The results are lost. For persistence, always mount an Azure Files volume to directories where important data is written.


11. Operation and Maintenance​

Monitoring​

ACI sends metrics to Azure Monitor:

| Metric | What it measures |
| --- | --- |
| CpuUsage | vCPU used by the container group |
| MemoryUsage | Memory in use, in bytes |
| NetworkBytesReceivedPerSecond | Inbound traffic |
| NetworkBytesTransmittedPerSecond | Outbound traffic |

To access logs from a running container or after termination:

# Logs from main container
az container logs --name my-container --resource-group rg-containers

# Logs from specific container in a group with multiple containers
az container logs \
  --name my-container-group \
  --resource-group rg-containers \
  --container-name sidecar-container

Logs are retained while the container group exists. After deleting the container group, logs are lost unless they were sent to a log service (Log Analytics, Azure Monitor).

Configure Log Analytics​

# Workspace ID (customerId) and shared key of the Log Analytics workspace
WORKSPACE_ID=$(az monitor log-analytics workspace show \
  --workspace-name law-monitoring \
  --resource-group rg-monitoring \
  --query customerId -o tsv)

WORKSPACE_KEY=$(az monitor log-analytics workspace get-shared-keys \
  --workspace-name law-monitoring \
  --resource-group rg-monitoring \
  --query primarySharedKey -o tsv)

az container create \
  --name my-container \
  --resource-group rg-containers \
  --log-analytics-workspace "$WORKSPACE_ID" \
  --log-analytics-workspace-key "$WORKSPACE_KEY" \
  ...

With Log Analytics configured, container logs are automatically sent to the workspace and can be queried with KQL even after the container is deleted.

Important Limits​

| Item | Limit |
| --- | --- |
| CPUs per container group | 4 vCPU |
| Memory per container group | 16 GB |
| Containers per container group | 60 |
| Container groups per subscription per region | 100 (default, adjustable) |
| Maximum duration of a container group | No technical limit (but billing limitations) |
| Ports per container group | Multiple, but no overlap between containers |
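
These limits apply to the group as a whole, so the requests of all containers in it have to be summed. The sketch below checks a planned group against the figures in the table; the constants simply copy the table, and the names `ContainerRequest` and `check_group` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Limits copied from the table above; actual limits can vary.
MAX_CPU = 4.0
MAX_MEMORY_GB = 16.0
MAX_CONTAINERS = 60


@dataclass(frozen=True)
class ContainerRequest:
    name: str
    cpu: float
    memory_gb: float


def check_group(containers: Sequence[ContainerRequest]) -> List[str]:
    """Return a list of limit violations; an empty list means the group fits."""
    problems: List[str] = []
    if not containers:
        problems.append("a container group needs at least one container")
    if len(containers) > MAX_CONTAINERS:
        problems.append(f"{len(containers)} containers exceeds {MAX_CONTAINERS}")
    total_cpu = sum(c.cpu for c in containers)
    total_memory = sum(c.memory_gb for c in containers)
    if total_cpu > MAX_CPU:
        problems.append(f"{total_cpu} vCPU exceeds {MAX_CPU} per group")
    if total_memory > MAX_MEMORY_GB:
        problems.append(f"{total_memory} GB exceeds {MAX_MEMORY_GB} GB per group")
    return problems


if __name__ == "__main__":
    group = [
        ContainerRequest("main-app", cpu=1, memory_gb=1.5),
        ContainerRequest("log-sidecar", cpu=0.5, memory_gb=0.5),
    ]
    print(check_group(group) or "within limits")
```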

12. Integration and Automation​

ACI as Backend for Azure Logic Apps and Functions​

ACI can be created and destroyed programmatically via REST API, making it ideal for job orchestration:


Virtual Kubelet: ACI as AKS Node​

AKS can use ACI as a "virtual node" to absorb demand bursts without scaling actual cluster nodes:

# Enable Virtual Nodes addon in AKS
az aks enable-addons \
  --resource-group rg-aks \
  --name my-cluster \
  --addons virtual-node \
  --subnet-name subnet-aci

With this, Pods that tolerate the virtual-kubelet.io/provider taint and select the virtual node are scheduled on ACI instead of VM-backed nodes, allowing fast scale-out without node provisioning time.

Automation via ARM/Bicep in Pipelines​

For data processing pipelines where each run is a container:

#!/bin/bash
# Pipeline: create container, wait for completion, collect result, delete

CONTAINER_NAME="job-$(date +%Y%m%d%H%M%S)"
RG="rg-containers"

# Create container (no --ip-address: the group is created without an IP address)
az container create \
  --name "$CONTAINER_NAME" \
  --resource-group "$RG" \
  --image myacr.azurecr.io/processor:latest \
  --os-type Linux \
  --cpu 2 --memory 4 \
  --restart-policy Never \
  --environment-variables INPUT_FILE="file-$(date +%Y%m%d).csv"

# Wait for completion by polling the container state
while true; do
  STATE=$(az container show \
    --name "$CONTAINER_NAME" \
    --resource-group "$RG" \
    --query "containers[0].instanceView.currentState.state" -o tsv)
  [ "$STATE" = "Terminated" ] && break
  sleep 10
done

# Check exit code
EXIT_CODE=$(az container show \
  --name "$CONTAINER_NAME" \
  --resource-group "$RG" \
  --query "containers[0].instanceView.currentState.exitCode" -o tsv)

if [ "$EXIT_CODE" = "0" ]; then
  echo "Job completed successfully"
else
  echo "Job failed with exit code $EXIT_CODE"
  az container logs --name "$CONTAINER_NAME" --resource-group "$RG"
fi

# Clean up container
az container delete --name "$CONTAINER_NAME" --resource-group "$RG" --yes
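
The waiting step in a pipeline like this benefits from a timeout, so that a hung job does not block the run forever. The sketch below shows the polling logic in isolation: `get_state` is a hypothetical callable that returns the container's current state by whatever means the pipeline uses, and the value "Terminated" is assumed to be what that state reads when the job is done.

```python
import time
from typing import Callable


def wait_until_terminated(
    get_state: Callable[[], str],
    timeout_s: float = 3600.0,
    interval_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state until it reports "Terminated" or the timeout expires.

    Returns True if the container terminated in time, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == "Terminated":
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated sequence of states instead of a real Azure call.
    states = iter(["Waiting", "Running", "Running", "Terminated"])
    done = wait_until_terminated(lambda: next(states), timeout_s=5, interval_s=0)
    print("terminated" if done else "timed out")
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.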

13. Final Summary​

Essential points:

  • ACI runs containers without VM or cluster management. It's ideal for short-duration tasks, batch jobs, and development.
  • The deployment unit is the Container Group, which can contain multiple containers sharing network and volumes.
  • Three restart policies: Always (services), Never (batch tasks), OnFailure (retry on failure).
  • Three network modes: Public (public IP), Private/VNet (network isolation), None (no network).

Critical differences:

  • ACI vs. AKS: ACI is for simple containers, single executions, or development. AKS is for production microservices, multiple replicas, and complex orchestration.
  • Restart policy Never vs. Always: Never is for jobs that run once; Always is for continuous services. Using Always on a batch job creates an infinite loop.
  • Ephemeral storage vs. Azure Files Volume: data in container filesystem is lost when container stops. Data in Azure Files volume persists.
  • Public IP vs. VNet mode: containers in VNet cannot have public IP; choose one or the other.

What needs to be remembered:

  • ACI charges for allocated vCPU and GB of memory, per second of execution.
  • Container groups in the Stopped state don't incur CPU/memory charges, but the container group still exists (associated storage may still be billed).
  • For private ACR images, use Managed Identity with AcrPull role.
  • The subnet for ACI in VNet needs delegation Microsoft.ContainerInstance/containerGroups.
  • In automated pipelines, poll az container show (for example, the containers[0].instanceView.currentState.state value) until the container terminates before reading its exit code.
  • Logs are lost when the container group is deleted. Configure Log Analytics for retention.