NashTech Blog

Autoscaling in Azure Container Apps: KEDA Event-Driven Scale

Introduction

Modern cloud-native applications often face unpredictable workloads — sudden traffic spikes, bursty event streams, or periods of zero activity. Azure Container Apps (ACA) solves this challenge by offering built-in serverless, event-driven autoscaling — all powered by KEDA (Kubernetes-based Event Driven Autoscaler) behind the scenes.

In this deep dive, we’ll cover:

  • How autoscaling works inside ACA
  • The role of KEDA (without managing Kubernetes)
  • Types of scaling ACA supports
  • Practical YAML configuration examples
  • Real-world use cases

By the end, you’ll know how to design and implement autoscaling logic for your container apps in ACA, without worrying about Kubernetes internals.


⚙️ What is KEDA?

KEDA is an open-source component for Kubernetes that allows containers to scale dynamically based on external event triggers or metrics.

Kubernetes Event-driven Autoscaling was built to solve a common problem in Kubernetes:

How do you autoscale apps based on non-resource metrics like queue length or event counts?


KEDA Supported Trigger Types (used inside Azure Container Apps):

| Trigger Type | Example Use Cases |
|---|---|
| HTTP request concurrency | Public APIs, webhooks |
| Azure Service Bus Queue | Message queue processing |
| Azure Storage Queues | Event-driven microservices |
| Kafka / RabbitMQ | Stream processing |
| CPU / Memory | Resource-based scale out (Dedicated plan only) |

Good news:

In ACA, you don’t install or operate KEDA manually.
Microsoft manages it within the ACA control plane.


🧱 How Autoscaling Works in ACA Internally (Under the Hood)

Here’s the high-level flow:

  1. You define scale triggers in your ACA YAML, CLI, or Bicep/ARM template.
  2. ACA’s platform control plane registers these scale rules into KEDA ScaledObjects.
  3. KEDA continuously monitors the trigger source (like a Service Bus queue or HTTP request rate).
  4. When the trigger condition is met, KEDA tells ACA to increase or decrease the number of replicas (container instances).

Types of Autoscaling in Azure Container Apps

| Type | Description | ACA Plan Support |
|---|---|---|
| HTTP-based scaling | Based on concurrent HTTP requests per replica | Consumption & Dedicated |
| Event-driven scaling | Based on external trigger metrics (queues, Kafka, etc.) | Consumption & Dedicated |
| CPU/Memory-based scaling | Based on container CPU/memory usage | Dedicated plan only |
| Scheduled scaling | Not directly supported yet in ACA (but can be simulated with Jobs + Event Grid) | N/A |

1: HTTP Request-Based Autoscaling

Suppose you run a public-facing REST API with unpredictable traffic spikes.

YAML Definition:

```yaml
scale:
  minReplicas: 1
  maxReplicas: 10
  rules:
  - name: http-scale
    http:
      metadata:
        concurrentRequests: 100
```

What happens at runtime:

  • ACA will always keep at least 1 replica running.
  • When incoming concurrent HTTP requests per replica exceed 100,
    ACA will scale out horizontally up to 10 replicas.
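Under the hood, KEDA's core scaling formula is desiredReplicas = ceil(currentMetricValue / targetValue), clamped to the configured min/max bounds. A quick sketch of that arithmetic (the 450-request load here is an assumed value for illustration, not something from ACA):

```shell
# KEDA computes desired replicas as ceil(currentMetric / targetValue).
current=450   # assumed total concurrent HTTP requests across the app
target=100    # concurrentRequests from the scale rule above
min=1
max=10

# Integer ceiling division: (450 + 99) / 100 = 5
replicas=$(( (current + target - 1) / target ))

# Clamp to the configured replica bounds
if [ "$replicas" -lt "$min" ]; then replicas=$min; fi
if [ "$replicas" -gt "$max" ]; then replicas=$max; fi
echo "$replicas"   # prints 5
```

So 450 concurrent requests against a target of 100 per replica yields 5 replicas, well inside the 1 to 10 window.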

CLI Deployment Example:

```shell
az containerapp create \
  --name my-http-api \
  --resource-group my-rg \
  --environment my-aca-env \
  --image myregistry.azurecr.io/myapi:latest \
  --scale-rule-name http-scale \
  --scale-rule-type http \
  --scale-rule-metadata concurrentRequests=100 \
  --min-replicas 1 \
  --max-replicas 10
```

2: Event-Driven Scaling (Service Bus Queue Trigger)

Imagine you’re processing messages from an Azure Service Bus Queue, and you want to scale out whenever there are more than 50 unprocessed messages.

YAML Definition:

```yaml
scale:
  minReplicas: 0
  maxReplicas: 20
  rules:
  - name: sb-queue-scaler
    custom:
      type: azure-servicebus
      metadata:
        queueName: orders-queue
        namespace: myservicebusns
        messageCount: "50"
      auth:
      # assumes a container app secret holding the Service Bus connection string
      - secretRef: sb-connection-string
        triggerParameter: connection
```

CLI Deployment Example:

```shell
# Assumes a secret named sb-connection-string (holding the Service Bus
# connection string) is added to the app via --secrets.
az containerapp create \
  --name sb-consumer-app \
  --resource-group my-rg \
  --environment my-aca-env \
  --image myregistry.azurecr.io/queueprocessor:latest \
  --scale-rule-name sb-queue-scaler \
  --scale-rule-type azure-servicebus \
  --scale-rule-metadata queueName=orders-queue namespace=myservicebusns messageCount=50 \
  --scale-rule-auth connection=sb-connection-string \
  --min-replicas 0 \
  --max-replicas 20
```

Result:

ACA scales out additional replicas as the backlog grows (roughly one replica per 50 pending messages, up to 20) and scales back down to zero once the queue is empty.
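The same target-value math applies to queue depth, with one difference: because minReplicas is 0, an empty queue drives the app all the way to zero. A sketch with an assumed backlog of 500 messages:

```shell
# ceil(queueDepth / messageCount), with scale-to-zero when the queue is empty.
messages=500   # assumed backlog in orders-queue (illustrative)
target=50      # messageCount from the scale rule
min=0
max=20

if [ "$messages" -eq 0 ]; then
  replicas=$min   # no backlog: scale to zero, no charges
else
  replicas=$(( (messages + target - 1) / target ))
  if [ "$replicas" -gt "$max" ]; then replicas=$max; fi
fi
echo "$replicas"   # prints 10
```

With 500 pending messages and a target of 50 per replica, ACA runs 10 consumers in parallel; at 0 messages it runs none.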


⚡ Scale-to-Zero: A Cost-Saving Superpower

For event-driven apps, ACA supports automatic scale-to-zero, meaning you pay nothing when there’s no load.

| Feature | Benefit |
|---|---|
| No active replicas | No CPU/memory charges |
| Trigger-based activation | App "wakes up" automatically when a new event arrives |

Example Use Case:
A Service Bus message handler that only runs during business hours when messages arrive.


🎯 Concurrency Management: Tuning HTTP Performance

For HTTP apps, concurrency per replica is a key performance tuning parameter.

| Parameter | What it Means |
|---|---|
| concurrentRequests | How many simultaneous HTTP requests a single replica should handle before ACA triggers scale out |

Example:
If your container app can handle 50 concurrent requests efficiently, set:

```yaml
scale:
  rules:
  - name: http-scale
    http:
      metadata:
        concurrentRequests: 50
```
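Getting this number right matters because desired replicas are capped at maxReplicas: if the target is set lower than a replica can actually handle, you hit the cap early and each replica absorbs the overflow. A sketch with assumed numbers (2000 peak concurrent requests and a hypothetical maxReplicas of 10, neither from the example above):

```shell
# When demand outruns maxReplicas, the excess load lands on each replica.
current=2000   # assumed peak concurrent requests
target=50      # concurrentRequests from the rule above
max=10         # hypothetical maxReplicas for this app

desired=$(( (current + target - 1) / target ))   # 40 replicas wanted
actual=$(( desired > max ? max : desired ))      # capped at 10
per_replica=$(( current / actual ))              # 200 requests per replica
echo "$desired $actual $per_replica"             # prints "40 10 200"
```

At the cap, each replica is serving 200 concurrent requests, four times the tuned target, which is why load-testing your real per-replica concurrency limit is listed among the best practices below.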

What ACA Does Behind the Scenes with KEDA

Though users never interact with raw KEDA objects, here’s what Azure is doing internally for each scaling rule:

| ACA Config Element | Backend KEDA Resource |
|---|---|
| HTTP scale rule | KEDA HTTP ScaledObject |
| Event-driven rule | KEDA ScaledObject |
| Job triggers (for ACA Jobs) | KEDA ScaledJob |
| Metrics polling | KEDA Metrics Adapter |

Best Practices for ACA Autoscaling

| Best Practice | Why |
|---|---|
| Always set minReplicas for critical apps | Avoids cold starts for production APIs |
| Test your app's concurrency limit under load | Prevents overloading a single replica |
| Use event-driven scale rules for background processing | Saves cost and improves efficiency |
| Monitor scaling behavior in Azure Monitor | Helps optimize trigger thresholds |

Real-World Use Case Scenario: Order Processing System

Imagine an e-commerce app with:

| Component | ACA App | Scaling Trigger |
|---|---|---|
| Public API | API Gateway app | HTTP concurrency |
| Order Queue Consumer | Worker app | Service Bus queue length |
| Notification Sender | Email/SMS app | Service Bus topic subscription |

Using ACA and KEDA-backed scaling, each component scales independently based on its trigger type.

Conclusion

Autoscaling in Azure Container Apps is powerful, flexible, and requires almost zero infrastructure management. By leveraging KEDA under the hood, ACA offers the best of both worlds: Kubernetes-grade event-driven scaling with serverless simplicity.

Gaurav Shukla

Gaurav Shukla is a Software Consultant specializing in DevOps at NashTech, with over two years of hands-on experience in the field. Passionate about streamlining development pipelines and optimizing cloud infrastructure, he has worked extensively on Azure migration projects, Kubernetes orchestration, and CI/CD implementations. His proficiency in tools like Jenkins, Azure DevOps, and Terraform helps him deliver efficient, reliable software development workflows.
