Apache Kafka has become a cornerstone for building robust, scalable, and reliable data streaming platforms. Confluent Kafka extends the capabilities of Apache Kafka by offering additional tools and features, making it even easier to deploy and manage Kafka clusters. Running Confluent Kafka on Kubernetes enhances its scalability and availability, leveraging Kubernetes’ orchestration capabilities. In this blog, we’ll walk through the steps to set up Confluent Kafka on Kubernetes for high availability (HA), complete with code examples.
Why Run Confluent Kafka on Kubernetes?
Deploying Kafka on Kubernetes provides:
- Scalability: Scale Kafka brokers up or down seamlessly based on workload.
- High Availability: Ensure continuous availability using Kubernetes’ pod management and self-healing capabilities.
- Ease of Management: Use Kubernetes’ declarative configuration and tooling.
- Resource Efficiency: Optimize resource utilization through containerization.
Prerequisites
Before starting, ensure you have:
- A Kubernetes cluster (Minikube, AKS, EKS, GKE, etc.)
- `kubectl` and `helm` installed and configured.
- Confluent’s Helm repository added to your setup:

```bash
helm repo add confluentinc https://packages.confluent.io/helm
helm repo update
```

- A basic understanding of Kubernetes resources (e.g., StatefulSets, ConfigMaps).
Step 1: Prepare Your Kubernetes Cluster

Ensure your cluster has sufficient resources for the Kafka setup. Allocate at least three worker nodes for HA.
Example Configuration (Using Minikube)
```bash
minikube start --nodes=4 --cpus=4 --memory=8g
```

You can verify the nodes:

```bash
kubectl get nodes
```
Step 2: Install Zookeeper
Zookeeper handles Kafka’s cluster coordination: broker membership, controller election, and metadata storage. Confluent provides a Helm chart for deploying Zookeeper.
Deploy Zookeeper
Create a values file for Zookeeper configuration (zookeeper-values.yaml):
```yaml
replicas: 3
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1000m"
```

Deploy Zookeeper using Helm:

```bash
helm install zookeeper confluentinc/cp-zookeeper --values zookeeper-values.yaml
```

Verify the deployment:

```bash
kubectl get pods -l app=cp-zookeeper
```
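As a quick sanity check, you can ask each Zookeeper node for its role; with three replicas you should see one leader and two followers. This is a sketch: the pod names assume the Helm release was named `zookeeper` (adjust to your release), and it relies on `nc` being available inside the cp-zookeeper image.

```shell
# Query each Zookeeper server's mode (leader or follower) with the "srvr"
# four-letter command; pod names depend on your Helm release name.
for i in 0 1 2; do
  kubectl exec zookeeper-cp-zookeeper-$i -- bash -c \
    "echo srvr | nc localhost 2181 | grep Mode"
done
```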
Step 3: Deploy Confluent Kafka
Kafka’s HA setup involves deploying multiple brokers. Confluent’s Helm chart simplifies this process.
Configure Kafka Brokers
Create a values file for Kafka configuration (kafka-values.yaml):
```yaml
replicas: 3
resources:
  requests:
    memory: "2Gi"
    cpu: "1000m"
  limits:
    memory: "4Gi"
    cpu: "2000m"
configurationOverrides:
  "default.replication.factor": "3"
  "min.insync.replicas": "2"
```

Note that `broker.id` should not be hard-coded: the chart derives a unique broker ID for each replica from its StatefulSet ordinal. Setting `default.replication.factor=3` with `min.insync.replicas=2` lets the cluster keep accepting writes while one broker is down.

Deploy Kafka using Helm:

```bash
helm install kafka confluentinc/cp-kafka --values kafka-values.yaml
```

Check the status:

```bash
kubectl get pods -l app=cp-kafka
```
Ensure all pods are running before proceeding.
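With the overrides above, newly created topics default to three replicas. You can also set the values explicitly when creating a topic; a sketch, assuming the Helm release was named `kafka` so the first broker pod is `kafka-cp-kafka-0` (adjust to your release):

```shell
# Create a test topic replicated across all three brokers, requiring two
# in-sync replicas for acknowledged writes.
kubectl exec kafka-cp-kafka-0 -- kafka-topics --create \
  --bootstrap-server localhost:9092 \
  --topic test \
  --partitions 3 \
  --replication-factor 3 \
  --config min.insync.replicas=2
```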
Step 4: Expose Kafka Services
To allow external access to Kafka, create a LoadBalancer service or use NodePorts.
Example NodePort Service
Create a service manifest to expose Kafka (kafka-service.yaml):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: cp-kafka
spec:
  type: NodePort
  ports:
    - port: 9092
      nodePort: 30092
      name: kafka
  selector:
    app: cp-kafka
```

Apply the configuration:

```bash
kubectl apply -f kafka-service.yaml
```

Verify the service:

```bash
kubectl get svc kafka-service
```
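Keep in mind that a NodePort alone is often not enough for external clients: Kafka clients bootstrap through the service but then connect to whatever addresses the brokers advertise, so `advertised.listeners` must resolve to something reachable from outside the cluster. A quick connectivity check from the host, assuming Minikube and a local Kafka CLI install:

```shell
# Produce a message through the NodePort; this works only if the brokers
# advertise an externally reachable listener.
echo "external hello" | kafka-console-producer \
  --bootstrap-server $(minikube ip):30092 --topic test
```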
Step 5: Configure Monitoring and Logging
Monitoring Kafka is critical for maintaining HA. Confluent provides metrics exporters compatible with Prometheus and Grafana.
Deploy Prometheus and Grafana
Install Prometheus (adding the community chart repository first):

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
```

Install Grafana:

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana
```
Set Up Kafka Metrics Exporter
Add a Kafka exporter to scrape metrics. In Confluent’s cp-kafka chart the JMX exporter sidecar is controlled under the `prometheus` key; update your Kafka Helm values:

```yaml
prometheus:
  jmx:
    enabled: true
    port: 5556
```

Redeploy Kafka with the updated values:

```bash
helm upgrade kafka confluentinc/cp-kafka --values kafka-values.yaml
```
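If your Prometheus instance does not auto-discover the exporter, you can add a scrape job for it. The sketch below assumes the JMX exporter listens on port 5556 (the conventional default in Confluent’s charts — verify against your deployed values) and targets pods labeled `app=cp-kafka`:

```yaml
# Example entry for the prometheus chart's extraScrapeConfigs: scrape the
# JMX exporter sidecar on every cp-kafka pod. Port 5556 is an assumption.
- job_name: kafka-jmx
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_label_app]
      action: keep
      regex: cp-kafka
    - source_labels: [__address__]
      action: replace
      regex: ([^:]+)(?::\d+)?
      replacement: $1:5556
      target_label: __address__
```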
Step 6: Test the Kafka Cluster
Deploy a Test Producer and Consumer
Create a simple Kafka producer deployment (producer.yaml):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-producer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-producer
  template:
    metadata:
      labels:
        app: kafka-producer
    spec:
      containers:
        - name: kafka-producer
          image: confluentinc/cp-kafka:latest
          command: ["bash", "-c"]
          # Send one message, then sleep so the Deployment doesn't restart
          # the pod and resend it in a loop.
          args: ["echo 'Hello, Kafka!' | kafka-console-producer --bootstrap-server kafka-service:9092 --topic test && sleep infinity"]
```

Apply the deployment:

```bash
kubectl apply -f producer.yaml
```

Verify the producer logs:

```bash
kubectl logs -l app=kafka-producer
```
Similarly, create and deploy a Kafka consumer to validate message consumption.
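A matching consumer can be a near-mirror of the producer; here is a sketch (consumer.yaml) that tails the `test` topic, assuming the same `kafka-service` name:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-consumer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-consumer
  template:
    metadata:
      labels:
        app: kafka-consumer
    spec:
      containers:
        - name: kafka-consumer
          image: confluentinc/cp-kafka:latest
          command: ["bash", "-c"]
          # Tail the topic from the beginning; the console consumer keeps
          # the pod running.
          args: ["kafka-console-consumer --bootstrap-server kafka-service:9092 --topic test --from-beginning"]
```

After applying it, `kubectl logs -l app=kafka-consumer` should show the produced messages.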
Step 7: Scale and Test High Availability
To test HA, simulate node failures:
- Delete one Kafka broker pod:

```bash
kubectl delete pod <kafka-pod-name>
```

- Verify that the remaining brokers continue serving producers and consumers.
Monitor the cluster’s self-healing capabilities as Kubernetes spins up a new pod to replace the failed one.
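With `default.replication.factor=3` and `min.insync.replicas=2`, the cluster tolerates one broker outage without refusing writes. You can watch leadership and in-sync replicas change during the failure; the pod name below assumes a Helm release named `kafka`:

```shell
# Describe the topic while a broker is down: the ISR list should shrink
# to two replicas, then return to three once the replacement pod rejoins.
kubectl exec kafka-cp-kafka-0 -- kafka-topics --describe \
  --bootstrap-server localhost:9092 --topic test
```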
Conclusion
Deploying Confluent Kafka on Kubernetes for high availability enables a resilient, scalable, and manageable setup. Kubernetes’ self-healing and orchestration capabilities complement Kafka’s distributed nature, ensuring robust data streaming even under adverse conditions. With Confluent’s Helm charts and additional tools like Prometheus and Grafana, managing a production-ready Kafka cluster becomes straightforward.
Leverage this setup to build reliable real-time data pipelines and scale effortlessly as your needs grow. That’s it for now. I hope this article gave you some useful insights on the topic. Please feel free to drop a comment, question or suggestion.