Apache Kafka has become a cornerstone for building robust, scalable, and reliable data streaming platforms. Confluent Kafka extends the capabilities of Apache Kafka by offering additional tools and features, making it even easier to deploy and manage Kafka clusters. Running Confluent Kafka on Kubernetes enhances its scalability and availability, leveraging Kubernetes’ orchestration capabilities. In this blog, we’ll walk through the steps to set up Confluent Kafka on Kubernetes for high availability (HA), complete with code examples.
Why Run Confluent Kafka on Kubernetes?
Deploying Kafka on Kubernetes provides:
- Scalability: Scale Kafka brokers up or down seamlessly based on workload.
- High Availability: Ensure continuous availability using Kubernetes’ pod management and self-healing capabilities.
- Ease of Management: Use Kubernetes’ declarative configuration and tooling.
- Resource Efficiency: Optimize resource utilization through containerization.
Prerequisites
Before starting, ensure you have:
- A Kubernetes cluster (Minikube, AKS, EKS, GKE, etc.)
- `kubectl` and `helm` installed and configured.
- Confluent’s Helm repository added to your setup:

```bash
helm repo add confluentinc https://packages.confluent.io/helm
helm repo update
```

- A basic understanding of Kubernetes resources (e.g., StatefulSets, ConfigMaps).
Step 1: Prepare Your Kubernetes Cluster

Ensure your cluster has sufficient resources for the Kafka setup. Allocate at least three worker nodes for HA.
Example Configuration (Using Minikube)
```bash
minikube start --nodes=4 --cpus=4 --memory=8g
```

You can verify the nodes:

```bash
kubectl get nodes
```
Step 2: Install Zookeeper
Zookeeper handles Kafka’s cluster coordination: broker membership, controller election, and metadata storage. Confluent provides a Helm chart for deploying Zookeeper.
Deploy Zookeeper
Create a values file for Zookeeper configuration (zookeeper-values.yaml):
```yaml
replicas: 3
resources:
  requests:
    memory: "1Gi"
    cpu: "500m"
  limits:
    memory: "2Gi"
    cpu: "1000m"
```

Deploy Zookeeper using Helm:

```bash
helm install zookeeper confluentinc/cp-zookeeper --values zookeeper-values.yaml
```

Verify the deployment:

```bash
kubectl get pods -l app=cp-zookeeper
```
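As a quick sanity check, you can ask each Zookeeper node for its role; with three replicas you should see one leader and two followers. This is a sketch: the pod names assume the Helm release was named `zookeeper` (adjust to your release), and it relies on `nc` being available inside the cp-zookeeper image.

```shell
# Query each Zookeeper server's mode (leader or follower) with the "srvr"
# four-letter command; pod names depend on your Helm release name.
for i in 0 1 2; do
  kubectl exec zookeeper-cp-zookeeper-$i -- bash -c \
    "echo srvr | nc localhost 2181 | grep Mode"
done
```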
Step 3: Deploy Confluent Kafka
Kafka’s HA setup involves deploying multiple brokers. Confluent’s Helm chart simplifies this process.
Configure Kafka Brokers
Create a values file for Kafka configuration (kafka-values.yaml):
```yaml
replicas: 3
resources:
  requests:
    memory: "2Gi"
    cpu: "1000m"
  limits:
    memory: "4Gi"
    cpu: "2000m"
configurationOverrides:
  "default.replication.factor": "3"
  "min.insync.replicas": "2"
```

Note that `broker.id` should not be hard-coded: the chart derives a unique broker ID for each replica from its StatefulSet ordinal. Setting `default.replication.factor=3` with `min.insync.replicas=2` lets the cluster keep accepting writes while one broker is down.

Deploy Kafka using Helm:

```bash
helm install kafka confluentinc/cp-kafka --values kafka-values.yaml
```

Check the status:

```bash
kubectl get pods -l app=cp-kafka
```
Ensure all pods are running before proceeding.
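With the overrides above, newly created topics default to three replicas. You can also set the values explicitly when creating a topic; a sketch, assuming the Helm release was named `kafka` so the first broker pod is `kafka-cp-kafka-0` (adjust to your release):

```shell
# Create a test topic replicated across all three brokers, requiring two
# in-sync replicas for acknowledged writes.
kubectl exec kafka-cp-kafka-0 -- kafka-topics --create \
  --bootstrap-server localhost:9092 \
  --topic test \
  --partitions 3 \
  --replication-factor 3 \
  --config min.insync.replicas=2
```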
Step 4: Expose Kafka Services
To allow external access to Kafka, create a LoadBalancer service or use NodePorts.
Example NodePort Service
Create a service manifest to expose Kafka (kafka-service.yaml):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: kafka-service
  labels:
    app: cp-kafka
spec:
  type: NodePort
  ports:
    - port: 9092
      nodePort: 30092
      name: kafka
  selector:
    app: cp-kafka
```

Apply the configuration:

```bash
kubectl apply -f kafka-service.yaml
```

Verify the service:

```bash
kubectl get svc kafka-service
```
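Keep in mind that a NodePort alone is often not enough for external clients: Kafka clients bootstrap through the service but then connect to whatever addresses the brokers advertise, so `advertised.listeners` must resolve to something reachable from outside the cluster. A quick connectivity check from the host, assuming Minikube and a local Kafka CLI install:

```shell
# Produce a message through the NodePort; this works only if the brokers
# advertise an externally reachable listener.
echo "external hello" | kafka-console-producer \
  --bootstrap-server $(minikube ip):30092 --topic test
```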
Step 5: Configure Monitoring and Logging
Monitoring Kafka is critical for maintaining HA. Confluent provides metrics exporters compatible with Prometheus and Grafana.
Deploy Prometheus and Grafana
Install Prometheus (adding the community chart repository first):

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install prometheus prometheus-community/prometheus
```

Install Grafana:

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install grafana grafana/grafana
```
Set Up Kafka Metrics Exporter
Add a Kafka exporter to scrape metrics. In Confluent’s cp-kafka chart the JMX exporter sidecar is controlled under the `prometheus` key; update your Kafka Helm values:

```yaml
prometheus:
  jmx:
    enabled: true
    port: 5556
```

Redeploy Kafka with the updated values:

```bash
helm upgrade kafka confluentinc/cp-kafka --values kafka-values.yaml
```
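If your Prometheus instance does not auto-discover the exporter, you can add a scrape job for it. The sketch below assumes the JMX exporter listens on port 5556 (the conventional default in Confluent’s charts — verify against your deployed values) and targets pods labeled `app=cp-kafka`:

```yaml
# Example entry for the prometheus chart's extraScrapeConfigs: scrape the
# JMX exporter sidecar on every cp-kafka pod. Port 5556 is an assumption.
- job_name: kafka-jmx
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_label_app]
      action: keep
      regex: cp-kafka
    - source_labels: [__address__]
      action: replace
      regex: ([^:]+)(?::\d+)?
      replacement: $1:5556
      target_label: __address__
```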
Step 6: Test the Kafka Cluster
Deploy a Test Producer and Consumer
Create a simple Kafka producer deployment (producer.yaml):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-producer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-producer
  template:
    metadata:
      labels:
        app: kafka-producer
    spec:
      containers:
        - name: kafka-producer
          image: confluentinc/cp-kafka:latest
          command: ["bash", "-c"]
          # Send one message, then sleep so the Deployment doesn't restart
          # the pod and resend it in a loop.
          args: ["echo 'Hello, Kafka!' | kafka-console-producer --bootstrap-server kafka-service:9092 --topic test && sleep infinity"]
```

Apply the deployment:

```bash
kubectl apply -f producer.yaml
```

Verify the producer logs:

```bash
kubectl logs -l app=kafka-producer
```
Similarly, create and deploy a Kafka consumer to validate message consumption.
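A matching consumer can be a near-mirror of the producer; here is a sketch (consumer.yaml) that tails the `test` topic, assuming the same `kafka-service` name:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kafka-consumer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kafka-consumer
  template:
    metadata:
      labels:
        app: kafka-consumer
    spec:
      containers:
        - name: kafka-consumer
          image: confluentinc/cp-kafka:latest
          command: ["bash", "-c"]
          # Tail the topic from the beginning; the console consumer keeps
          # the pod running.
          args: ["kafka-console-consumer --bootstrap-server kafka-service:9092 --topic test --from-beginning"]
```

After applying it, `kubectl logs -l app=kafka-consumer` should show the produced messages.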
Step 7: Scale and Test High Availability
To test HA, simulate node failures:
- Delete one Kafka broker pod:

```bash
kubectl delete pod <kafka-pod-name>
```

- Verify that the remaining brokers continue serving producers and consumers.
Monitor the cluster’s self-healing capabilities as Kubernetes spins up a new pod to replace the failed one.
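With `default.replication.factor=3` and `min.insync.replicas=2`, the cluster tolerates one broker outage without refusing writes. You can watch leadership and in-sync replicas change during the failure; the pod name below assumes a Helm release named `kafka`:

```shell
# Describe the topic while a broker is down: the ISR list should shrink
# to two replicas, then return to three once the replacement pod rejoins.
kubectl exec kafka-cp-kafka-0 -- kafka-topics --describe \
  --bootstrap-server localhost:9092 --topic test
```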
Conclusion
Deploying Confluent Kafka on Kubernetes for high availability enables a resilient, scalable, and manageable setup. Kubernetes’ self-healing and orchestration capabilities complement Kafka’s distributed nature, ensuring robust data streaming even under adverse conditions. With Confluent’s Helm charts and additional tools like Prometheus and Grafana, managing a production-ready Kafka cluster becomes straightforward.
Leverage this setup to build reliable real-time data pipelines and scale effortlessly as your needs grow. That’s it for now. I hope this article gave you some useful insights on the topic. Please feel free to drop a comment, question or suggestion.