Airbyte is a fast-growing ELT tool that helps acquire data from multiple sources and is particularly useful for building data lakes. Airbyte offers pre-built connectors for over 300 sources and dozens of destinations, and also lets you build custom connectors quickly using its language SDKs.
Airbyte recently released OpenTelemetry-based metrics; however, the documentation has been spotty and incomplete. You can check it out here. In this blog, I document what I learned while integrating open-source Airbyte running on GKE with Grafana, using GCP’s managed Prometheus service. The available metrics can be seen here.
Airbyte to Grafana – Via Open Telemetry & Prometheus
The design looks as follows.

Implementation
Step 1 – Deploy Airbyte
Install Airbyte on Kubernetes. This is pretty straightforward. Follow the instructions on this page.
git clone https://github.com/airbytehq/airbyte.git
cd airbyte
kubectl apply -k kube/overlays/stable
You can customize the namespace to fit your needs. Add the following two variables to the worker's .env file (they translate into the ConfigMap and, eventually, environment variables on the pod).
METRIC_CLIENT=otel
OTEL_COLLECTOR_ENDPOINT=http://otel-collector:4317
We will build the OpenTelemetry collector pod in a subsequent step. Note that if you set PUBLISH_METRICS=true, the worker currently looks for Datadog configuration.
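For reference, once the .env is applied, the two entries typically surface on the worker pod roughly like this. This is only a sketch: it assumes the stock kustomize overlay, where the .env file is rendered into an airbyte-env ConfigMap; verify the ConfigMap name and keys against your own deployment.
env:
  # Assumed wiring: values come from the ConfigMap generated from .env
  - name: METRIC_CLIENT
    valueFrom:
      configMapKeyRef:
        name: airbyte-env
        key: METRIC_CLIENT
  - name: OTEL_COLLECTOR_ENDPOINT
    valueFrom:
      configMapKeyRef:
        name: airbyte-env
        key: OTEL_COLLECTOR_ENDPOINT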
Step 2 – Deploy Metric-reporter
The metrics reporter queries metrics from the database in batches and pushes them to the OpenTelemetry collector. Use the following YAML as an example.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: airbyte-metrics
  namespace: airbyte-dev
  labels:
    app: airbyte-metrics
spec:
  replicas: 1
  selector:
    matchLabels:
      app: airbyte-metrics
  template:
    metadata:
      labels:
        app: airbyte-metrics
    spec:
      serviceAccountName: airbyte-admin
      automountServiceAccountToken: true
      containers:
        - name: metrics
          image: airbyte/metrics-reporter:0.39.31-alpha
          env:
            - name: METRIC_CLIENT
              value: "otel"
            - name: OTEL_COLLECTOR_ENDPOINT
              value: "otel-collector:4317"
            - name: PUBLISH_METRICS
              value: "true"
Note: The metrics reporter needs access to the Airbyte database. Copy all of the Airbyte worker's configuration (its env: section) in addition to the key-value pairs above, as sketched below.
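For illustration, the extra database entries usually look something like the following. This is a sketch only: the exact ConfigMap/Secret names and keys (assumed here to be airbyte-env and airbyte-secrets with DATABASE_* keys) must be copied from your worker's env: section.
            # Assumed names; copy the real ones from the worker deployment
            - name: DATABASE_URL
              valueFrom:
                configMapKeyRef:
                  name: airbyte-env
                  key: DATABASE_URL
            - name: DATABASE_USER
              valueFrom:
                configMapKeyRef:
                  name: airbyte-env
                  key: DATABASE_USER
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: airbyte-secrets
                  key: DATABASE_PASSWORD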
Step 3: Create Open Telemetry Collector
The OpenTelemetry collector receives metrics from the metrics reporter and writes them to Prometheus. It is a fairly well-documented, standard implementation.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-conf
  namespace: airbyte-dev
  labels:
    app: opentelemetry
    component: otel-collector-conf
data:
  otel-collector-config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      batch:
      memory_limiter:
        limit_mib: 1500
        spike_limit_mib: 512
        check_interval: 5s
    extensions:
      zpages: {}
      memory_ballast:
        size_mib: 683
    exporters:
      logging:
        loglevel: debug
      prometheusremotewrite:
        endpoint: "http://prometheus-test.airbyte-dev.svc:9090/api/v1/write"
    service:
      extensions: [zpages, memory_ballast]
      pipelines:
        metrics:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [logging, prometheusremotewrite]
---
apiVersion: v1
kind: Service
metadata:
  name: otel-collector
  namespace: airbyte-dev
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  ports:
    - name: otlp-grpc # Default endpoint for OpenTelemetry gRPC receiver.
      port: 4317
      protocol: TCP
      targetPort: 4317
    - name: otlp-http # Default endpoint for OpenTelemetry HTTP receiver.
      port: 4318
      protocol: TCP
      targetPort: 4318
    - name: metrics # Default endpoint for querying metrics.
      port: 8888
  selector:
    component: otel-collector
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: otel-collector
  namespace: airbyte-dev
  labels:
    app: opentelemetry
    component: otel-collector
spec:
  selector:
    matchLabels:
      app: opentelemetry
      component: otel-collector
  minReadySeconds: 5
  progressDeadlineSeconds: 120
  replicas: 1 # TODO - adjust this to your own requirements
  template:
    metadata:
      labels:
        app: opentelemetry
        component: otel-collector
    spec:
      containers:
        - command:
            - "/otelcol"
            - "--config=/conf/otel-collector-config.yaml"
          image: otel/opentelemetry-collector:0.54.0
          name: otel-collector
          resources:
            limits:
              cpu: 1
              memory: 2Gi
            requests:
              cpu: 200m
              memory: 400Mi
          ports:
            - containerPort: 55679 # Default endpoint for ZPages.
            - containerPort: 4317 # Default endpoint for OpenTelemetry receiver.
            - containerPort: 14250 # Default endpoint for Jaeger gRPC receiver.
            - containerPort: 14268 # Default endpoint for Jaeger HTTP receiver.
            - containerPort: 9411 # Default endpoint for Zipkin receiver.
            - name: metrics
              protocol: TCP
              containerPort: 8888
          volumeMounts:
            - name: otel-collector-config-vol
              mountPath: /conf
      volumes:
        - configMap:
            name: otel-collector-conf
            items:
              - key: otel-collector-config
                path: otel-collector-config.yaml
          name: otel-collector-config-vol
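Because the logging exporter is set to debug, a quick way to confirm that metrics are flowing is to tail the collector logs once the metrics reporter is up (the command below assumes the deployment name and namespace used above):
kubectl logs -n airbyte-dev deployment/otel-collector --tail=50 -f
You should see metric data points being logged once the reporter starts publishing.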
Step 4: Deploy Prometheus Proxy
Though we could deploy a full-fledged Prometheus, I chose Google's managed Prometheus (GMP) service. However, managed Prometheus requires a proxy to provide the endpoint for the OpenTelemetry collector. The documentation is here. Here is my YAML for it.
---
apiVersion: v1
kind: Service
metadata:
  namespace: airbyte-dev
  name: prometheus-test
  labels:
    prometheus: test
spec:
  type: ClusterIP
  selector:
    app: prometheus
    prometheus: test
  ports:
    - name: web
      port: 9090
      targetPort: web
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: airbyte-dev
  name: prometheus-test
  labels:
    prometheus: test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      prometheus: test
  serviceName: prometheus-test
  template:
    metadata:
      labels:
        app: prometheus
        prometheus: test
    spec:
      automountServiceAccountToken: true
      nodeSelector:
        kubernetes.io/arch: amd64
        kubernetes.io/os: linux
      containers:
        - name: prometheus
          image: gke.gcr.io/prometheus-engine/prometheus:v2.28.1-gmp.7-gke.0
          args:
            - --config.file=/prometheus/config_out/config.yaml
            - --storage.tsdb.path=/prometheus/data
            - --storage.tsdb.retention.time=24h
            - --web.enable-lifecycle
            - --enable-feature=remote-write-receiver
            - --storage.tsdb.no-lockfile
            - --web.route-prefix=/
          ports:
            - name: web
              containerPort: 9090
          readinessProbe:
            httpGet:
              path: /-/ready
              port: web
              scheme: HTTP
          resources:
            requests:
              memory: 400Mi
          volumeMounts:
            - name: config-out
              mountPath: /prometheus/config_out
              readOnly: true
            - name: prometheus-db
              mountPath: /prometheus/data
        - name: config-reloader
          image: gke.gcr.io/prometheus-engine/config-reloader:v0.4.1-gke.0
          args:
            - --config-file=/prometheus/config/config.yaml
            - --config-file-output=/prometheus/config_out/config.yaml
            - --reload-url=http://localhost:9090/-/reload
            - --listen-address=:19091
          ports:
            - name: reloader-web
              containerPort: 8080
          resources:
            limits:
              cpu: 100m
              memory: 50Mi
            requests:
              cpu: 100m
              memory: 50Mi
          volumeMounts:
            - name: config
              mountPath: /prometheus/config
            - name: config-out
              mountPath: /prometheus/config_out
      terminationGracePeriodSeconds: 600
      volumes:
        - name: prometheus-db
          emptyDir: {}
        - name: config
          configMap:
            name: prometheus-test
            defaultMode: 420
        - name: config-out
          emptyDir: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: airbyte-dev
  name: prometheus-test
  labels:
    prometheus: test
data:
  config.yaml: |
    global:
      scrape_interval: 30s
    scrape_configs:
      - job_name: otel-collector
        static_configs:
          - targets: ['otel-collector.airbyte-dev.svc:8888']
There are two key points. First, in this setup OpenTelemetry writes (pushes) metrics to Prometheus rather than Prometheus pulling them, hence the --enable-feature=remote-write-receiver argument. Second, for a pull model Prometheus would need to be configured for scraping; that is not implemented here, although a scrape config for the collector is included.
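To sanity-check that remote-write data is landing in the proxy, you can port-forward the service and query the standard Prometheus HTTP API (names assume the manifests above):
kubectl port-forward -n airbyte-dev svc/prometheus-test 9090:9090
# In another terminal, list the metric names Prometheus has received:
curl 'http://localhost:9090/api/v1/label/__name__/values'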
Step 5: Install Grafana
This is fairly straightforward. Deploy the pod as below and point the data source to the GMP proxy at http://prometheus-test.airbyte-dev.svc:9090.
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
  namespace: airbyte-dev
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: airbyte-dev
  labels:
    app: grafana
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:8.4.4
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: airbyte-dev
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer
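If you prefer to configure the data source as code rather than through the UI, Grafana's file-based provisioning can do it. A minimal sketch is below; note that this ConfigMap would also need to be mounted at /etc/grafana/provisioning/datasources in the Grafana container, which the deployment above does not yet do.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: airbyte-dev
data:
  datasources.yaml: |
    apiVersion: 1
    datasources:
      - name: GMP Proxy
        type: prometheus
        access: proxy
        url: http://prometheus-test.airbyte-dev.svc:9090
        isDefault: true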
The steps may not be in strict order, but once everything is deployed, each component should discover its endpoints and function.
The metrics can now be queried either in GMP or Grafana.
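For example, a simple PromQL query against one of the Airbyte job metrics might look like the following. Metric names vary by Airbyte version, so check the available-metrics page linked earlier; num_pending_jobs is used here only as an assumed example.
# Current value of the assumed metric
num_pending_jobs
# Or, smoothed over the last hour for a Grafana panel
max_over_time(num_pending_jobs[1h])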