OpenTelemetry (OTel) provides a flexible, powerful observability framework that enables developers to track application performance across distributed systems. This guide walks you through setting up OpenTelemetry in Python, capturing metrics and traces, and storing them in Jaeger and Prometheus for centralized monitoring and analysis. By the end, you’ll understand the essentials of setting up, configuring, and using OTel for visibility into your Python applications.

What is OpenTelemetry?
OpenTelemetry is an open-source observability framework designed to collect and export application telemetry data (logs, metrics, and traces). It enables developers to understand application performance, monitor distributed systems, and debug issues faster. The OTel framework comprises instrumentation libraries, APIs, and exporters to support various backends. Here, we’ll focus on collecting traces and metrics using Python and exporting data to Jaeger and Prometheus.
Why Use OpenTelemetry in Python?
Python applications often run across distributed environments, making it crucial to track execution flow, latency, and resource usage. OpenTelemetry offers a standardized way to observe these aspects, helping developers find bottlenecks and improve application performance. By integrating OpenTelemetry, you can gain granular insights into each request and response cycle, visualize application performance, and proactively monitor issues.
Step 1: Setting Up OpenTelemetry in Python
To start, install the necessary OpenTelemetry libraries for Python, along with exporters for Jaeger and Prometheus. These libraries include essential OTel components such as the Tracer for generating traces and the Meter for metrics.
pip install opentelemetry-api
pip install opentelemetry-sdk
pip install opentelemetry-instrumentation
pip install opentelemetry-exporter-jaeger
pip install opentelemetry-exporter-prometheus
Step 2: Configuring Tracing
Tracing in OpenTelemetry involves capturing spans—units of work performed in the application, such as HTTP requests or function executions. Each span has a start time, end time, and optional attributes that provide additional information.
To get started with tracing in Python, import the necessary libraries and create a tracer instance.
Example Code for Tracing
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.jaeger.thrift import JaegerExporter
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from flask import Flask

# Initialize Tracer
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)

# Configure Jaeger Exporter
jaeger_exporter = JaegerExporter(
    agent_host_name="localhost",
    agent_port=6831,
)

# Add Span Processor
span_processor = BatchSpanProcessor(jaeger_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

# Sample Flask Application with Instrumentation
app = Flask(__name__)
FlaskInstrumentor().instrument_app(app)

@app.route("/process")
def process_request():
    with tracer.start_as_current_span("process_request"):
        # Simulate processing
        return "Processing complete"

if __name__ == "__main__":
    app.run()
Tracing Explanation
- Tracer Initialization: First, initialize a TracerProvider that manages spans for your application.
- Jaeger Exporter Configuration: Set up the Jaeger exporter with agent_host_name and agent_port. This configuration directs traces to a Jaeger agent running locally.
- Span Processing: Use BatchSpanProcessor to batch and export spans in a resource-efficient way.
- Flask Instrumentation: The FlaskInstrumentor automatically captures and traces incoming HTTP requests.
Step 3: Configuring Metrics
In addition to tracing, OpenTelemetry can capture metrics to measure application performance and resource utilization over time. Common metrics include request count, response time, CPU usage, and memory consumption.
Example Code for Metrics
from prometheus_client import start_http_server
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from flask import Flask

# Start the Prometheus client's HTTP server on port 8000;
# Prometheus will scrape metrics from http://localhost:8000/metrics
start_http_server(port=8000)

# Initialize MeterProvider with a Prometheus reader. Prometheus is
# pull-based, so the reader is passed to the provider at construction
# time rather than pushed through a periodic exporter.
metric_reader = PrometheusMetricReader()
metrics.set_meter_provider(MeterProvider(metric_readers=[metric_reader]))
meter = metrics.get_meter(__name__)

# Define and record metrics
request_counter = meter.create_counter(
    name="request_count",
    description="Counts the number of requests",
    unit="requests",
)

app = Flask(__name__)

@app.route("/process")
def process_request():
    request_counter.add(1)  # Increment request count
    return "Processing complete"

if __name__ == "__main__":
    app.run()
Metrics Explanation
- Meter Initialization: Similar to the tracer, you initialize a MeterProvider, which manages metrics in your application.
- Prometheus Exporter: The Prometheus integration exposes metrics on an HTTP endpoint that Prometheus scrapes on its own schedule.
- Metric Definition: Define metrics, like request_counter, to capture data points during the application’s runtime.
Step 4: Exporting to Jaeger and Prometheus
With the tracing and metric code set up, you need to configure Jaeger and Prometheus to receive and display this data.
Configuring Jaeger
- Run the Jaeger backend using Docker:
docker run -d --name jaeger \
-e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
-p 5775:5775/udp \
-p 6831:6831/udp \
-p 6832:6832/udp \
-p 5778:5778 \
-p 16686:16686 \
-p 14268:14268 \
-p 14250:14250 \
-p 9411:9411 \
jaegertracing/all-in-one:1.29
Access Jaeger’s UI by navigating to http://localhost:16686. This UI provides search functionality to filter traces by service, operation, and tags.
Reference: https://www.jaegertracing.io/download/
Configuring Prometheus
- First, create a Prometheus configuration file (prometheus.yml) and define the scraping job:
scrape_configs:
- job_name: "python-app"
scrape_interval: 5s
static_configs:
- targets: ["localhost:8000"]
Run Prometheus with the custom configuration file:
docker run -d --name prometheus \
-p 9090:9090 \
-v $(pwd)/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus
Visit http://localhost:9090 to access the Prometheus UI, where you can visualize and query metrics data.
Step 5: Visualizing and Analyzing Data in Jaeger and Prometheus
Both Jaeger and Prometheus provide visualization tools to analyze metrics and traces in real time.
- In Jaeger: Use the UI’s trace search functionality to view individual spans. Jaeger displays a timeline of spans for a trace, showing the duration and highlighting slow spans. You can drill into each span for detailed information, including any custom attributes.
- In Prometheus: Query metrics by navigating to the “Graph” tab and entering metric names or expressions. Prometheus supports custom queries for slicing and dicing metrics, enabling you to filter by specific dimensions like endpoint or status code.
Putting It All Together: Best Practices
- Use Resource Attributes: Define attributes (like service name, environment, version) to add context to traces and metrics. Attributes help organize data and facilitate easier analysis in Jaeger and Prometheus.
- Set Up Alerts: In Prometheus, set up alerting rules to notify you of unusual metrics, such as high error rates or latency spikes, to proactively detect issues.
- Batch Processing for Traces: The BatchSpanProcessor efficiently exports spans in batches, reducing resource consumption and improving performance.
- Instrument Critical Paths: Focus on instrumenting high-impact areas like API endpoints, database queries, and external service calls. Prioritizing these areas maximizes observability’s value by illuminating key bottlenecks and dependencies.
Conclusion
OpenTelemetry in Python offers a structured approach to enhance observability with tracing and metrics collection. Using Jaeger and Prometheus, you gain powerful tools to visualize and analyze application performance data. Setting up OpenTelemetry initially takes effort, but it pays dividends by enabling faster debugging, better resource management, and a deeper understanding of distributed Python applications.