Observability for Python Apps: Logging, Metrics, Tracing with OpenTelemetry
Monitoring asks “is it up?” Observability asks “why is it slow?” This guide shows how to add structured logging, RED/USE metrics, and distributed tracing to Python services using OpenTelemetry, Prometheus, Tempo, and Grafana.
1) Architecture at a Glance
- App (FastAPI/Django + OTel SDK) → OTel Collector
- Logs → Loki · Metrics → Prometheus · Traces → Tempo
- Dashboards/Alerts → Grafana
2) Structured Logging
# logging_setup.py
import json
import logging
import sys
import time


class JsonFormatter(logging.Formatter):
    def format(self, record):
        base = {
            "ts": time.time(),
            "level": record.levelname,
            "msg": record.getMessage(),
            "logger": record.name,
        }
        if record.exc_info:
            base["exc"] = self.formatException(record.exc_info)
        return json.dumps(base)


def configure_json_logging(level=logging.INFO):
    h = logging.StreamHandler(sys.stdout)
    h.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers.clear()
    root.addHandler(h)
    root.setLevel(level)
Emit JSON to stdout; collectors parse and route without regex gymnastics.
3) Metrics with Prometheus Client
import time

from prometheus_client import Counter, Histogram, Gauge

REQS = Counter("http_requests_total", "Total HTTP requests", ["route", "method", "code"])
LAT = Histogram("http_request_duration_seconds", "Latency", ["route", "method"], buckets=[.05, .1, .2, .5, 1, 2, 5])
INFLIGHT = Gauge("http_inflight_requests", "Active requests")


def before_request(route, method):
    INFLIGHT.inc()
    # Return a monotonic start timestamp for after_request to measure against.
    return time.perf_counter()


def after_request(route, method, code, start):
    # Record elapsed seconds in the latency histogram for this route/method.
    LAT.labels(route, method).observe(time.perf_counter() - start)
    INFLIGHT.dec()
    REQS.labels(route, method, code).inc()
Expose /metrics with prometheus_client's make_wsgi_app() or make_asgi_app() (or equivalent middleware) and let Prometheus scrape it.
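To show where hooks like before_request/after_request would be called, here is a minimal sketch of a WSGI timing middleware using only the standard library. The TimingMiddleware class and its record callback are hypothetical names for illustration, not part of prometheus_client; in a real service the callback would update the counter and histogram defined above.

```python
import time


class TimingMiddleware:
    """WSGI middleware that reports (route, method, status code, seconds) per request.

    `record` is a hypothetical callback; wire it to your metrics objects.
    """

    def __init__(self, app, record):
        self.app = app
        self.record = record

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        seen = {"code": "500"}  # assume failure until the app reports a status

        def capturing_start_response(status, headers, exc_info=None):
            # The status string looks like "200 OK"; keep only the numeric code.
            seen["code"] = status.split(" ", 1)[0]
            return start_response(status, headers, exc_info)

        try:
            # Note: this times until the app returns its iterable, not until
            # a streamed body has been fully sent.
            return self.app(environ, capturing_start_response)
        finally:
            self.record(
                environ.get("PATH_INFO", ""),
                environ.get("REQUEST_METHOD", ""),
                seen["code"],
                time.perf_counter() - start,
            )
```

Passing the raw PATH_INFO as the route label is only acceptable for a sketch: paths with embedded IDs will explode label cardinality, so a route template such as /users/{id} is the safer label value.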
4) Distributed Tracing with OpenTelemetry
# otel_setup.py
from opentelemetry import trace
from opentelemetry.sdk.resources import SERVICE_NAME, Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter


def init_tracing(service_name="api"):
    provider = TracerProvider(resource=Resource.create({SERVICE_NAME: service_name}))
    processor = BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4318/v1/traces"))
    provider.add_span_processor(processor)
    trace.set_tracer_provider(provider)
    return trace.get_tracer(service_name)
FastAPI integration
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

from logging_setup import configure_json_logging
from otel_setup import init_tracing

app = FastAPI()
configure_json_logging()
tracer = init_tracing("redesign-api")
FastAPIInstrumentor.instrument_app(app)


@app.get("/health")
def health():
    return {"ok": True}
5) OpenTelemetry Collector Config (traces and metrics pipelines)
receivers:
  otlp:
    protocols:
      http:
exporters:
  otlphttp/tempo:
    endpoint: http://tempo:4318
  prometheus:
    endpoint: 0.0.0.0:9464
processors:
  batch: {}
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/tempo]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus]
6) Dashboards & Alerting
- RED (Rate, Errors, Duration) for user-facing endpoints.
- USE (Utilization, Saturation, Errors) for system resources.
- Set SLOs (e.g., “p95 < 300ms”), alert on burn rate, not single breaches.
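Burn rate is easier to reason about with numbers. The sketch below assumes the SLO is expressed as a target ratio of good requests (for a latency SLO, "good" means faster than the threshold) and that you already have the bad-request ratio for a long and a short window. The function names and the 14.4 default threshold are assumptions for illustration; 14.4 is a commonly cited paging threshold for a one-hour window against a 30-day SLO, and you should tune it to your own budget policy.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """How many times faster than allowed the error budget is being spent.

    error_ratio: fraction of bad requests in the window (0.0 to 1.0).
    slo_target:  fraction of requests that must be good, e.g. 0.999.
    """
    budget = 1.0 - slo_target
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(long_window_ratio: float, short_window_ratio: float,
                slo_target: float, threshold: float = 14.4) -> bool:
    """Page only when both windows burn fast.

    The long window shows the burn is significant; the short window shows
    it is still happening, so one brief spike does not page anyone.
    """
    return (burn_rate(long_window_ratio, slo_target) >= threshold
            and burn_rate(short_window_ratio, slo_target) >= threshold)
```

With a 99.9% target, a sustained 2% error ratio in both windows is a burn rate of roughly 20 and would page, while a spike that has already recovered in the short window would not.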
7) Cost & Cardinality Control
- Sample traces instead of keeping every one, and use exemplars to link metrics to representative traces; keep high-cardinality values out of metric labels.
- Hash or truncate IDs in logs to avoid PII and cardinality explosions.
- Enable tail-based sampling in Collector for “errors-first” traces.
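The hashing advice above can be sketched with the standard library. Both helpers are hypothetical: pseudonymize keeps an ID stable for correlation without logging the raw value, while bucket maps IDs onto a small fixed set of values for the cases where cardinality itself is the problem. The key, the 12-character length, and the 16 buckets are assumed values, not recommendations from the original post.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes, length: int = 12) -> str:
    """Return a stable, keyed, truncated digest of an identifier.

    A keyed HMAC is used so the mapping cannot be rebuilt by hashing
    guessed IDs without the key. Cardinality stays the same as the input.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def bucket(value: str, buckets: int = 16) -> str:
    """Map an identifier to one of `buckets` labels, bounding cardinality."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % buckets
    return f"b{index:02d}"
```

Use pseudonymize for log fields you still need to search by, and bucket only where a coarse grouping is enough, since it deliberately throws away the ability to find a single user.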
8) Common Pitfalls
- Unstructured logs → impossible correlation.
- Too many metrics → high scrape/TSDB cost; start with RED/USE.
- Ignoring propagation headers → broken traces across services.
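To make the propagation pitfall concrete, here is a minimal sketch of the W3C traceparent header that carries trace context between services. It handles only version 00 and is meant to show what the header contains; OpenTelemetry instrumentation normally injects and extracts it for you. The TraceContext type and both function names are hypothetical.

```python
import re
import secrets
from typing import NamedTuple, Optional

# version "00" - 32 hex trace-id - 16 hex parent-id - 2 hex flags
_TRACEPARENT = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse an incoming traceparent header; return None if it is invalid."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are defined as invalid.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


def child_traceparent(ctx: TraceContext) -> str:
    """Build the header for an outgoing call: same trace, new span ID."""
    span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    flags = "01" if ctx.sampled else "00"
    return f"00-{ctx.trace_id}-{span_id}-{flags}"
```

If a service drops this header on outgoing requests, the next hop starts a fresh trace_id and the end-to-end trace splits into unrelated fragments.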
“See the system. Hear its signals. Then design with empathy.” — redesign.ir
Tip: Correlate everything via trace_id/span_id: add them to log records using a logging filter for one-click jumps from logs → trace.
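A logging filter for this can be quite small. The sketch below uses a ContextVar as a stand-in for the active span so that it runs without any tracing library installed; current_ids, TraceContextFilter, TraceJsonFormatter, and build_logger are all hypothetical names. With the OpenTelemetry API you would instead read the IDs from trace.get_current_span().get_span_context() and format them as 32- and 16-character hex strings.

```python
import io
import json
import logging
from contextvars import ContextVar

# Stand-in for the active span context: (trace_id, span_id) as hex strings.
# Assumption for this sketch; a real service would read these from its tracer.
current_ids = ContextVar("current_ids", default=("", ""))


class TraceContextFilter(logging.Filter):
    """Copy the current trace_id/span_id onto every log record."""

    def filter(self, record):
        record.trace_id, record.span_id = current_ids.get()
        return True  # never drop the record, only enrich it


class TraceJsonFormatter(logging.Formatter):
    """JSON formatter that includes the IDs added by the filter."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "msg": record.getMessage(),
            "trace_id": getattr(record, "trace_id", ""),
            "span_id": getattr(record, "span_id", ""),
        })


def build_logger(stream, name="demo.correlation"):
    """Return a logger whose handler enriches and JSON-encodes each record."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(TraceContextFilter())  # on the handler, so it sees every record
    handler.setFormatter(TraceJsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    out = io.StringIO()
    log = build_logger(out)
    current_ids.set(("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7"))
    log.info("charge accepted")
    print(out.getvalue(), end="")
```

Attaching the filter to the handler rather than to one logger means records from any logger that reaches the handler pick up the IDs, which is usually what you want for the root handler.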