Install (single-binary mode)
# docker-compose.yml
services:
  tempo:
    image: grafana/tempo:latest
    container_name: tempo
    restart: unless-stopped
    command: [ "-config.file=/etc/tempo/tempo.yml" ]
    ports:
      - "127.0.0.1:3200:3200" # tempo HTTP query
      - "127.0.0.1:4317:4317" # OTLP gRPC
      - "127.0.0.1:4318:4318" # OTLP HTTP
    volumes:
      - ./tempo.yml:/etc/tempo/tempo.yml:ro
      - tempo-data:/var/tempo
volumes:
  tempo-data:
tempo.yml (minimal monolithic-mode config):
server:
  http_listen_port: 3200
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
        http:
          endpoint: 0.0.0.0:4318
    jaeger:
      protocols:
        thrift_http:
          endpoint: 0.0.0.0:14268
    zipkin:
      endpoint: 0.0.0.0:9411
ingester:
  trace_idle_period: 10s
  max_block_duration: 5m
compactor:
  compaction:
    block_retention: 720h # 30 days
storage:
  trace:
    backend: local # or s3 / gcs / azure
    local:
      path: /var/tempo/traces
    wal:
      path: /var/tempo/wal
querier:
  search:
    query_timeout: 30s
metrics_generator:
  registry:
    external_labels:
      source: tempo
  storage:
    path: /var/tempo/generator/wal
    remote_write:
      - url: http://prometheus:9090/api/v1/write # service graphs & span metrics
overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics]
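Bring it up with docker compose up -d. Tempo accepts OTLP on 4317 (gRPC) and 4318 (HTTP), so applications instrumented with OpenTelemetry SDKs can send spans directly. Alternatively, put an OpenTelemetry Collector (see that tutorial) in front of Tempo for filtering, sampling, and attribute manipulation. Note that the compose file publishes ports on 127.0.0.1 only, and the Jaeger (14268) and Zipkin (9411) receivers are not published at all, so they are reachable only from containers on the same Docker network.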
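Tempo can take a little while after startup before it reports ready, so a script that sends spans immediately may see connection errors. A minimal sketch of a readiness poll, assuming Tempo's /ready endpoint on the HTTP port published above; the function name, timeout, and interval are illustrative choices, not part of Tempo:

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(url="http://127.0.0.1:3200/ready", timeout_s=60.0,
                     interval_s=2.0, fetch=None, sleep=time.sleep):
    """Poll `url` until it answers HTTP 200 or `timeout_s` elapses.

    `fetch` and `sleep` are injectable so the loop can be exercised
    without a running Tempo instance.
    """
    if fetch is None:
        def fetch(target):
            # Return the HTTP status code, or None if the request failed.
            try:
                with urllib.request.urlopen(target, timeout=5) as resp:
                    return resp.status
            except urllib.error.HTTPError as exc:
                return exc.code
            except (urllib.error.URLError, OSError):
                return None

    deadline = time.monotonic() + timeout_s
    while True:
        if fetch(url) == 200:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example (needs the compose stack running):
#   print("ready" if wait_until_ready() else "gave up waiting")
```

The function returns True as soon as one poll succeeds and False once the deadline passes, which makes it easy to gate a smoke test or CI step on it.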
Object-storage backend (production)
storage:
  trace:
    backend: s3
    s3:
      bucket: tempo-traces
      endpoint: s3.example.com
      access_key: ${S3_KEY} # expanded only when Tempo runs with -config.expand-env=true
      secret_key: ${S3_SECRET}
      insecure: false
    wal:
      path: /var/tempo/wal
    pool:
      max_workers: 100
MinIO (see that tutorial), Backblaze B2, AWS S3, and Cloudflare R2 all use the same config shape with a different endpoint. Object storage for trace data typically costs pennies per GB-month, so Tempo's storage cost per trace is usually far lower than Jaeger on Elasticsearch.
Wire Grafana
In Grafana, go to Data Sources → Add → Tempo and set the URL to http://tempo:3200. In Explore, select the Tempo data source to:
- Look up by trace ID
- Run TraceQL queries
- See service graphs (auto-generated from the metrics-generator)
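The same data source can also be created without clicking through the UI. A rough sketch using Grafana's HTTP API (POST /api/datasources); the Grafana address http://grafana:3000 and the service-account token are assumptions, and the request is only built here, not sent:

```python
import json
import urllib.request


def tempo_datasource_payload(name="Tempo", url="http://tempo:3200"):
    """Build the JSON body for creating a Tempo data source in Grafana."""
    return {
        "name": name,
        "type": "tempo",    # Grafana's plugin id for Tempo
        "access": "proxy",  # Grafana's backend makes the requests to Tempo
        "url": url,
    }


def build_request(grafana_url, token, payload):
    """Prepare (but do not send) the authenticated POST request."""
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Example (hypothetical address and token):
#   req = build_request("http://grafana:3000", "<token>", tempo_datasource_payload())
#   urllib.request.urlopen(req)
```

For setups managed as files rather than API calls, Grafana's provisioning directory achieves the same result declaratively.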
TraceQL: actually queryable traces
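# Traces with a payment-service span that took > 1 second
{ resource.service.name = "payment-service" && duration > 1s }
# Traces with a POST /api/checkout span that ended in an error
{ name = "POST /api/checkout" && status = error }
# Spans with a specific HTTP status code
{ span.http.status_code = 500 }
# Traces where a frontend span has an api descendant, which in turn has a db descendant
{ resource.service.name = "frontend" } >> { resource.service.name = "api" } >> { resource.service.name = "db" }
# Aggregations: traces whose matching spans average more than 500 ms
{ name = "GET /products" } | avg(duration) > 500ms
# TraceQL metrics (needs the local-blocks processor): request rate by HTTP status code
{ name = "GET /products" } | rate() by (span.http.status_code)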
TraceQL replaces the "find a trace ID, hope it's the right one" debugging story with proper span-attribute search.
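TraceQL is also usable outside Grafana. A small sketch that builds a request against Tempo's search API (GET /api/search with the query in the q parameter) and summarizes the response; the helper names are made up for illustration, and the response fields read here (traceID, rootServiceName, durationMs) should be checked against your Tempo version:

```python
import json
import urllib.parse
import urllib.request


def search_url(base_url, traceql, limit=20, start=None, end=None):
    """Build a Tempo search URL; start/end are Unix seconds and optional."""
    params = {"q": traceql, "limit": limit}
    if start is not None and end is not None:
        params["start"] = int(start)
        params["end"] = int(end)
    return base_url.rstrip("/") + "/api/search?" + urllib.parse.urlencode(params)


def summarize(response_body):
    """Extract (traceID, rootServiceName, durationMs) from a search response."""
    data = json.loads(response_body)
    return [
        (t.get("traceID"), t.get("rootServiceName"), t.get("durationMs"))
        for t in data.get("traces", [])
    ]


# Example (needs a running Tempo):
#   url = search_url("http://127.0.0.1:3200", "{ span.http.status_code = 500 }")
#   with urllib.request.urlopen(url) as resp:
#       for row in summarize(resp.read()):
#           print(row)
```

Letting urlencode handle the escaping matters because TraceQL queries are full of braces, quotes, and spaces.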
Span metrics + service graph (the killer add-on)
With metrics_generator enabled (above), Tempo auto-generates two streams of Prometheus metrics from incoming spans:
- Span metrics — per-service / per-operation RED metrics (rate, error rate, duration histogram). Free observability for any traced service without manually adding instrumentation.
- Service graphs — pairs of services that talked to each other, with edge weights for traffic + error rate. Auto-discovered architecture diagram.
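Send the generated metrics to Prometheus / Mimir / VictoriaMetrics (see that tutorial) via remote_write, then query them alongside your other metrics. When the target is Prometheus itself, start it with --web.enable-remote-write-receiver, otherwise it rejects the writes.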
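Once the metrics land in Prometheus, RED dashboards are a few PromQL expressions away. A sketch that assembles them, assuming the metric and label names recent Tempo versions emit (traces_spanmetrics_calls_total, traces_spanmetrics_latency_bucket, traces_service_graph_request_total, with service and status_code labels); confirm the names in your own Prometheus before relying on them:

```python
import urllib.parse


def red_queries(service, window="5m"):
    """Return PromQL strings for rate, error ratio, and p95 latency of one service."""
    sel = f'service="{service}"'
    calls = "traces_spanmetrics_calls_total"
    return {
        "rate": f"sum(rate({calls}{{{sel}}}[{window}]))",
        "error_ratio": (
            f'sum(rate({calls}{{{sel},status_code="STATUS_CODE_ERROR"}}[{window}]))'
            f" / sum(rate({calls}{{{sel}}}[{window}]))"
        ),
        "p95_seconds": (
            "histogram_quantile(0.95, sum by (le) "
            f"(rate(traces_spanmetrics_latency_bucket{{{sel}}}[{window}])))"
        ),
    }


def service_graph_edges(window="5m"):
    """PromQL for request rate on every client -> server edge."""
    return f"sum by (client, server) (rate(traces_service_graph_request_total[{window}]))"


def query_url(prometheus_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return (prometheus_url.rstrip("/") + "/api/v1/query?"
            + urllib.parse.urlencode({"query": promql}))
```

Pasting the generated strings into Grafana's Explore view with the Prometheus data source is a quick way to check that span metrics are flowing.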
Instrument an app (OpenTelemetry SDK)
# Python example
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap --action=install # auto-installs SDK + instrumentations
# Run the app with auto-instrumentation
OTEL_SERVICE_NAME=payment-service \
OTEL_EXPORTER_OTLP_ENDPOINT=http://tempo:4317 \
OTEL_EXPORTER_OTLP_PROTOCOL=grpc \
OTEL_TRACES_SAMPLER=parentbased_traceidratio \
OTEL_TRACES_SAMPLER_ARG=0.1 \
opentelemetry-instrument python main.py
The OTEL_TRACES_SAMPLER settings above keep 10% of traces (without them the SDK keeps every trace); tune the ratio to your traffic volume. Auto-instrumentation captures HTTP server and client calls, database clients, Redis, and more with zero code changes. For custom spans within app code, add OpenTelemetry SDK calls.
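It helps to know why a ratio sampler keeps whole traces rather than random fragments: the decision is a pure function of the trace ID, so every service that sees the same ID reaches the same verdict. The sketch below illustrates that idea, loosely modeled on how OpenTelemetry SDKs compare the low bits of the trace ID to a threshold; it is an illustration, not the SDK's code:

```python
import random

TRACE_ID_LIMIT = (1 << 64) - 1  # only the low 64 bits of the trace ID are compared


def is_sampled(trace_id: int, ratio: float) -> bool:
    """Deterministically decide whether a trace ID falls inside the sampled share."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be in [0, 1]")
    bound = round(ratio * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound


def sampled_share(ratio: float, n: int = 100_000, seed: int = 42) -> float:
    """Estimate the kept fraction over n pseudo-random 128-bit trace IDs."""
    rng = random.Random(seed)
    kept = sum(is_sampled(rng.getrandbits(128), ratio) for _ in range(n))
    return kept / n
```

The parentbased_ prefix in the sampler name adds one rule on top: a span with a parent follows the parent's decision, and only root spans consult the ratio.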
Tail-based sampling
Random sampling at the app loses interesting outliers (the slow request, the error). Tail-based sampling fixes that: send every span to a collector and decide which traces to keep only after the whole trace has arrived:
# In an OpenTelemetry Collector config feeding Tempo
processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 100000
    policies:
      - { name: errors, type: status_code, status_code: { status_codes: [ERROR] } }
      - { name: slow, type: latency, latency: { threshold_ms: 1000 } }
      - { name: sample-1pct, type: probabilistic, probabilistic: { sampling_percentage: 1 } }
exporters:
  otlp:
    endpoint: tempo:4317
    tls: { insecure: true }
This keeps every trace that contains an error, every trace slower than 1 second, and a 1% baseline of the rest. Interesting failures survive while storage cost drops.
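The decision logic behind those three policies can be sketched in a few lines. This is a simplified model, not the collector's implementation: the Span type is hypothetical, and the baseline policy here draws a random number where the real processor derives its decision from the trace ID.

```python
import random
from dataclasses import dataclass


@dataclass
class Span:
    start_ms: float
    end_ms: float
    is_error: bool = False


def keep_trace(spans, threshold_ms=1000.0, baseline_pct=1.0, rng=random.random):
    """Return the name of the first policy that keeps the trace, or None to drop it."""
    if not spans:
        return None
    # Policy "errors": any span with an error status keeps the whole trace.
    if any(s.is_error for s in spans):
        return "errors"
    # Policy "slow": trace duration runs from the earliest start to the latest end.
    duration = max(s.end_ms for s in spans) - min(s.start_ms for s in spans)
    if duration > threshold_ms:
        return "slow"
    # Policy "sample-1pct": keep a small random share of everything else.
    if rng() * 100.0 < baseline_pct:
        return "sample-1pct"
    return None
```

A trace is kept as soon as any policy matches, which is why the baseline percentage only ever applies to fast, error-free traces.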
Tempo's architecture (cluster mode)
For high-throughput production, run Tempo in microservices mode:
- distributor — receives spans, routes to ingesters
- ingester — buffers in memory, periodically flushes blocks to object storage
- querier — serves trace lookups + TraceQL queries
- compactor — merges small blocks in object storage
- query-frontend — splits large queries across queriers
- metrics-generator — computes span metrics + service graphs
Each component scales independently. A Helm chart is available for Kubernetes deployments.
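The distributor's job is easiest to picture as "hash the trace ID, pick an ingester", so that all spans of one trace end up together. The sketch below is a deliberately simplified stand-in: it uses a plain hash-modulo choice, whereas a real deployment relies on a replicated hash ring, and the ingester names are hypothetical.

```python
import hashlib
from collections import defaultdict


def pick_ingester(trace_id: str, ingesters: list) -> str:
    """Map a trace ID to one ingester; the same ID always maps to the same one."""
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    return ingesters[int.from_bytes(digest[:8], "big") % len(ingesters)]


def route(spans, ingesters):
    """Group (trace_id, span_name) pairs by the ingester chosen for their trace."""
    batches = defaultdict(list)
    for trace_id, span_name in spans:
        batches[pick_ingester(trace_id, ingesters)].append((trace_id, span_name))
    return dict(batches)
```

Modulo hashing reshuffles most traces when the ingester count changes, which is one reason production systems use a ring with tokens instead.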
Tempo vs alternatives
- Jaeger — the elder; the canonical OpenTelemetry-compat traces backend. Cassandra / Elasticsearch backends are operationally heavy.
- Zipkin — older still; simpler.
- SigNoz — ClickHouse-backed observability platform (metrics + logs + traces from one DB). Compelling if you want one tool for all three.
- Honeycomb / Datadog APM / Lightstep / New Relic — commercial SaaS. Better UX for analysis; usage-based pricing.
- OpenSearch / Elasticsearch — Jaeger's traditional backend.
For "I run Grafana, I want traces, I want them stored cheaply on object storage I already have," Tempo is the right pick in 2026.
Worth knowing
- OpenTelemetry is the standard. Tempo speaks OTLP natively; Jaeger / Zipkin are supported for compatibility but new instrumentation should use OTel.
- Traces are expensive without sampling. A 100 req/sec service tracing every request produces 8.64 million traces per day, and each trace usually contains several spans. Sample aggressively; keep errors / outliers.
- Loki / Tempo / Mimir share the same Grafana-native object-storage architecture. Operating all three together is mostly the same skill set.
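To put numbers on the sampling point, here is a back-of-the-envelope calculation. The 100 req/sec figure comes from the text; the 8 spans per trace is an assumed value for illustration, so substitute your own measurement.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def spans_per_day(requests_per_second, spans_per_trace, sample_ratio=1.0):
    """Estimate stored spans per day for a service that traces every request."""
    traces = requests_per_second * SECONDS_PER_DAY * sample_ratio
    return round(traces * spans_per_trace)


# 100 req/sec, one span per trace, no sampling
print(spans_per_day(100, 1))        # 8640000
# Assumed 8 spans per trace, no sampling
print(spans_per_day(100, 8))        # 69120000
# Same service with 10% head sampling
print(spans_per_day(100, 8, 0.1))   # 6912000
```

Sampling scales the volume linearly, so a 10% ratio cuts the assumed 69 million daily spans to about 7 million.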