Observability · May 2025 · 8 min

Tracing 10M agent runs without melting your warehouse

Agent traces are the highest-cardinality data most teams have ever stored. Here is how Capsule keeps full-fidelity replay fast and cheap.

The cardinality problem

A single agent run can fan out into dozens of tool calls and sub-agent handoffs, each with its own inputs, outputs, and timing. Multiply that by millions of runs and naive row-per-span storage falls over on both cost and query latency.

Columnar storage + smart sampling

We store spans in a columnar format partitioned by agent and day, keep hot traces fully resolved, and tail-sample the long tail. Errors and budget breaches are always retained at full fidelity, because those are the runs you actually need to replay.
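As a rough illustration of the retention rule described above, here is a minimal Python sketch. It is not Capsule's implementation: the `RunSummary` fields, the `agent=.../day=...` path layout, and the 5% default sample rate are all assumptions made for the example, and the hot/cold distinction is left out. It shows a partition key derived from agent and day, plus a tail-sampling decision made after a run finishes, where errors and budget breaches bypass sampling entirely.

```python
import hashlib
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RunSummary:
    # Hypothetical per-run summary, known only once the run has finished
    # (which is what makes this tail sampling rather than head sampling).
    run_id: str
    agent: str
    day: date
    has_error: bool
    budget_breached: bool


def partition_path(run: RunSummary) -> str:
    # Assumed Hive-style layout: one partition per agent and day.
    return f"agent={run.agent}/day={run.day.isoformat()}"


def keep_full_trace(run: RunSummary, sample_rate: float = 0.05) -> bool:
    # Errors and budget breaches are always retained at full fidelity.
    if run.has_error or run.budget_breached:
        return True
    # Hash the run ID into [0, 1) so the decision is deterministic:
    # every span of the same run gets the same keep/drop answer.
    digest = hashlib.sha256(run.run_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate
```

Hashing the run ID instead of drawing a random number means the decision can be recomputed anywhere in the pipeline without coordination, which matters when spans from one run arrive at different writers.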

Replaying a run

Every trace can be reconstructed end to end (inputs, each tool call, every handoff, and the final outcome), so debugging a production incident takes minutes instead of hours.
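To make end-to-end reconstruction concrete, the sketch below rebuilds a run from a flat list of spans. The `Span` shape (`span_id`, `parent_id`, `kind`, `name`, `start_ms`) is a hypothetical schema chosen for illustration, not Capsule's actual data model, and input/output payloads are omitted to keep it short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    # Hypothetical span record; field names are assumptions for this sketch.
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the run
    kind: str                 # e.g. "run", "tool_call", "handoff"
    name: str
    start_ms: int
    children: list["Span"] = field(default_factory=list)


def build_tree(spans: list[Span]) -> Span:
    # Index spans by ID, then attach each one to its parent in start order.
    # Assumes the trace is complete; a missing parent raises KeyError.
    by_id = {s.span_id: s for s in spans}
    root = None
    for s in sorted(spans, key=lambda s: s.start_ms):
        if s.parent_id is None:
            root = s
        else:
            by_id[s.parent_id].children.append(s)
    if root is None:
        raise ValueError("trace has no root span")
    return root


def replay(span: Span, depth: int = 0) -> list[str]:
    # Depth-first walk that yields one indented line per step of the run.
    lines = [f"{'  ' * depth}{span.kind}: {span.name}"]
    for child in span.children:
        lines.extend(replay(child, depth + 1))
    return lines
```

Because children are attached in start-time order, the depth-first walk reads as a timeline of the run, with tool calls and handoffs nested under the step that triggered them.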
