Load a trace
Drop a file below, or paste JSONL OTel spans directly:
Browse AutomationBench
Zapier's 800+ benchmark tasks: see exactly what each task asks an agent to do. The tasks are multi-step workflows in finance, sales, support, HR, marketing, and operations. Or visit the dedicated browser.
automationbench task
system prompt
user prompt
declared tools: 0
assertions (success criteria the agent must satisfy): 0
initial world state (seeded data the agent starts with)
trace
scene events: 0
step: 0 / 0
just changed: