Select Agent
Choose what Wendell should evaluate. Start with built-in baselines, connect a custom agent endpoint, or run through Pi as an agent harness.
HTTP Endpoint
Production agents behind an API
Wendell sends each scenario observation to your hosted agent and records the response, tool calls, and latency.
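As a rough illustration, a hosted agent for this mode could look like the sketch below. It uses only the Python standard library. The request body shape (an observation with a `message` field), the helper names `build_response` and `serve`, and the port are assumptions for illustration; only the `/respond` path and the `message` and `tool_calls` response keys come from the card above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_response(observation: dict) -> dict:
    """Turn one observation into a reply (observation shape is assumed)."""
    text = str(observation.get("message", ""))
    # A real agent would call a model or planner here.
    return {"message": f"Received: {text}", "tool_calls": []}


class RespondHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Only POST /respond is handled; anything else is a 404.
        if self.path != "/respond":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            observation = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "Invalid JSON")
            return
        body = json.dumps(build_response(observation)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the endpoint until interrupted (host and port are placeholders)."""
    HTTPServer((host, port), RespondHandler).serve_forever()
```

Calling `serve()` starts the endpoint locally. Keeping the reply logic in `build_response` separates it from the transport, so the same function can be exercised without a running server.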
POST /respond → { message, tool_calls }
SDK
CI/eval pipelines and internal platforms
Embed Wendell directly in Python or TypeScript and evaluate your own agent function against scenario suites.
wendell.evaluate({ agent, scenarios })
CLI / Command
Local agents, prototypes, LangChain, CrewAI
Any command that accepts JSON on stdin and prints an agent response can be tested by Wendell.
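A minimal sketch of such a command, assuming the observation arrives as a single JSON object with a `message` field and that the reply uses the same `message` and `tool_calls` keys as the HTTP mode. Both shapes are assumptions, and `respond` is a hypothetical helper name.

```python
import json
import sys


def respond(observation: dict) -> dict:
    """Build a reply for one observation (field names are assumed)."""
    text = str(observation.get("message", ""))
    # Replace this echo with a call into your own agent.
    return {"message": f"You said: {text}", "tool_calls": []}


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    # Read the whole observation from stdin; treat empty input as {}.
    observation = json.loads(stdin.read() or "{}")
    json.dump(respond(observation), stdout)
    stdout.write("\n")


if __name__ == "__main__":
    main()
```

Saved as `my_agent.py`, this can be tried by hand with `echo '{"message": "hi"}' | python my_agent.py` before pointing Wendell at it.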
--agent-command "python my_agent.py"
Browser Target
Browser/UI agents
Wendell presents a simulated customer or workflow page that browser agents interact with like a real app.
Launch target URL + session token
MCP / Tool Server
Tool-using agents and desktop assistants
Expose simulations as tools: start, observe, send message, call tool, finish run, and get report.
start_simulation, observe, finish_run
Webhook Events
Observability, dashboards, QA systems
Stream scenario, trajectory, assessment, and critical-failure events into your own systems.
scenario.completed, assessment.completed
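On the receiving side, events like these are often routed by type. The sketch below shows one way to do that. The event envelope (`type` and `data` keys), the `scenario_id` and `score` fields, and the names `on` and `dispatch` are all assumptions for illustration; only the two event names come from the card above.

```python
import json
from typing import Callable

Handler = Callable[[dict], str]
HANDLERS: dict[str, Handler] = {}


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Register a handler for one event type."""
    def register(handler: Handler) -> Handler:
        HANDLERS[event_type] = handler
        return handler
    return register


@on("scenario.completed")
def handle_scenario(event: dict) -> str:
    data = event.get("data") or {}
    return f"scenario completed: {data.get('scenario_id', 'unknown')}"


@on("assessment.completed")
def handle_assessment(event: dict) -> str:
    data = event.get("data") or {}
    return f"assessment completed: score={data.get('score')}"


def dispatch(raw_body: bytes) -> str:
    """Parse a webhook body and route it; unknown types are ignored."""
    event = json.loads(raw_body)
    handler = HANDLERS.get(event.get("type", ""))
    return handler(event) if handler else "ignored"
```

Returning `"ignored"` for unregistered types keeps the receiver tolerant of event types it does not yet handle.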