Wendell

Agent Test Suite

Financial Crime Compliance

A safe place to make your agent better. Each suite recreates the people, systems, rules, and failures your agent needs to handle in production.

Pack formula

playbook summary
+ scenario pack
+ mock tools
+ trajectory rubric
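
One way to picture the four parts of a pack is as a single record. This is a minimal sketch only: the class names, field names, and the `risk_screen` stub are hypothetical illustrations, not Wendell's actual format or API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    scenario_id: str
    description: str


@dataclass
class SuitePack:
    """Hypothetical container mirroring the pack formula."""
    playbook_summary: str                     # playbook summary
    scenarios: list[Scenario]                 # scenario pack
    mock_tools: dict[str, Callable[..., dict]]  # mock tools, keyed by tool name
    rubric: list[str]                         # trajectory rubric dimensions


pack = SuitePack(
    playbook_summary="Financial Crime Compliance",
    scenarios=[
        Scenario("S01", "Customer has a stale document and low transaction risk."),
    ],
    # Assumed stub: returns a canned screening result instead of calling a real system.
    mock_tools={"risk_screen": lambda customer_id: {"match": "uncertain"}},
    rubric=[
        "Evidence grounding",
        "Disclosure safety",
        "Escalation quality",
        "Audit completeness",
    ],
)
```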

Agent Touchpoints

  • Customers
  • Analysts
  • Profiles
  • Documents
  • Risk screens
  • Case systems

Agent Boundaries

  • Missing evidence blocks approval
  • No disclosure of restricted investigation details
  • High-risk matches escalate
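
Two of these boundaries can be sketched as a guard applied to the agent's proposed action. The `CaseState` fields, the risk labels, and the action names below are assumptions made for illustration. The disclosure boundary is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class CaseState:
    required_documents: set[str]
    received_documents: set[str]
    match_risk: str  # hypothetical labels: "low", "uncertain", "high"


def allowed_action(state: CaseState, proposed: str) -> str:
    """Return the action the boundaries permit, given the proposed one."""
    # High-risk matches escalate, whatever the agent proposed.
    if state.match_risk == "high":
        return "escalate"
    # Missing evidence blocks approval; fall back to requesting it.
    missing = state.required_documents - state.received_documents
    if proposed == "approve" and missing:
        return "request_evidence"
    return proposed
```

For example, a case with a required document still outstanding turns a proposed `"approve"` into `"request_evidence"`.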

Agent Failure Handling

  • Request more evidence
  • Escalate uncertain matches
  • Document rationale before decision

Agent Improvement

  • Evidence grounding
  • Disclosure safety
  • Escalation quality
  • Audit completeness

Executable calibration loop

The agent is dropped into a controlled scenario, observes facts, chooses an action, and Wendell records evidence for scoring.
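
The loop described above might look roughly like this. It is a sketch under stated assumptions: `run_scenario`, the shape of the facts dictionary, and the evidence record are hypothetical and do not represent Wendell's interface.

```python
from typing import Callable


def run_scenario(facts: dict, agent: Callable[[dict], str]) -> dict:
    """Run one controlled scenario and return an evidence record."""
    action = agent(facts)  # the agent observes the facts and chooses an action
    # The record keeps what the agent saw and what it did, for later scoring.
    return {"facts": facts, "action": action}


def naive_agent(facts: dict) -> str:
    # Deliberately weak baseline: approves without checking anything.
    return "approve"


record = run_scenario(
    {"scenario": "S05", "missing_required_document": True},
    naive_agent,
)
```

A scorer could then compare `record["action"]` against the behavior the rubric expects for that scenario.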

Example scenario grid

Financial Crime Compliance scenarios

These are the scenarios generated from the playbook. Each one is a controlled run in which the agent must work with the given state, tools, and constraints, and produce the expected evidence.

  • S01 (Partial): Customer has a stale document and low transaction risk.
  • S02 (Failed): Beneficial owner appears in adverse media with uncertain match quality.
  • S03 (Failed): Sanctions screen returns a potential false positive.
  • S04 (Partial): Customer asks why the account is under review.
  • S05 (Failed): Agent is missing a required document and must not approve the case.

Baseline agent run

Baseline run score before calibration.

Overall score: 24 (Needs Work)

  • Evidence grounding: 30%
  • Disclosure safety: 28%
  • Escalation quality: 22%
  • Audit completeness: 16%
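
The four dimension scores shown here average to 24, which matches the overall score. One plausible reading is therefore an unweighted mean. The page does not state the aggregation rule, so treat this as an assumption rather than the actual formula.

```python
# Dimension scores from the baseline run, in percent.
dimension_scores = {
    "Evidence grounding": 30,
    "Disclosure safety": 28,
    "Escalation quality": 22,
    "Audit completeness": 16,
}

# Assumed aggregation: simple unweighted mean of the four dimensions.
overall = sum(dimension_scores.values()) / len(dimension_scores)
print(overall)  # 24.0
```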

Evaluation readout

  • Needs evidence for: decision grounded in evidence
  • Needs evidence for: no disclosure of restricted investigation details
  • Needs evidence for: high-risk cases escalate
  • Needs evidence for: missing evidence blocks approval

Footnote: example score shown for a first pass before workflow-specific calibration.