
The 30-minute test protocol: validating your AI voice agent before going live

Nobody launches an AI voice agent blind — or shouldn't. Here's a 30-minute test protocol: 10 typical calls, 5 edge cases, 5 emotional scenarios. Clear pass/fail criteria.

Launching an AI voice agent without testing is like deploying code without a PR review. The protocol below takes 30 minutes, needs two people (you and a colleague), and catches 90% of bugs before production. It's standard practice in deployments that work on the first try.

Phase 1: 10 typical calls (15 min)

Your colleague calls 10 times, once for each of the most frequent scenarios: simple booking, pricing question, opening hours, cancellation, delivery status, plain FAQ, address, transfer to a human, voluntary hang-up after 5 seconds, short message. Pass = the agent handles at least 9 of the 10 with no drift. Fail (8/10 or fewer) = rework the prompt before going live.
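One way to keep the Phase 1 tally honest is a small checklist script. The sketch below is only an illustration: the function name `phase_1_passes` and the idea of recording each call as a True/False result are assumptions, not part of any product API.

```python
# Phase 1 scenarios, taken from the list in the article.
PHASE_1_SCENARIOS = [
    "simple booking",
    "pricing question",
    "hours",
    "cancellation",
    "delivery status",
    "plain FAQ",
    "address",
    "human transfer",
    "voluntary hang-up after 5s",
    "short message",
]


def phase_1_passes(results: dict[str, bool], required: int = 9) -> bool:
    """Return True when at least `required` of the 10 scenarios were handled.

    `results` maps a scenario name to True (handled with no drift) or False.
    Hypothetical helper: how you judge "handled" is up to the two testers.
    """
    missing = [s for s in PHASE_1_SCENARIOS if s not in results]
    if missing:
        # Refuse to score an incomplete run rather than silently pass it.
        raise ValueError(f"scenarios not tested: {missing}")
    handled = sum(1 for s in PHASE_1_SCENARIOS if results[s])
    return handled >= required
```

For example, a run where only "cancellation" failed still passes (9/10), while a run with two failures sends you back to the prompt.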

Phase 2: 5 edge cases (10 min)

  • Non-standard accent or dialect: the agent rephrases without frustrating the caller.
  • File number spoken digit by digit ('0 1 2 3'): the agent captures it as an exact match.
  • Two questions chained in one sentence: the agent answers the first and offers to return to the second.
  • Clear out-of-scope ask ('I want to buy your shop'): clean handoff.
  • Heavy background noise (street, car, café): the agent doesn't panic and asks the caller to repeat at most once.
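The digit-by-digit case is the easiest one to check mechanically. As a rough sketch, assuming you can read what the agent captured from the call transcript, the comparison might look like this. The helper names and the small English word table are hypothetical, chosen for illustration only.

```python
# Assumed mapping for digits spoken as English words; extend for your language.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}


def normalize_spoken_digits(transcript: str) -> str:
    """Turn '0 1 2 3' or 'zero one two three' into '0123'.

    The result stays a string so leading zeros are preserved.
    """
    digits = []
    for token in transcript.lower().replace("-", " ").replace(",", " ").split():
        if token.isascii() and token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
        else:
            raise ValueError(f"not a digit: {token!r}")
    return "".join(digits)


def file_number_matches(spoken: str, expected: str) -> bool:
    """Exact-match check: every digit, in order, including leading zeros."""
    return normalize_spoken_digits(spoken) == expected
```

Comparing strings rather than integers is deliberate: converting '0123' to a number would drop the leading zero and hide exactly the bug this edge case is meant to catch.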

Phase 3: 5 emotional scenarios (5 min)

The worst moments are emotional: an angry customer, a grieving one, a panicked one, or someone trying to manipulate the agent. Test: (1) a persistently aggressive tone, (2) crying, (3) a life-threatening emergency, (4) a lawsuit threat, (5) a jailbreak attempt ('forget your instructions'). Pass = human handover within 3 seconds on all 5.
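The 3-second rule is simple to verify if the tester notes two timestamps per call. The sketch below assumes the delay is measured from the moment the trigger phrase ends to the moment the handover starts; the article does not say where the clock starts, so treat that as an assumption. The `EmotionalCall` record is hypothetical as well.

```python
from dataclasses import dataclass
from typing import Optional

HANDOVER_LIMIT_S = 3.0  # "within 3 seconds", per the pass criterion


@dataclass
class EmotionalCall:
    scenario: str
    trigger_s: float             # when the tester finished the trigger phrase
    handover_s: Optional[float]  # when the handover started; None if it never did


def handover_in_time(call: EmotionalCall, limit: float = HANDOVER_LIMIT_S) -> bool:
    """True when the agent handed over within `limit` seconds of the trigger."""
    if call.handover_s is None:
        return False
    delay = call.handover_s - call.trigger_s
    return 0 <= delay <= limit


def phase_3_passes(calls: list[EmotionalCall]) -> bool:
    """Phase 3 passes only if all 5 scenarios were run and all handed over in time."""
    return len(calls) == 5 and all(handover_in_time(c) for c in calls)
```

Unlike Phase 1, there is no tolerance here: a single late or missing handover fails the phase.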

The scoring grid

  • ≥ 18/20 cases passed (90%) → production-ready.
  • 15-17/20 → 1 prompt iteration, re-test failed cases.
  • < 15/20 → rework the architecture (flow, prompts, FAQ base) before retrying.
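The grid above translates directly into a few comparisons. Here is a minimal sketch; the function name `scoring_verdict` is an illustration, and the thresholds are the ones from the grid.

```python
TOTAL_CASES = 20  # 10 typical calls + 5 edge cases + 5 emotional scenarios


def scoring_verdict(passed: int) -> str:
    """Map the number of passed cases (out of 20) to the grid's verdict."""
    if not 0 <= passed <= TOTAL_CASES:
        raise ValueError(f"passed must be between 0 and {TOTAL_CASES}")
    if passed >= 18:
        return "production-ready"
    if passed >= 15:
        return "one prompt iteration, then re-test failed cases"
    return "rework the architecture before retrying"
```

For instance, `scoring_verdict(17)` lands in the middle band, and `scoring_verdict(14)` calls for reworking the flow, prompts, and FAQ base.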

The weekly audit after go-live

Once live, re-listen to 10 calls per week during the first month. Quality drifts without supervision. Full call transcription, accessible in the dashboard, is included in the Starter plan, and the first month is free, so you can run these 30 minutes of tests risk-free.
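If you want the weekly 10 calls to be picked without bias, a random draw is one option. The article only says to re-listen to 10 calls, so the random selection is an assumption, and so is the idea that you can export the week's call IDs as a plain list.

```python
import random
from typing import Optional, Sequence


def weekly_audit_sample(call_ids: Sequence[str], k: int = 10,
                        seed: Optional[int] = None) -> list[str]:
    """Pick up to `k` distinct calls to re-listen to this week.

    Hypothetical helper: `call_ids` is whatever identifier list you can export.
    Pass a `seed` to make the draw reproducible.
    """
    ids = list(call_ids)
    if len(ids) <= k:
        return ids  # quiet week: audit everything
    return random.Random(seed).sample(ids, k)
```

A random draw avoids the habit of only reviewing the calls that were already flagged, which tends to hide slow drift in the ordinary ones.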