Stress-testing your AI voice agent at 5× normal volume: the protocol before the seasonal spike
Black Friday, the day after a holiday, a TV campaign: your AI voice agent faces 5× its usual volume. If it breaks, you lose the month. Here's the stress-test protocol to run 48h before the spike.
- AI voice agent
- test
- load
- peak
- volume
An AI voice agent that holds at 20 calls/hour can collapse at 80, not because it's bad, but because no one tested it at that load. Black Friday, the day after a holiday, the aftermath of a TV campaign, seasonal sales: your volume can jump to 3-5× normal without warning. Here's the stress-test protocol that saves you from disaster.
4 bottlenecks to know
- LLM RPS (Requests Per Second): capped by your provider quota. At 50 calls/min, the agent does ~30 RPS against the LLM.
- Concurrent STT streams: each active call consumes 1 stream. If the cap is 20, the 21st call doesn't get transcribed.
- Simultaneous SIP trunks: capacity bought from your SIP carrier. Often 10-50 by default, and it must be raised before the peak.
- Database I/O: each turn writes to the DB. 100 calls/min × 6 turns = 600 writes/min. An untuned DB leads to a latency cascade.
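The arithmetic behind these bottlenecks can be sketched in a few lines. This is a rough estimate, not a sizing tool: the function name `estimate_load` and the 120-second average call duration are assumptions made for illustration, and the concurrency figure uses Little's law (arrival rate × average duration).

```python
def estimate_load(calls_per_min: float, turns_per_call: int, avg_call_seconds: float) -> dict:
    """Rough steady-state load for a given call arrival rate."""
    # One database write per conversational turn, as in the list above.
    db_writes_per_min = calls_per_min * turns_per_call
    # Little's law: calls in progress = arrival rate x average call duration.
    concurrent_calls = calls_per_min * avg_call_seconds / 60.0
    return {
        "db_writes_per_min": db_writes_per_min,
        # Each call in progress holds one STT stream and one SIP channel.
        "concurrent_calls": concurrent_calls,
    }


# 100 calls/min and 6 turns come from the article; 120 s per call is an assumption.
load = estimate_load(calls_per_min=100, turns_per_call=6, avg_call_seconds=120)
print(load)
```

Under these assumptions, 100 calls/min means 600 writes/min and about 200 calls in progress at once, far above a 20-stream STT cap or a 10-50 trunk default.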
3-step stress-test protocol
(1) Generate 5× your monthly peak volume synthetically via Loadero, k6 or Hammer.io — 30 minutes of simulated calls. (2) Watch the 4 bottlenecks in real time on the provider dashboard. (3) Note the threshold where quality degrades: latency > 1s, dropped transcription, timed-out calls. That threshold is your real capacity — not the marketing number.
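Step (3) can be made mechanical once the test results are exported. The sketch below assumes you have recorded one sample per load level; the `LoadSample` fields and the sample numbers are hypothetical, and only the degradation criteria (latency over 1s, dropped transcription, timed-out calls) come from the protocol above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoadSample:
    calls_per_min: int
    latency_ms: float
    dropped_transcriptions: int
    timed_out_calls: int


def is_degraded(s: LoadSample) -> bool:
    # Degradation criteria from the protocol: latency > 1 s,
    # dropped transcription, or timed-out calls.
    return s.latency_ms > 1000 or s.dropped_transcriptions > 0 or s.timed_out_calls > 0


def real_capacity(samples: List[LoadSample]) -> Optional[int]:
    """Highest load level reached before the first degraded sample."""
    capacity = None
    for s in sorted(samples, key=lambda x: x.calls_per_min):
        if is_degraded(s):
            break
        capacity = s.calls_per_min
    return capacity


# Hypothetical measurements, for illustration only.
samples = [
    LoadSample(20, 450, 0, 0),
    LoadSample(40, 610, 0, 0),
    LoadSample(60, 790, 0, 0),
    LoadSample(80, 1350, 3, 1),
    LoadSample(100, 2900, 11, 7),
]
print(real_capacity(samples))
```

With these invented numbers the function reports 60 calls/min: the last level that stayed clean before quality broke at 80.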
Acceptable thresholds under load
- Agent latency < 800ms at 100% of tested capacity.
- STT failure rate < 2% at peak.
- SIP drop rate < 1% at peak.
- Dashboard response time < 3s even under load.
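These four limits are easy to turn into an automated pass/fail check at the end of a test run. A minimal sketch, assuming your tooling can export the four metrics under the (hypothetical) key names used here:

```python
from typing import Dict, List

# Limits from the list above; every one is a strict "less than".
THRESHOLDS: Dict[str, float] = {
    "agent_latency_ms": 800,
    "stt_failure_rate": 0.02,
    "sip_drop_rate": 0.01,
    "dashboard_response_s": 3.0,
}


def failed_thresholds(measured: Dict[str, float]) -> List[str]:
    """Names of the metrics that reached or exceeded their limit."""
    return [name for name, limit in THRESHOLDS.items() if measured[name] >= limit]


# Hypothetical measurements taken at peak load.
measured = {
    "agent_latency_ms": 720,
    "stt_failure_rate": 0.035,
    "sip_drop_rate": 0.004,
    "dashboard_response_s": 1.8,
}
print(failed_thresholds(measured))
```

An empty list means the run passed; here the invented STT failure rate of 3.5% is the only metric flagged.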
Graceful-degradation strategy past the threshold
When you exceed capacity, never stop taking calls; always degrade gracefully: (1) route excess calls to a hold message + scheduled callback, (2) prioritize VIPs (regulars, known numbers) on the AI agent, (3) transfer the rest to a human or to voicemail. The caller never lands in silence.
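One way to express this routing as code is shown below. It is a sketch of one possible policy, not a prescribed implementation: the function name, its parameters, and the idea of reserving a few AI-agent slots for VIPs are all assumptions introduced for illustration.

```python
def route_call(
    active_ai_calls: int,
    ai_capacity: int,
    is_vip: bool,
    vip_reserved_slots: int,
    callback_queue_full: bool,
    human_available: bool,
) -> str:
    """Pick a destination for an incoming call; every path returns one."""
    # Assumption: non-VIP callers stop getting the AI agent a few slots early,
    # so that regulars and known numbers still get through at the peak.
    limit = ai_capacity if is_vip else ai_capacity - vip_reserved_slots
    if active_ai_calls < limit:
        return "ai_agent"
    # Excess calls: hold message plus scheduled callback.
    if not callback_queue_full:
        return "hold_message_and_callback"
    # The rest: a human if one is free, otherwise voicemail.
    return "human" if human_available else "voicemail"


print(route_call(47, 50, is_vip=False, vip_reserved_slots=5,
                 callback_queue_full=False, human_available=False))
print(route_call(47, 50, is_vip=True, vip_reserved_slots=5,
                 callback_queue_full=False, human_available=False))
```

The point of the design is that there is no branch without a destination: whatever the load, the function returns the AI agent, a callback, a human, or voicemail.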
The classic mistake: averaged testing
Many teams test 200 calls spread over 1 hour. Useless. The real test: 200 calls in the same minute. Bursts break agents, not averages. Concentrating the test in the first 90 seconds simulates the post-TV-ad spike, the viral tweet, or the power cut that resolves all at once.
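The difference between the two tests shows up clearly in peak concurrency. The sketch below compares 200 evenly spaced calls over one hour with the same 200 calls packed into one minute; the 120-second call duration and the helper names are assumptions chosen for illustration.

```python
from typing import List


def evenly_spaced(n_calls: int, window_seconds: float) -> List[float]:
    """Start times for n_calls spread evenly across the window."""
    return [i * window_seconds / n_calls for i in range(n_calls)]


def peak_concurrency(start_times: List[float], call_seconds: float) -> int:
    """Maximum number of calls in progress at the same moment."""
    events = []
    for t in start_times:
        events.append((t, 1))                  # call starts
        events.append((t + call_seconds, -1))  # call ends
    # At equal timestamps, process ends (-1) before starts (+1).
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


CALL_SECONDS = 120  # assumed average call duration

averaged = evenly_spaced(200, 3600)  # 200 calls over 1 hour
burst = evenly_spaced(200, 60)       # 200 calls in the same minute

print(peak_concurrency(averaged, CALL_SECONDS))  # 7
print(peak_concurrency(burst, CALL_SECONDS))     # 200
```

Same call count, but the averaged test never puts more than 7 calls in flight, while the burst puts all 200 on the line at once. Only the second one exercises STT stream and SIP trunk limits.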
The monthly checkpoint
Redo the stress test once a month if you have seasonal volume, once a quarter otherwise. Cost: $50-200 for the simulation. Benefit: you never discover a collapse during a real peak. The first month of VocazAI is free, so you can run your first stress test risk-free.