99.9% uptime on your AI voice agent: what it really means and how to verify it

'99.9% SLA' is on every vendor's website. But 99.9% still allows about 8 h 46 min of downtime per year. Here's a checklist to tell real resilience from marketing, and what graceful degradation should look like when the service fails.

  • AI voice agent
  • resilience
  • uptime
  • sla

Every vendor displays '99.9% uptime'. But 99.9% still allows about 8 h 46 min of downtime per year, and if those hours land on a Monday between 9am and 5pm, you lose 80% of the day's pickups. Here's what that number hides, how to verify real resilience, and the graceful degradation that separates a good AI voice agent from a fragile one.

Allowed-downtime math#

  • 99.0% SLA = 87.6 h/year = 3.65 days/year of allowed downtime.
  • 99.5% SLA = 43.8 h/year = 1.8 days/year.
  • 99.9% SLA = 8.76 h/year = one bad Monday.
  • 99.95% SLA = 4.38 h/year = the bar for critical services.
  • 99.99% SLA = 52 min/year = hospital or banking standard.
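
The figures in this list can be reproduced with a few lines of arithmetic. The sketch below assumes a 365-day year (8,760 hours); the function names are illustrative and not taken from any vendor tooling.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 h; assumes a non-leap year


def allowed_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime that an SLA percentage permits over the given period."""
    return period_hours * (100.0 - sla_percent) / 100.0


def format_duration(hours: float) -> str:
    """Render a duration in hours as 'Xh YYmin', rounded to the nearest minute."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60}h {total_minutes % 60:02d}min"


for sla in (99.0, 99.5, 99.9, 99.95, 99.99):
    print(f"{sla}% -> {format_duration(allowed_downtime_hours(sla))} per year")
```

The same function works for shorter periods: `allowed_downtime_hours(99.9, period_hours=30 * 24)` gives about 0.72 h, or roughly 43 minutes in a 30-day month. That is a useful number when the contract measures uptime monthly.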

4 technical questions to ask#

  • How many active cloud regions in parallel? (1 = single point of failure, 2+ = real multi-region.)
  • Are LLM, STT and TTS on different vendors? (All on OpenAI = OpenAI outage = full outage.)
  • What's the SIP failover strategy? (DNS round-robin = slow, anycast = fast, floating IP = immediate.)
  • Is the phone number rerouted with a SIP redirect (a 3xx response) on failure? (Done well, the caller reaches a human in < 200 ms.)
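
The vendor-diversity question can be turned into a quick check once the vendor has told you who runs each component. This is a minimal sketch: the `stack` mapping and the vendor names are hypothetical placeholders, and the check only flags concentration. It does not prove that a fallback exists.

```python
from collections import defaultdict

# Hypothetical stack description (component -> vendor); fill in with the vendor's answers.
stack = {
    "llm": "vendor_a",
    "stt": "vendor_a",
    "tts": "vendor_b",
    "sip": "vendor_c",
}


def shared_failure_domains(stack: dict[str, str],
                           critical: tuple[str, ...] = ("llm", "stt", "tts")) -> dict[str, list[str]]:
    """Return vendors whose outage would take down more than one critical component."""
    by_vendor: dict[str, list[str]] = defaultdict(list)
    for component, vendor in stack.items():
        if component in critical:
            by_vendor[vendor].append(component)
    return {vendor: sorted(parts) for vendor, parts in by_vendor.items() if len(parts) > 1}


for vendor, parts in shared_failure_domains(stack).items():
    print(f"{vendor} outage would take down: {', '.join(parts)}")
```

An empty result means LLM, STT and TTS each sit with a different vendor. A stack where all three map to the same vendor is the "one outage = full outage" case from the list above.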

Graceful degradation: what happens when it fails?#

Good scenario: the call automatically routes to (1) a configured human number, then (2) a prerecorded voice message explaining the incident, with (3) an automatic SMS sent to the customer once the service is back. Bad scenario: radio silence. The call rings 30 times and drops, and the customer gives up for good. Ask to see the incident runbook before signing.
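
The good scenario can be sketched as a small routing function. This is an illustration rather than any vendor's implementation. It assumes the human number is tried first, the recorded message is used only when no human is reachable, and every caller affected by the outage is queued for a follow-up SMS. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CallOutcome:
    handled_by: str        # "agent", "human", or "recorded_message"
    sms_on_recovery: bool  # True if the caller was queued for a follow-up SMS


def route_call(caller: str, agent_healthy: bool, human_reachable: bool,
               sms_queue: list[str]) -> CallOutcome:
    """Pick a destination for an incoming call, degrading step by step."""
    if agent_healthy:
        return CallOutcome("agent", False)
    # Agent is down: remember the caller so an SMS can go out once service is back.
    sms_queue.append(caller)
    if human_reachable:
        return CallOutcome("human", True)
    # Last resort: a prerecorded message that explains the incident.
    return CallOutcome("recorded_message", True)


pending_sms: list[str] = []
print(route_call("+15550100", agent_healthy=False, human_reachable=True, sms_queue=pending_sms))
```

The point of the sketch is that no branch ends in silence: whatever fails, the caller gets a human, a message, or at least a follow-up.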

Verifying real uptime, not marketing#

Ask the vendor for the URL of their public status page (something like status.vendor.com, often hosted on a service such as Statuspage). Check the last 90 days of history. No public status page suggests that incidents aren't disclosed: avoid. A public page that lists incidents and how they were resolved is the sign of a serious vendor.
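
Once you have the 90-day incident history, you can compare measured uptime with the advertised SLA. The sketch below assumes you copy incident start and end times by hand from the status page and that incidents do not overlap. The dates are invented for illustration.

```python
from datetime import datetime, timedelta


def measured_uptime_percent(incidents, window_start: datetime, window_end: datetime) -> float:
    """Uptime over a window, given (started, resolved) pairs of non-overlapping incidents."""
    window_seconds = (window_end - window_start).total_seconds()
    downtime_seconds = 0.0
    for started, resolved in incidents:
        # Count only the part of the incident that falls inside the window.
        overlap_start = max(started, window_start)
        overlap_end = min(resolved, window_end)
        if overlap_end > overlap_start:
            downtime_seconds += (overlap_end - overlap_start).total_seconds()
    return 100.0 * (1.0 - downtime_seconds / window_seconds)


# Invented example data, not taken from any real status page.
window_end = datetime(2025, 4, 1)
window_start = window_end - timedelta(days=90)
incidents = [
    (datetime(2025, 2, 3, 9, 0), datetime(2025, 2, 3, 10, 30)),     # 1 h 30 min
    (datetime(2025, 3, 17, 14, 0), datetime(2025, 3, 17, 14, 45)),  # 45 min
]
uptime = measured_uptime_percent(incidents, window_start, window_end)
print(f"Measured uptime over 90 days: {uptime:.3f}%")
```

Over 90 days (2,160 hours), a 99.9% SLA allows 2.16 hours of downtime. The two invented incidents total 2 h 15 min, so this example vendor would fall just short of its claim.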

Contractual compensation#

  • Monthly uptime below 99.5%: service credit of 10% of the month's invoice.
  • Monthly uptime below 99.0%: service credit of 50% of the month's invoice.
  • Monthly uptime below 95%: right to cancel immediately, with no penalty.
  • If nothing is written, the vendor has no obligation. Red flag at signing.
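
The tiers above translate directly into a small lookup. This is a sketch of the article's example thresholds, not contract language. It assumes the 50% credit still applies below 95%, in addition to the right to cancel. A real contract should state this explicitly.

```python
def sla_remedy(monthly_uptime_percent: float, invoice_amount: float) -> tuple[float, bool]:
    """Return (service_credit, may_cancel_without_penalty) for one billing month."""
    if monthly_uptime_percent < 99.0:
        credit_rate_percent = 50
    elif monthly_uptime_percent < 99.5:
        credit_rate_percent = 10
    else:
        credit_rate_percent = 0
    # Assumption: the cancellation right is added on top of the 50% credit.
    may_cancel = monthly_uptime_percent < 95.0
    return invoice_amount * credit_rate_percent / 100, may_cancel


for uptime in (99.7, 99.2, 98.0, 94.0):
    credit, may_cancel = sla_remedy(uptime, invoice_amount=1000.0)
    print(f"{uptime}% uptime -> credit {credit:.2f}, cancel without penalty: {may_cancel}")
```

Running your own numbers through a function like this before signing makes it obvious how much the vendor risks for a bad month.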

The planned-incident test#

Ask the vendor to simulate a 5-minute outage on your call flow. If degradation is graceful (human redirect, SMS, visible status), you have a good partner. If everything breaks and no one knows what to do, you know what will happen on a real incident day. VocazAI's first month is free, so you can run that test risk-free.
