6 min read

One number, three languages: how it actually works under the hood

One number that picks up in French, Arabic or English depending on the caller — not an IVR menu. Here's how it really works and why it's what your customers expect.

"For French, press 1. For English, press 2." Everyone hates that. A trilingual AI voice agent works differently: it picks up, listens to the first sentence, detects the language, and sticks with it. Here's the actual mechanic.

The neutral pickup

The agent opens with a phrase that works in all three languages, often a short trilingual line such as 'VocazAI, bonjour, hello, السلام عليكم.' Three seconds, no more. No menu, no key to press.

Real-time detection

  • The caller's first sentence is captured in streaming mode by the trilingual speech-to-text (STT) engine.
  • A language detection model assigns a confidence score to each language.
  • Above 80 %: the agent locks onto the detected language.
  • Between 60 % and 80 %: the agent asks a neutral question to confirm.
  • Below 60 %: the call is transferred to a human, or the agent falls back to Modern Standard Arabic (MSA) or French by default.
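
The thresholds above can be sketched as a small decision function. This is a minimal illustration, not VocazAI's actual code: the names `Action` and `decide` are hypothetical, the scores are assumed to arrive as a dictionary of values between 0 and 1, and treating exactly 80 % as "confirm" is an assumption, since the boundaries aren't spelled out.

```python
from enum import Enum
from typing import Dict, Optional, Tuple


class Action(Enum):
    LOCK = "lock"          # commit to the detected language
    CONFIRM = "confirm"    # ask a neutral question first
    FALLBACK = "fallback"  # human transfer or default language


# Thresholds taken from the list above; boundary handling is an assumption.
LOCK_THRESHOLD = 0.80
CONFIRM_THRESHOLD = 0.60


def decide(scores: Dict[str, float]) -> Tuple[Action, Optional[str]]:
    """Pick the next step from per-language confidence scores (0.0 to 1.0)."""
    language, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence > LOCK_THRESHOLD:
        return Action.LOCK, language
    if confidence >= CONFIRM_THRESHOLD:
        return Action.CONFIRM, language
    # Too uncertain to name a language; the caller of decide() chooses
    # between a human transfer and the default language.
    return Action.FALLBACK, None


if __name__ == "__main__":
    print(decide({"fr": 0.91, "ar": 0.06, "en": 0.03}))
    print(decide({"fr": 0.25, "ar": 0.70, "en": 0.05}))
    print(decide({"fr": 0.40, "ar": 0.35, "en": 0.25}))
```

Keeping the thresholds as named constants makes them easy to tune per deployment without touching the decision logic.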

Soft lock

Once the language is detected, the agent stays in it — but remains sensitive to a genuine switch. A loanword ('okay', 'شكرا') doesn't trigger a switch. Two full sentences in the new language do. That's what makes it feel natural to bilingual callers.
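
Here is one way the soft lock could be modeled, as a rough sketch under stated assumptions. The `SoftLock` class is hypothetical, each utterance is assumed to arrive already tagged with a detected language, a "full sentence" is approximated as four words or more, and the two sentences are assumed to be consecutive. A real system would likely use a better signal than word count.

```python
from dataclasses import dataclass
from typing import Optional

MIN_WORDS_FULL_SENTENCE = 4  # assumption: shorter utterances count as loanwords
SENTENCES_TO_SWITCH = 2      # from the article: two full sentences trigger a switch


@dataclass
class SoftLock:
    locked: str                      # language the agent currently speaks
    candidate: Optional[str] = None  # language the caller may be switching to
    streak: int = 0                  # full sentences seen in the candidate

    def observe(self, language: str, text: str) -> str:
        """Update the lock with one caller utterance; return the active language."""
        is_full_sentence = len(text.split()) >= MIN_WORDS_FULL_SENTENCE
        if language == self.locked:
            # Caller is back in the locked language: drop any pending switch.
            self.candidate, self.streak = None, 0
        elif is_full_sentence:
            if language == self.candidate:
                self.streak += 1
            else:
                self.candidate, self.streak = language, 1
            if self.streak >= SENTENCES_TO_SWITCH:
                self.locked = language
                self.candidate, self.streak = None, 0
        # Short utterances in another language (loanwords) are ignored.
        return self.locked


if __name__ == "__main__":
    lock = SoftLock(locked="fr")
    print(lock.observe("en", "okay"))
    print(lock.observe("en", "I would like to book an appointment"))
    print(lock.observe("en", "Is tomorrow morning still available please"))
```

In this run the agent stays in French after "okay" and after the first English sentence, then switches to English on the second one.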

Number integration

On the telecom side it's a single DID (direct inward dialing) number. On the software side, the agent routes each call by detected language to the right system prompt: a slightly different script per language (adapted politeness, neutralized cultural references). You manage everything from one dashboard.
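
To make the routing concrete, here is a hedged sketch of one number feeding three prompts. Everything in it is illustrative: the prompt texts, the `build_session` helper, the phone number, and the choice of French as the default are assumptions, not VocazAI's real configuration.

```python
from typing import Dict

# Hypothetical per-language system prompts; real scripts would be much longer.
SYSTEM_PROMPTS: Dict[str, str] = {
    "fr": "Vous êtes l'assistant vocal. Vouvoyez l'appelant.",
    "ar": "أنت المساعد الصوتي. خاطب المتصل باحترام.",
    "en": "You are the voice assistant. Be polite and concise.",
}

DEFAULT_LANGUAGE = "fr"  # assumption: French as the fallback language


def build_session(did: str, language: str) -> Dict[str, str]:
    """Attach the right system prompt to a call arriving on the shared number."""
    # Unknown or unsupported languages fall back to the default prompt.
    active = language if language in SYSTEM_PROMPTS else DEFAULT_LANGUAGE
    return {
        "did": did,
        "language": active,
        "system_prompt": SYSTEM_PROMPTS[active],
    }


if __name__ == "__main__":
    session = build_session("+15550100", "ar")  # made-up number
    print(session["language"], "->", session["system_prompt"])
```

The number never changes between calls; only the prompt attached to the session does.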

Why it beats an IVR

An IVR forces the caller to classify their request before speaking. A trilingual agent lets them speak the way they'd speak to a human. Net effect: 40 % fewer drop-offs at pickup and a 25 % higher completion rate. The first month of VocazAI is free, so you can test it on your own calls.