Car, café, construction site: how your AI voice agent handles noisy calls without asking 3 times
40% of mobile calls have significant background noise. If your AI voice agent asks for a repeat every time, you lose the call. Here's the technical stack that decodes through the chaos.
- AI voice agent
- noise
- environment
- noisy
Four mobile calls in ten happen in a noisy environment: a moving car, a café, a train station, a supermarket, a construction site. The caller just wants to book a slot, but if the AI voice agent asks them to repeat three times, they hang up. The fix isn't magic: it's a 4-layer stack that decodes speech through the chaos. Here it is.
Layer 1: input-side noise suppression
Noise-suppression algorithms such as RNNoise, Krisp, or WebRTC NS are applied to the inbound audio before transcription starts. They reduce ambient noise by up to 20 dB while preserving the human voice. Without this layer, models like Whisper or Voxtral drop to 60-65% accuracy in noisy environments. With it, they reach 85-90%.
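To make the 20 dB figure concrete, here is a minimal sketch that converts a decibel change into linear ratios. The function names are illustrative and do not come from any noise-suppression library. The signal-to-noise arithmetic assumes the suppressor leaves the speech level untouched, which real suppressors only approximate.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Linear amplitude ratio for a level change in decibels."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Linear power ratio for a level change in decibels."""
    return 10 ** (db / 10)


def snr_after_suppression(snr_before_db: float, suppression_db: float) -> float:
    """Idealized SNR after suppression (assumes the speech level is unchanged)."""
    return snr_before_db + suppression_db


if __name__ == "__main__":
    # 20 dB of attenuation divides noise amplitude by 10 and noise power by 100.
    print(db_to_amplitude_ratio(-20))
    print(db_to_power_ratio(-20))
    # Hypothetical caller in a car at 5 dB SNR, with 20 dB of suppression.
    print(snr_after_suppression(5, 20))
```

In other words, "up to 20 dB" means the residual noise can be as low as one tenth of its original amplitude in the best case.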
Layer 2: noise-trained models
Pick a speech-to-text model specifically trained on mobile and noisy audio, such as Voxtral, Whisper Large v3, or Deepgram Nova. They have seen hundreds of hours of café and car audio in training. Compared with a model trained only on clean audio, expect +10-15 accuracy points in heavy background chatter.
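One way to compare candidate models on your own noisy recordings is word error rate (WER), with word accuracy taken as 1 minus WER. The sketch below is a plain word-level edit distance. It is an assumption about how "accuracy" could be measured, not the method behind the figures above, and the sample sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    truth = "i would like to book a slot tomorrow at ten"
    transcript = "i would like to book a lot tomorrow at ten"
    wer = word_error_rate(truth, transcript)
    print(f"WER: {wer:.2f}, word accuracy: {1 - wer:.2f}")
```

Run the same set of human-transcribed noisy calls through each model and compare the averages. A small set of real calls from your own callers is usually more informative than a public benchmark.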
Layer 3: explicit confirmation of critical fields
In noisy environments, the agent never guesses numbers (phone number, file number, postal code). Scripted line: 'Just to be sure, is your number 06 12 34 56 78?' Explicit confirmation secures the booking even if the transcription contained one or two errors. Cost: 5 extra seconds. Benefit: zero bookings created with a wrong number.
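The confirmation step can be sketched as a read-back prompt followed by a yes check. Everything here is hypothetical: the function names, the pair grouping (which suits French-style 10-digit numbers), and the very small list of accepted "yes" replies, which a production agent would replace with its own intent detection.

```python
import re

# Hypothetical set of replies treated as a confirmation.
YES_REPLIES = {"yes", "yeah", "yep", "correct", "that's right"}


def normalize_digits(raw: str) -> str:
    """Keep only the digits from a transcribed number."""
    return re.sub(r"\D", "", raw)


def group_in_pairs(digits: str) -> str:
    """Group digits two by two so they are easy to read back aloud."""
    return " ".join(digits[i:i + 2] for i in range(0, len(digits), 2))


def confirmation_prompt(raw_number: str) -> str:
    """Build the scripted read-back line for a transcribed phone number."""
    digits = normalize_digits(raw_number)
    return f"Just to be sure, is your number {group_in_pairs(digits)}?"


def is_confirmed(reply: str) -> bool:
    """Return True only for an explicit yes; anything else means ask again."""
    return reply.strip().lower().rstrip(".!") in YES_REPLIES


if __name__ == "__main__":
    print(confirmation_prompt("06.12.34.56.78"))
    print(is_confirmed("Yes."))
```

The design choice is that the booking is created only when `is_confirmed` returns True. Any other reply, including silence or an unclear answer, counts as not confirmed.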
Layer 4: SMS fallback after 2 repeat requests
Prompt rule: if the agent has already asked the caller to repeat twice in the same conversation, it falls back to SMS: 'I can't hear you well. I'll text you a link to confirm your booking.' The caller continues on a silent channel. Recovery: about 80% of these calls are saved instead of ending in a hang-up.
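The rule above amounts to a small counter per conversation. Here is a minimal sketch, assuming the agent can tell whether it understood the last utterance. `CallState`, `next_action`, and the action names are illustrative, not part of any particular voice-agent framework.

```python
from dataclasses import dataclass

MAX_REPEAT_REQUESTS = 2  # after two repeat requests, the next failure goes to SMS
SMS_FALLBACK_LINE = "I can't hear you well. I'll text you a link to confirm your booking."


@dataclass
class CallState:
    """Per-conversation state (hypothetical structure)."""
    repeat_requests: int = 0


def next_action(state: CallState, understood: bool) -> str:
    """Decide what the agent does after each caller utterance."""
    if understood:
        return "continue"
    if state.repeat_requests >= MAX_REPEAT_REQUESTS:
        return "sms_fallback"
    state.repeat_requests += 1
    return "ask_repeat"


if __name__ == "__main__":
    state = CallState()
    for understood in (False, False, False):
        action = next_action(state, understood)
        print(action)
        if action == "sms_fallback":
            print(SMS_FALLBACK_LINE)
```

Keeping the counter in code rather than only in the prompt makes the limit deterministic: the fallback triggers even if the language model loses track of how many times it has asked.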
Pitfalls to avoid
- TTS output too loud: it saturates the caller's mic and feeds their own noise back. Set it about 3 dB below your standard level.
- Complex polite phrasing in heavy noise: use short sentences (< 10 words) that stay understandable even when chopped.
- Infinite 'I didn't understand' loop: always set a limit (3 attempts max) with a clean exit.
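The sentence-length rule is easy to check automatically before a script goes live. The sketch below flags any sentence of 10 words or more. The splitting on `.`, `!`, and `?` is a deliberate simplification, and the function name is hypothetical.

```python
import re

MAX_WORDS = 9  # "fewer than 10 words" per sentence


def long_sentences(script: str, max_words: int = MAX_WORDS) -> list[str]:
    """Return the sentences in an agent script that exceed the word limit."""
    parts = re.split(r"(?<=[.!?])\s+", script)
    sentences = [p.strip() for p in parts if p.strip()]
    return [s for s in sentences if len(s.split()) > max_words]


if __name__ == "__main__":
    ok = "I can't hear you well. I'll text you a link to confirm your booking."
    too_long = "Would you be so kind as to repeat your phone number once more please?"
    print(long_sentences(ok))        # nothing flagged
    print(long_sentences(too_long))  # the whole sentence is flagged
```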
The weekend test
Call your agent from a car on the highway (windows down), from a café at 3pm, and from a busy street. Three environments, three calls. Measure the completion rate without handoff, the number of 'please repeat' requests, and the booking creation rate. If you pass all three without frustration, you're ahead of 90% of the market. The first month of VocazAI is free, so you can calibrate.
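To keep score during the weekend test, the three measurements can be logged and summarized with a few lines of code. This is a sketch with a hypothetical `CallResult` record, and the sample values are invented for illustration, not measured results.

```python
from dataclasses import dataclass


@dataclass
class CallResult:
    """One test call (hypothetical record layout)."""
    environment: str
    completed_without_handoff: bool
    repeat_requests: int
    booking_created: bool


def summarize(calls: list[CallResult]) -> dict[str, float]:
    """Compute the three weekend-test metrics over a list of calls."""
    if not calls:
        raise ValueError("at least one call is required")
    n = len(calls)
    return {
        "completion_rate": sum(c.completed_without_handoff for c in calls) / n,
        "total_repeat_requests": sum(c.repeat_requests for c in calls),
        "booking_rate": sum(c.booking_created for c in calls) / n,
    }


if __name__ == "__main__":
    # Invented example values, one call per environment.
    calls = [
        CallResult("highway, windows down", True, 1, True),
        CallResult("café at 3pm", True, 0, True),
        CallResult("busy street", False, 2, False),
    ]
    for name, value in summarize(calls).items():
        print(f"{name}: {value:.2f}")
```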