The voice your assistant uses shapes how callers experience the conversation. A well-matched voice — the right accent, pace, and emotional tone — makes the assistant feel natural and builds caller confidence. The Voice tab gives you control over the TTS provider, which voice model and voice ID to use, how quickly the assistant speaks, and how the assistant handles background noise on the line.

Choose a TTS provider

Talkturo supports the following text-to-speech providers:
Provider | Strengths
Cartesia | Low latency, emotional tone control, broad voice library
ElevenLabs | High naturalness, large library of cloned and studio voices
OpenAI | Fast, consistent, good for straightforward business calls
DashScope (Qwen) | Strong multilingual support, optimized for East Asian languages
When you change provider, the Voice model and Voice ID fields reset so you can pick from that provider’s library.
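
The reset behavior can be pictured with a small sketch. This is not Talkturo's implementation or API. The `VoiceSelection` class, the `change_provider` helper, and the lowercase provider keys are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical provider keys; the identifiers Talkturo uses internally may differ.
PROVIDERS = ("cartesia", "elevenlabs", "openai", "dashscope")


@dataclass(frozen=True)
class VoiceSelection:
    """Illustrative model of the provider / voice model / voice ID fields."""
    provider: str
    voice_model: Optional[str] = None
    voice_id: Optional[str] = None


def change_provider(current: VoiceSelection, new_provider: str) -> VoiceSelection:
    """Return a new selection; switching provider clears model and voice ID."""
    if new_provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {new_provider!r}")
    if new_provider == current.provider:
        return current  # nothing changed, keep the existing choices
    # Voice models and IDs belong to one provider's library, so they are cleared.
    return VoiceSelection(provider=new_provider)
```

In this sketch, switching from one provider to another always yields empty model and voice fields, which mirrors why you need to pick a voice again after changing provider.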

Select a voice model and voice

After choosing a provider, select a Voice model and then a Voice. The voice selector lets you filter by language and browse available voices. Each provider offers a different library — Cartesia and ElevenLabs have the largest catalogs with options across accents, genders, and styles.
Use the voice preview feature to listen to a sample before committing. A voice that sounds good in isolation may feel too slow or too energetic on an actual call, so test it with a real conversation if possible.

Control emotion (Cartesia only)

When you select Cartesia as your provider, a Voice emotion dropdown appears. This maps to Cartesia’s built-in emotion parameter and adjusts the overall emotional quality of the voice.
Emotion | When to use
Default | Cartesia’s standard neutral delivery
Neutral | Calm, professional; good for support and healthcare
Excited | Upbeat, enthusiastic; good for outbound sales
Sad | Softer, more empathetic tone
Angry | Assertive delivery; use sparingly
Emotion control is only available when Cartesia is selected as the voice provider. The dropdown does not appear for other providers.
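
As a rough illustration of the Cartesia-only rule, the sketch below resolves which emotion would apply for a given provider. The function name `resolve_emotion` and the lowercase keys are assumptions made for this example. They are not Talkturo or Cartesia identifiers.

```python
from typing import Optional

# Hypothetical keys mirroring the dropdown entries above.
EMOTIONS = ("default", "neutral", "excited", "sad", "angry")


def resolve_emotion(provider: str, emotion: Optional[str]) -> Optional[str]:
    """Return the emotion to apply, or None if the provider has no emotion control."""
    if provider != "cartesia":
        # The dropdown is hidden for other providers, so any stored value is ignored.
        return None
    if emotion is None:
        return "default"
    if emotion not in EMOTIONS:
        raise ValueError(f"unknown emotion: {emotion!r}")
    return emotion
```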

Set speaking speed

Use the Speed slider to control how fast the assistant speaks. The range is 0.5× (slow) to 2.0× (fast). Very fast speech (above 1.5×) can reduce comprehension, especially for older callers or for callers who are less fluent in the language being spoken. A speed of 0.9–1.1× works well for most professional call scenarios.
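
The guidance above can be expressed as a simple check. This is a minimal sketch, assuming a hypothetical `check_speed` helper. The numeric bounds come from the text, while the function itself and its warning message are illustrative only.

```python
from typing import List

MIN_SPEED = 0.5       # slowest slider position
MAX_SPEED = 2.0       # fastest slider position
FAST_THRESHOLD = 1.5  # above this, comprehension may suffer


def check_speed(speed: float) -> List[str]:
    """Validate a speed multiplier and return advisory warnings, if any."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}")
    warnings: List[str] = []
    if speed > FAST_THRESHOLD:
        warnings.append("speed above 1.5x may reduce comprehension for some callers")
    return warnings
```

A value such as 1.0 passes with no warnings, 1.8 passes with one advisory, and 2.5 is rejected as out of range.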

Add background ambience

The Background ambience dropdown lets you play ambient audio behind the assistant’s voice during the call. Options include:
  • None — silent background (default)
  • Office ambience — background chatter, simulates a busy office
  • Keyboard typing / Keyboard typing 2 — light typing sounds
  • Hold music — plays while the caller waits
Use the Ambience volume slider (0–1) to set how loud the background audio is relative to the assistant’s voice. Ambience can make the assistant feel less robotic and more like a human on a call, but too much volume can make the assistant harder to hear.
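
To make the relationship between the dropdown and the volume slider concrete, here is a small sketch. The option keys and the `ambience_settings` function are hypothetical and do not reflect Talkturo's actual configuration format. Only the 0–1 volume range is taken from the text.

```python
from typing import Dict, Union

# Hypothetical keys for the dropdown options listed above.
AMBIENCE_OPTIONS = (
    "none",
    "office",
    "keyboard_typing",
    "keyboard_typing_2",
    "hold_music",
)


def ambience_settings(ambience: str, volume: float) -> Dict[str, Union[str, float]]:
    """Validate an ambience choice and a volume in the 0-1 range."""
    if ambience not in AMBIENCE_OPTIONS:
        raise ValueError(f"unknown ambience: {ambience!r}")
    if not 0.0 <= volume <= 1.0:
        raise ValueError(f"volume must be between 0 and 1, got {volume}")
    if ambience == "none":
        # With no background track selected, the volume has no effect.
        return {"ambience": "none", "volume": 0.0}
    return {"ambience": ambience, "volume": volume}
```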

Enable a thinking sound

The Thinking sound dropdown plays a subtle audio cue while the assistant is processing a response (the pause between the caller finishing their sentence and the assistant speaking). This helps callers understand that the assistant is still active rather than silent. Options include keyboard typing sounds or none.

Configure noise cancellation

The Noise cancellation dropdown controls how the assistant handles background sound it picks up from the caller’s end of the line.
Mode | What it does
Off | No noise filtering; raw audio is passed to transcription
Standard noise cancellation | Filters background noise like traffic, wind, or HVAC
Background voice cancellation | Removes other speakers in the caller’s environment
Background voice cancellation (the default, labeled “bvc”) is the recommended setting for most deployments. It prevents cross-talk from other people in the room from interfering with the conversation.
Turning noise cancellation off may improve transcription accuracy in very quiet environments, but in noisy settings it can cause the assistant to transcribe background noise as speech, leading to unexpected responses.
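
The three modes and the default can be modeled as an enumeration. In this sketch only the "bvc" label comes from the text above. The "off" and "nc" values, the `NoiseCancellation` class, and the `parse_mode` helper are assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class NoiseCancellation(Enum):
    OFF = "off"               # hypothetical label: no filtering
    STANDARD = "nc"           # hypothetical label: traffic, wind, HVAC
    BACKGROUND_VOICE = "bvc"  # label given in the text; the default mode


DEFAULT_MODE = NoiseCancellation.BACKGROUND_VOICE


def parse_mode(value: Optional[str]) -> NoiseCancellation:
    """Map a stored label to a mode, falling back to the default when unset."""
    if value is None:
        return DEFAULT_MODE
    # Enum lookup by value raises ValueError for labels that are not defined.
    return NoiseCancellation(value)
```

Falling back to background voice cancellation when no value is stored matches the recommendation to leave the default in place for most deployments.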