Voice Agent Setup
Configure how your AI agent handles phone calls, including voice selection, speech patterns, and custom pronunciations.
Access voice settings from Agent Settings → Call Settings.
Basic Settings
Voice Selection
Choose from a variety of Inworld voices for your agent. Each voice has distinct characteristics suited for different brand personalities. Use the Preview Voice button to hear how your selected voice sounds before saving.
Speaking Speed
Adjust how fast the agent speaks using the slider:
| Speed | Description |
|---|---|
| 0.5x | Slower pace, helpful for complex information |
| 1.0x | Normal conversational speed (default) |
| 1.5x | Faster pace, suitable for quick interactions |
Escalation Phone Number
Enter the phone number, including the country code (e.g., +12345678901), where calls should be transferred when the agent escalates to a human representative.
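The example number above follows the E.164 convention: a leading +, a country code that does not start with 0, and no more than 15 digits in total. As a rough sketch, a check like the one below can catch formatting mistakes before a number is entered in the field. The helper name `is_e164` is hypothetical and not part of the product, and the product's own validation may be stricter or looser.

```python
import re

# E.164 shape: "+" followed by 2 to 15 digits, the first of which is non-zero.
# This only checks the format, not whether the number is actually reachable.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if the string looks like an E.164 phone number."""
    return E164_PATTERN.fullmatch(number) is not None


print(is_e164("+12345678901"))       # True
print(is_e164("2345678901"))         # False: missing "+" and country code
print(is_e164("+1 (234) 567-8901"))  # False: spaces and punctuation
```

Stripping spaces, dashes, and parentheses before checking is a reasonable extension if numbers are copied from other systems.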
Messages
Configure what the agent says in different situations:
| Message | Purpose |
|---|---|
| Greeting Message | First thing the agent says when answering a call |
| Farewell Message | Said when ending the call |
| Unavailable Message | Played when no agents are available |
| Out of Hours Message | Played when a customer calls outside business hours |
Advanced Settings
Custom Keyterms
Add terms to improve speech recognition accuracy. This is useful for:
- Brand names
- Product codes or SKUs
- Industry-specific terminology
- Uncommon proper nouns
The agent will be biased toward recognizing these terms when customers speak them.
Max Call Duration
Set a maximum call length of up to 20 minutes; the default is 10 minutes. Calls end automatically when the limit is reached.
IVR / Screening Support
Enable this for outbound calls to detect:
- Voicemail greetings
- Call screening prompts
- Phone trees (IVR menus)
The system will navigate these before handing control back to the agent.
Pronunciation Settings
Use pronunciation mappings to correct how the agent pronounces specific words. This is essential for brand names, product names, or technical terms that text-to-speech engines might mispronounce.
Word-to-Word Mapping
Enter the word to match and the phonetic spelling the agent should use instead.
Example:
| Word to Match | Pronunciation |
|---|---|
| Acme | Ack-mee |
| GIF | Jiff |
| SQL | Sequel |
Using IPA for Precise Pronunciation
For more precise control, you can use the International Phonetic Alphabet (IPA) in the pronunciation field. IPA specifies exactly how a word should be pronounced, which removes the ambiguity of informal phonetic spellings.
Common IPA examples:
| Word | IPA Pronunciation | Description |
|---|---|---|
| Nike | ˈnaɪki | Rhymes with “spiky” |
| Adidas | ˈædɪdæs | Emphasis on first syllable |
| Porsche | ˈpɔːrʃə | Two syllables, not one |
| Giphy | ˈdʒɪfi | Soft G sound |
To use IPA:
- Find the correct IPA transcription for your word (resources like Wiktionary provide IPA for many words)
- Enter the IPA string in the “Pronunciation” field
- Use the Preview button to verify the pronunciation sounds correct
⚠️ IPA support depends on the underlying TTS engine. Test pronunciations before deploying to ensure they render correctly.
Tips for Pronunciation Mappings
- Preview before saving: Always use the play button to hear how the pronunciation sounds
- Case-insensitive matching: “Acme”, “ACME”, and “acme” will all be matched
- Whole word matching: Only complete words are replaced, not partial matches
- Keep it simple: Start with phonetic spellings before trying IPA
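To make the matching rules concrete, here is a minimal sketch of how case-insensitive, whole-word replacement could behave. It illustrates the rules described in the tips and is not the product's actual implementation: the function `apply_pronunciations` and the `MAPPINGS` dictionary are hypothetical names, and the real engine may treat punctuation, plurals, or hyphenated words differently.

```python
import re

# Hypothetical mappings, taken from the word-to-word example table.
MAPPINGS = {"Acme": "Ack-mee", "GIF": "Jiff", "SQL": "Sequel"}


def apply_pronunciations(text: str, mappings: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with their pronunciation."""
    if not mappings:
        return text
    # Normalize keys so "Acme", "ACME", and "acme" share one entry.
    lookup = {word.lower(): spoken for word, spoken in mappings.items()}
    # Try longer terms first so they win over shorter terms with the same prefix.
    alternatives = "|".join(
        re.escape(word) for word in sorted(lookup, key=len, reverse=True)
    )
    # The lookarounds require a non-word character (or the text edge) on both
    # sides, so "GIFs" or "Acmeville" are left untouched.
    pattern = re.compile(rf"(?<!\w)(?:{alternatives})(?!\w)", re.IGNORECASE)
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


print(apply_pronunciations("ACME ships acme GIFs and a gif via SQL.", MAPPINGS))
# Ack-mee ships Ack-mee GIFs and a Jiff via Sequel.
```

Note that "GIFs" is not replaced in this sketch because only complete words match. If plural or inflected forms matter for your brand terms, test them in the preview and add separate mappings where needed.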
Verify Your Setup
After configuring voice settings:
- Make a test call to hear the greeting message
- Verify the agent’s voice and speed match your brand
- Test pronunciation of key brand terms and product names
- Confirm escalation transfers work correctly
- Test out-of-hours and unavailable scenarios