AI Voice Agent Providers

Overview

Telepath supports three types of AI voice agent providers:

OpenAI Realtime - OpenAI’s latest speech-to-speech API
ElevenLabs - ElevenLabs’ Conversational AI
Custom WebSocket - Your own implementation

Each provider has different capabilities, pricing, and configuration requirements.

OpenAI Realtime

The OpenAI Realtime API enables low-latency, speech-to-speech conversations using the gpt-4o-realtime-preview model.

Prerequisites

Active OpenAI account
API key with Realtime API access
Model: gpt-4o-realtime-preview

Getting Your API Key

Go to platform.openai.com
Navigate to API Keys in your account settings
Click Create new secret key
Copy the key (you won’t be able to view it again)
Save it securely

Configuration

In Telepath:

Select OpenAI as your AI provider
Paste your API key
Model selection: gpt-4o-realtime-preview (default and recommended)
Optional: Add a system prompt to customize behavior

System Prompt

Define how your AI agent responds:

Example: "You are a professional customer support representative.
Be friendly, concise, and always prioritize resolving the customer's issue quickly."

The system prompt is sent to OpenAI at the start of each call.
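
For reference, the OpenAI Realtime API accepts session-level instructions through a `session.update` event. The sketch below builds such an event in Python. How Telepath forwards your prompt internally is not documented here, so the helper name `build_session_update` is purely illustrative, and the exact session fields may differ between API versions.

```python
import json

def build_session_update(system_prompt: str) -> str:
    """Serialize a Realtime `session.update` event carrying the system prompt."""
    event = {
        "type": "session.update",
        # `instructions` is the field the Realtime API uses for the system prompt
        "session": {"instructions": system_prompt},
    }
    return json.dumps(event)

prompt = (
    "You are a professional customer support representative. "
    "Be friendly, concise, and always prioritize resolving the customer's issue quickly."
)
payload = build_session_update(prompt)
print(payload)
```

You do not need to send this event yourself when using Telepath; it only shows roughly what a system prompt looks like on the wire.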

Pricing

OpenAI Realtime API pricing is based on:

Input tokens - Caller audio (and any text, such as the system prompt) sent to the model
Output tokens - Audio and text generated by the model
Audio duration - Longer calls consume more audio tokens

See OpenAI Pricing for current rates.

Best Practices

System Prompts: Provide clear, specific instructions for better quality interactions
Token Limits: Monitor your OpenAI usage to avoid unexpected costs
Error Handling: Implement fallback logic for API failures
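
One way to approach the fallback logic mentioned above is a small retry wrapper that gives up after a few attempts and returns a safe default. This is a minimal sketch, not part of Telepath or the OpenAI SDK; `call_with_fallback`, `flaky_provider`, and `canned_reply` are hypothetical names used only for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_fallback(primary: Callable[[], T], fallback: Callable[[], T],
                       retries: int = 2, delay_s: float = 0.0) -> T:
    """Try `primary` up to `retries` times, then return `fallback()`."""
    for attempt in range(retries):
        try:
            return primary()
        except Exception as exc:  # in production, catch narrower error types
            print(f"attempt {attempt + 1} failed: {exc}")
            time.sleep(delay_s)
    return fallback()

def flaky_provider() -> str:
    # Simulated API failure (stands in for a real provider call)
    raise ConnectionError("provider unavailable")

def canned_reply() -> str:
    return "Sorry, we are having trouble. Please hold."

result = call_with_fallback(flaky_provider, canned_reply)
print(result)
```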

OpenAI Realtime API typically has response times under 200ms, making it ideal for natural conversations.

ElevenLabs Conversational AI

ElevenLabs provides production-ready conversational AI with customizable voices.

Prerequisites

Active ElevenLabs account
API key
Conversational AI agent created in ElevenLabs
Voice ID (optional, if you want a specific voice)

Getting Your API Key

Go to elevenlabs.io
Navigate to Settings → API Keys
Create a new API key
Copy the key

Creating an Agent

In ElevenLabs, go to Agents
Click Create new agent
Configure:
- Agent name: Descriptive identifier
- System prompt: How the agent should behave
- Voice: Select from ElevenLabs’ voice library
- Language: Primary language for the agent

Configuration

In Telepath:

Select ElevenLabs as your AI provider
Paste your API key
Enter your agent ID (found in ElevenLabs)
Optionally specify a voice ID to override the agent’s default

Voice Selection

ElevenLabs offers voices in multiple languages and genders:

Professional voices for business use
Casual voices for friendly interactions
Multiple accents and tones

Use the ElevenLabs voice cloning feature to create custom voices for your brand.

Pricing

ElevenLabs charges based on:

Characters generated - Speech synthesis
API calls - Agent interactions
Voice cloning - Custom voices (additional cost)

See ElevenLabs Pricing for details.

Best Practices

Agent Tuning: Iterate on system prompts in ElevenLabs to improve quality
Voice Consistency: Use the same voice across all connections for brand consistency
Knowledge Base: Update agent knowledge regularly for accurate responses

ElevenLabs agents can be trained on custom knowledge bases, making them ideal for domain-specific applications.

Custom WebSocket

For advanced use cases, integrate your own WebSocket endpoint.

WebSocket Protocol

Your endpoint must:

Accept WebSocket connections on a public wss:// URL
Handle incoming audio streams in real-time
Return audio responses in the same format
Support the audio codec negotiation protocol

Audio Format

Audio is transmitted as:

Format: PCM (Pulse Code Modulation)
Sample Rate: 8000 Hz (8kHz) or 16000 Hz (16kHz)
Channels: Mono
Bit Depth: 16-bit
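
To size buffers for this format, it helps to know how many bytes one frame of audio occupies. The sketch below computes that from the parameters listed above. The 20 ms frame length is an assumption (a common telephony packetization interval), not something this page specifies.

```python
def pcm_frame_bytes(sample_rate_hz: int, frame_ms: int = 20,
                    bit_depth: int = 16, channels: int = 1) -> int:
    """Bytes in one frame of uncompressed PCM audio."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * (bit_depth // 8) * channels

narrowband = pcm_frame_bytes(8000)   # 160 samples * 2 bytes = 320
wideband = pcm_frame_bytes(16000)    # 320 samples * 2 bytes = 640
print(narrowband, wideband)
```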

Connection Flow

Telepath opens a WebSocket connection to your endpoint
Telepath and your endpoint negotiate the audio codec (G.711 PCMU, G.711 PCMA, or G.722)
Telepath streams inbound audio from the caller to your endpoint
Telepath receives outbound audio from your endpoint
Telepath streams the response back to the caller
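
If your AI model expects 16-bit linear PCM but the negotiated codec is G.711 PCMU (mu-law), the payload has to be expanded first. The following is a minimal pure-Python sketch of the standard G.711 mu-law expansion. Whether your endpoint actually receives PCMU bytes or already-decoded PCM is not stated on this page, so treat this as an assumption, along with the little-endian output byte order.

```python
import struct

def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude

def ulaw_to_pcm16(payload: bytes) -> bytes:
    """Decode a mu-law payload to little-endian 16-bit PCM (assumed byte order)."""
    samples = [ulaw_byte_to_pcm16(b) for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)

pcm = ulaw_to_pcm16(bytes([0xFF, 0x7F, 0x00, 0x80]))
print(len(pcm))
```

Each mu-law byte becomes two PCM bytes, so the decoded buffer is twice the size of the input.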

Implementation Example

import asyncio
import json
import ssl

import websockets

async def handle_telepath(websocket, path=None):
    """Handle a single Telepath WebSocket connection."""

    # Receive initial metadata (first message, parsed as JSON)
    config = json.loads(await websocket.recv())

    # Initialize your AI model (YourAIModel is a placeholder for your own class)
    ai_model = YourAIModel(config)

    while True:
        try:
            # Receive audio chunk
            audio_chunk = await websocket.recv()

            # Process with your AI
            response_audio = ai_model.process(audio_chunk)

            # Send response
            await websocket.send(response_audio)

        except websockets.exceptions.ConnectionClosed:
            # Caller hung up or Telepath closed the connection
            break
        except Exception as e:
            print(f"Error: {e}")
            break

async def main():
    # Load your TLS certificate and key (file paths are placeholders)
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain("cert.pem", "key.pem")

    # Run the server over TLS until the process is stopped
    async with websockets.serve(handle_telepath, "0.0.0.0", 8000, ssl=ssl_context):
        await asyncio.Future()

asyncio.run(main())

Configuration

In Telepath:

Select Custom WebSocket as your AI provider
Enter your WebSocket URL (must be wss://, not ws://)
Optionally add authentication headers (e.g., API key)
Test the connection
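
Before entering the URL, you can sanity-check it locally. The sketch below only verifies the scheme and host using the Python standard library; `is_valid_endpoint_url` and the example hostnames are hypothetical, and Telepath may apply additional checks of its own.

```python
from urllib.parse import urlparse

def is_valid_endpoint_url(url: str) -> bool:
    """Accept only wss:// URLs that include a hostname."""
    parsed = urlparse(url)
    return parsed.scheme == "wss" and bool(parsed.hostname)

print(is_valid_endpoint_url("wss://agent.example.com/telepath"))
print(is_valid_endpoint_url("ws://agent.example.com/telepath"))
```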

Error Handling

Implement proper error handling:

Timeouts: Response must arrive within 5 seconds
Codec mismatch: Validate audio format
Connection drops: Gracefully handle disconnections
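
The 5-second response deadline can be enforced on your side with `asyncio.wait_for`. This is a minimal sketch, assuming your model exposes an async `process` callable; `respond_with_timeout`, `slow_model`, and `fast_model` are illustrative names, and what you send back on a timeout (silence, a canned prompt, or nothing) is up to your implementation.

```python
import asyncio

RESPONSE_TIMEOUT_S = 5.0  # deadline stated in the error handling list above

async def respond_with_timeout(process, audio_chunk: bytes,
                               timeout_s: float = RESPONSE_TIMEOUT_S):
    """Return the model's reply, or None if it misses the deadline."""
    try:
        return await asyncio.wait_for(process(audio_chunk), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None

async def slow_model(chunk: bytes) -> bytes:
    await asyncio.sleep(0.2)  # simulated slow inference
    return chunk

async def fast_model(chunk: bytes) -> bytes:
    return chunk

# A short timeout is used here only to keep the demo quick
timed_out = asyncio.run(respond_with_timeout(slow_model, b"\x00\x00", timeout_s=0.05))
on_time = asyncio.run(respond_with_timeout(fast_model, b"\x00\x00"))
print(timed_out, on_time)
```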

Your WebSocket endpoint must be publicly accessible and use TLS encryption (wss://).

Performance Optimization

Latency: Keep processing time under 200ms for natural conversations
Buffering: Implement jitter buffers for network variations
Concurrency: Handle multiple simultaneous calls
Monitoring: Log all requests for debugging
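
A simple way to check the 200ms latency target is to time each processing call and flag the ones that exceed the budget. The wrapper below is a sketch under that assumption; `timed_process` and `echo` are hypothetical names, and in a real server you would log the measurement rather than print it.

```python
import time

LATENCY_BUDGET_MS = 200.0  # target from the list above

def timed_process(process, audio_chunk: bytes):
    """Run `process` and report elapsed milliseconds and whether it met the budget."""
    start = time.perf_counter()
    result = process(audio_chunk)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms, elapsed_ms <= LATENCY_BUDGET_MS

def echo(chunk: bytes) -> bytes:
    return chunk  # stand-in for real model inference

reply, elapsed_ms, within_budget = timed_process(echo, b"\x01\x02")
print(f"{elapsed_ms:.3f} ms, within budget: {within_budget}")
```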

Advanced Integration

Learn advanced patterns for production deployments

Comparison

Feature	OpenAI	ElevenLabs	Custom
Setup Difficulty	Easy	Easy	Hard
Cost	Pay-per-use	Pay-per-use	Infrastructure costs
Customization	Limited	Moderate	Full
Response Time	<200ms	<300ms	Depends
Reliability	High	High	Your responsibility
Language Support	50+	30+	Your choice

Switching Providers

To change your AI provider:

Create a new connection with the new provider
Test thoroughly with your carrier
Update your carrier’s origination URI to the new connection
Monitor the transition
Delete the old connection once stable

You can run multiple connections in parallel to test new providers before switching fully.

Troubleshooting

Connection fails:

Verify API key is correct
Check that the endpoint/agent exists
Ensure your account has access to the required features

Poor audio quality:

For OpenAI: Check your system prompt
For ElevenLabs: Verify agent configuration
For Custom: Ensure codec negotiation is correct

High latency:

Monitor your AI provider’s status
Check network connectivity
Review your WebSocket server’s performance

See Troubleshooting for more help.
