Overview

Telepath supports three types of AI voice agent providers:
  • OpenAI Realtime - OpenAI’s latest speech-to-speech API
  • ElevenLabs - ElevenLabs’ Conversational AI
  • Custom WebSocket - Your own implementation
Each provider has different capabilities, pricing, and configuration requirements.

OpenAI Realtime

The OpenAI Realtime API enables real-time, low-latency conversations with gpt-4o-realtime-preview.

Prerequisites

  • Active OpenAI account
  • API key with Realtime API access
  • Model: gpt-4o-realtime-preview

Getting Your API Key

  1. Go to platform.openai.com
  2. Navigate to API Keys in your account settings
  3. Click Create new secret key
  4. Copy the key (you won’t be able to view it again)
  5. Save it securely

Configuration

In Telepath:
  1. Select OpenAI as your AI provider
  2. Paste your API key
  3. Model selection: gpt-4o-realtime-preview (default and recommended)
  4. Optional: Add a system prompt to customize behavior

System Prompt

Define how your AI agent responds:
Example: "You are a professional customer support representative.
Be friendly, concise, and always prioritize resolving the customer's issue quickly."
The system prompt is sent to OpenAI at the start of each call.
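As a rough illustration of what sending the prompt at the start of a call could look like on the wire, the sketch below builds a `session.update` event, the event type the OpenAI Realtime API uses to set a session's `instructions`. How Telepath actually forwards the prompt is not documented here, so the helper name `build_session_update` is hypothetical and the payload is kept to the minimum fields.

```python
import json


def build_session_update(system_prompt: str) -> str:
    """Serialize a Realtime API session.update event carrying the prompt."""
    event = {
        "type": "session.update",
        "session": {"instructions": system_prompt},
    }
    return json.dumps(event)


prompt = (
    "You are a professional customer support representative. "
    "Be friendly, concise, and always prioritize resolving the "
    "customer's issue quickly."
)
print(build_session_update(prompt))
```

Keeping the prompt in one place like this makes it easier to version and review changes to agent behavior.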

Pricing

OpenAI Realtime API pricing is based on:
  • Input tokens - Speech converted to text
  • Output tokens - AI responses converted to speech
  • Audio duration - Connection time
See OpenAI Pricing for current rates.

Best Practices

  • System Prompts: Provide clear, specific instructions for higher-quality interactions
  • Token Limits: Monitor your OpenAI usage to avoid unexpected costs
  • Error Handling: Implement fallback logic for API failures
OpenAI Realtime API typically has response times under 200ms, making it ideal for natural conversations.
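The fallback logic mentioned under Error Handling could be sketched as a retry-then-fallback wrapper. This is a minimal sketch, not Telepath's own behavior: `call_with_fallback`, the retry count, and the backoff delays are all illustrative assumptions, and `primary` and `fallback` stand in for whatever functions your application uses to reach a provider.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_fallback(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    retries: int = 2,
    base_delay: float = 0.5,
) -> T:
    """Try primary up to retries + 1 times, then return fallback()."""
    for attempt in range(retries + 1):
        try:
            return primary()
        except Exception as exc:
            print(f"Attempt {attempt + 1} failed: {exc}")
            if attempt < retries:
                # Exponential backoff: base_delay, 2x, 4x, ...
                time.sleep(base_delay * 2 ** attempt)
    return fallback()
```

In a voice context the fallback might play a short apology message or transfer the caller, since long retry delays are noticeable on a live call.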

ElevenLabs Conversational AI

ElevenLabs provides production-ready conversational AI with customizable voices.

Prerequisites

  • Active ElevenLabs account
  • API key
  • Conversational AI agent created in ElevenLabs
  • Voice ID (optional, if you want a specific voice)

Getting Your API Key

  1. Go to elevenlabs.io
  2. Navigate to Settings > API Keys
  3. Create a new API key
  4. Copy the key

Creating an Agent

  1. In ElevenLabs, go to Agents
  2. Click Create new agent
  3. Configure:
    • Agent name: Descriptive identifier
    • System prompt: How the agent should behave
    • Voice: Select from ElevenLabs’ voice library
    • Language: Primary language for the agent

Configuration

In Telepath:
  1. Select ElevenLabs as your AI provider
  2. Paste your API key
  3. Enter your agent ID (found in ElevenLabs)
  4. Optionally specify a voice ID to override the agent’s default

Voice Selection

ElevenLabs offers voices in multiple languages and genders:
  • Professional voices for business use
  • Casual voices for friendly interactions
  • Multiple accents and tones
Use the ElevenLabs voice cloning feature to create custom voices for your brand.

Pricing

ElevenLabs charges based on:
  • Characters generated - Speech synthesis
  • API calls - Agent interactions
  • Voice cloning - Custom voices (additional cost)
See ElevenLabs Pricing for details.

Best Practices

  • Agent Tuning: Iterate on system prompts in ElevenLabs to improve quality
  • Voice Consistency: Use the same voice across all connections for brand consistency
  • Knowledge Base: Update agent knowledge regularly for accurate responses
ElevenLabs agents can be connected to custom knowledge bases, making them well suited to domain-specific applications.

Custom WebSocket

For advanced use cases, integrate your own WebSocket endpoint.

WebSocket Protocol

Your endpoint must:
  • Accept WebSocket connections on a public wss:// URL
  • Handle incoming audio streams in real-time
  • Return audio responses in the same format
  • Support the audio codec negotiation protocol

Audio Format

Audio is transmitted as:
  • Format: PCM (Pulse Code Modulation)
  • Sample Rate: 8000 Hz (8kHz) or 16000 Hz (16kHz)
  • Channels: Mono
  • Bit Depth: 16-bit
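To size buffers for this format, it helps to know how many bytes one audio frame occupies. The sketch below does the arithmetic for 16-bit mono PCM. The 20 ms frame duration is an assumption (a common telephony packet size); the chunk size Telepath actually sends is not stated here.

```python
def frame_bytes(
    sample_rate_hz: int,
    frame_ms: int = 20,
    bit_depth: int = 16,
    channels: int = 1,
) -> int:
    """Return the size in bytes of one PCM frame."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * (bit_depth // 8) * channels


print(frame_bytes(8000))   # 320
print(frame_bytes(16000))  # 640
```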

Connection Flow

1. Telepath initiates WebSocket connection
2. Negotiates audio codec (G.711 PCMU, G.711 PCMA, or G.722)
3. Streams inbound audio from caller
4. Receives outbound audio from your endpoint
5. Streams response back to caller
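The flow above names G.711 codecs, while the Audio Format section lists 16-bit PCM, and this page does not say which of the two your endpoint receives. If your endpoint does receive raw G.711 PCMU (μ-law) bytes, they can be expanded to 16-bit PCM with the standard G.711 decoding rule. The sketch below is pure Python and the function names are illustrative.

```python
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit sample."""
    u = ~u & 0xFF                 # mu-law bytes are stored inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(payload: bytes) -> bytes:
    """Decode a mu-law payload to little-endian 16-bit PCM."""
    samples = [ulaw_byte_to_pcm16(b) for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)


silence = b"\xff" * 160           # 20 ms of mu-law silence at 8 kHz
print(len(ulaw_to_pcm16(silence)))  # 320
```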

Implementation Example

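import asyncio
import json
import ssl

import websockets


class YourAIModel:
    """Placeholder for your own model; replace process() with real inference."""

    def __init__(self, config):
        self.config = config

    def process(self, audio_chunk):
        # Echo the audio back unchanged until a real model is plugged in
        return audio_chunk


async def handle_telepath(websocket):
    """Handle one Telepath WebSocket connection."""

    # Receive initial metadata (first message, JSON-encoded)
    config = json.loads(await websocket.recv())

    # Initialize your AI model
    ai_model = YourAIModel(config)

    try:
        # Iterating the connection yields each incoming audio chunk
        async for audio_chunk in websocket:
            # Process with your AI and send the response
            response_audio = ai_model.process(audio_chunk)
            await websocket.send(response_audio)
    except websockets.exceptions.ConnectionClosed:
        pass  # The connection dropped; nothing left to send
    except Exception as e:
        print(f"Error: {e}")


async def main():
    # Use TLS so the endpoint is reachable over wss://
    ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ssl_context.load_cert_chain("cert.pem", "key.pem")  # placeholder paths

    # Run the server until the process is stopped
    async with websockets.serve(handle_telepath, "0.0.0.0", 8000, ssl=ssl_context):
        await asyncio.Future()


asyncio.run(main())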

Configuration

In Telepath:
  1. Select Custom WebSocket as your AI provider
  2. Enter your WebSocket URL (must be wss://, not ws://)
  3. Optionally add authentication headers (e.g., API key)
  4. Test the connection
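Before entering the URL in step 2, a quick local check can catch the most common mistake, a `ws://` scheme. This is a minimal sketch using only the standard library; the function name and example hostname are illustrative, and Telepath may apply additional validation of its own.

```python
from urllib.parse import urlparse


def is_valid_endpoint_url(url: str) -> bool:
    """Return True if the URL uses wss:// and includes a hostname."""
    parsed = urlparse(url)
    return parsed.scheme == "wss" and bool(parsed.hostname)


print(is_valid_endpoint_url("wss://voice.example.com/telepath"))  # True
print(is_valid_endpoint_url("ws://voice.example.com/telepath"))   # False
```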

Error Handling

Implement proper error handling:
  • Timeouts: Response must arrive within 5 seconds
  • Codec mismatch: Validate audio format
  • Connection drops: Gracefully handle disconnections
Your WebSocket endpoint must be publicly accessible and use TLS encryption (wss://).
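One way to respect the 5-second limit is to wrap your model call in `asyncio.wait_for` so a slow response is abandoned instead of stalling the call. The sketch below assumes your processing function is a coroutine; `process_with_timeout` and the choice to return `None` on timeout are illustrative, not part of any Telepath API.

```python
import asyncio

RESPONSE_TIMEOUT_S = 5.0  # limit stated in the Error Handling list


async def process_with_timeout(process, audio_chunk, timeout=RESPONSE_TIMEOUT_S):
    """Await process(audio_chunk), returning None if it exceeds timeout."""
    try:
        return await asyncio.wait_for(process(audio_chunk), timeout)
    except asyncio.TimeoutError:
        # The caller decides what to do: skip the chunk, play filler audio, etc.
        return None


async def echo(chunk):
    return chunk


print(asyncio.run(process_with_timeout(echo, b"\x00\x00")))  # b'\x00\x00'
```

In practice you would likely set your own budget well below 5 seconds, leaving headroom for network transit.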

Performance Optimization

  • Latency: Keep processing time under 200ms for natural conversations
  • Buffering: Implement jitter buffers for network variations
  • Concurrency: Handle multiple simultaneous calls
  • Monitoring: Log all requests for debugging
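The latency and monitoring points can be combined by timing each processing call and logging when it exceeds the 200 ms target. This is a minimal sketch using the standard library; the logger name and the `timed_process` helper are assumptions for illustration.

```python
import logging
import time

logger = logging.getLogger("telepath.endpoint")
LATENCY_BUDGET_MS = 200.0  # target from the Latency bullet


def timed_process(process, audio_chunk):
    """Run process(audio_chunk) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = process(audio_chunk)
    elapsed_ms = (time.perf_counter() - start) * 1000
    if elapsed_ms > LATENCY_BUDGET_MS:
        logger.warning("Processing took %.1f ms (budget %.0f ms)",
                       elapsed_ms, LATENCY_BUDGET_MS)
    return result, elapsed_ms


result, elapsed_ms = timed_process(lambda chunk: chunk, b"\x00\x00")
print(f"{elapsed_ms:.3f} ms")
```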

Advanced Integration

Learn advanced patterns for production deployments

Comparison

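| Feature          | OpenAI      | ElevenLabs  | Custom               |
|------------------|-------------|-------------|----------------------|
| Setup Difficulty | Easy        | Easy        | Hard                 |
| Cost             | Pay-per-use | Pay-per-use | Infrastructure costs |
| Customization    | Limited     | Moderate    | Full                 |
| Response Time    | <200ms      | <300ms      | Depends              |
| Reliability      | High        | High        | Your responsibility  |
| Language Support | 50+         | 30+         | Your choice          |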

Switching Providers

To change your AI provider:
  1. Create a new connection with the new provider
  2. Test thoroughly with your carrier
  3. Update your carrier’s origination URI to the new connection
  4. Monitor the transition
  5. Delete the old connection once stable
You can run multiple connections in parallel to test new providers before switching fully.

Troubleshooting

Connection fails:
  • Verify API key is correct
  • Check that the endpoint/agent exists
  • Ensure your account has access to the required features
Poor audio quality:
  • For OpenAI: Check your system prompt
  • For ElevenLabs: Verify agent configuration
  • For Custom: Ensure codec negotiation is correct
High latency:
  • Monitor your AI provider’s status
  • Check network connectivity
  • Review your WebSocket server’s performance
See Troubleshooting for more help.